Addressing the gaps in human intelligence assessment and GenAI benchmarking: a comparative analysis of evaluation frameworks and practical implementation with SparkBeyond

Brauner, Joshua (Academic Year 2024/2025) Addressing the gaps in human intelligence assessment and GenAI benchmarking: a comparative analysis of evaluation frameworks and practical implementation with SparkBeyond. Degree thesis in AI frontiers: large language models, Luiss Guido Carli, supervisor Simone Di Somma, pp. 102. [Master's Degree Thesis]

Abstract/Index

Fundamentals of artificial intelligence. The evaluation gap. Defining intelligence. Intelligence as a concept. Definition of artificial intelligence. The problem in emulating human intelligence. Measuring intelligence. Measuring human intelligence. Measuring artificial intelligence. Dynamic approaches for evaluation. A dynamic benchmarking evaluation framework for LLMs and the impact of complexity. Theoretical foundation behind complexity: the illusion of thinking. SparkBeyond case study: development of a strong dynamic benchmarking framework and agents testing. Learnings, what to expect next.

References

Bibliography: pp. 98-102.

Thesis Type:	Master's Degree Thesis
Institution:	Luiss Guido Carli
Degree Program:	Master's Degree Programs > Master's Degree Program in Data Science e Management (LM-91)
Chair:	AI frontiers: large language models
Thesis Supervisor:	Di Somma, Simone
Thesis Co-Supervisor:	Italiano, Giuseppe Francesco
Academic Year:	2024/2025
Session:	Autumn
Deposited by:	Alessandro Perfetti
Date Deposited:	24 Feb 2026 14:01
Last Modified:	24 Feb 2026 14:01
URI:	https://tesi.luiss.it/id/eprint/44953
