Addressing the gaps in human intelligence assessment and GenAI benchmarking: a comparative analysis of evaluation frameworks and practical implementation with SparkBeyond
Brauner, Joshua (Academic Year 2024/2025) Addressing the gaps in human intelligence assessment and GenAI benchmarking: a comparative analysis of evaluation frameworks and practical implementation with SparkBeyond. Degree thesis in AI frontiers: large language models, Luiss Guido Carli, supervisor Simone Di Somma, pp. 102. [Master's Degree Thesis]
Abstract/Index
Fundamentals of artificial intelligence. The evaluation gap. Defining intelligence. Intelligence as a concept. Definition of artificial intelligence. The problem in emulating human intelligence. Measuring intelligence. Measuring human intelligence. Measuring artificial intelligence. Dynamic approaches for evaluation. A dynamic benchmarking evaluation framework for LLMs and the impact of complexity. Theoretical foundation behind complexity: the illusion of thinking. SparkBeyond case study: development of a strong dynamic benchmarking framework and agents testing. Learnings, what to expect next.
References
Bibliography: pp. 98-102.
| Thesis Type: | Master's Degree Thesis |
|---|---|
| Institution: | Luiss Guido Carli |
| Degree Program: | Master's Degree Programs > Master's Degree Program in Data Science e Management (LM-91) |
| Chair: | AI frontiers: large language models |
| Thesis Supervisor: | Di Somma, Simone |
| Thesis Co-Supervisor: | Italiano, Giuseppe Francesco |
| Academic Year: | 2024/2025 |
| Session: | Autumn |
| Deposited by: | Alessandro Perfetti |
| Date Deposited: | 24 Feb 2026 14:01 |
| Last Modified: | 24 Feb 2026 14:01 |
| URI: | https://tesi.luiss.it/id/eprint/44953 |



