Automated jailbreak generation for large language models: a RAG-enhanced approach and vulnerability assessment

Cervelli, Mattia (Academic Year 2024/2025) Automated jailbreak generation for large language models: a RAG-enhanced approach and vulnerability assessment. Bachelor's thesis in Algorithms, Luiss Guido Carli, supervisor Irene Finocchi, 51 pp. [Bachelor's Degree Thesis]

Abstract/Index

The vulnerability landscape of large language models. The rise of large language models. The imperative of AI safety and alignment. The "black box" challenge and the quest for LLM interpretability. Emergent behaviors and vulnerabilities: insights from LLM internals. Jailbreaking techniques and automated generation. Leveraging retrieval-augmented generation for enhanced attack prompting. An automated jailbreak generation and evaluation pipeline. System architecture overview. Data collection and preparation. Embedding and retrieval augmentation. RAG-enhanced attack generation. Evaluation framework. Experimental setup. Empirical results and system performance. Quantitative results and analysis. Qualitative analysis and observations. Discussion.

References

Bibliography: pp. 42-44.

Thesis Type: Bachelor's Degree Thesis
Institution: Luiss Guido Carli
Degree Program: Bachelor's Degree Programs > Bachelor's Degree Program in Management and Computer Science, English language (L-18)
Chair: Algorithms
Thesis Supervisor: Finocchi, Irene
Academic Year: 2024/2025
Session: Summer
Deposited by: Alessandro Perfetti
Date Deposited: 26 Nov 2025 15:45
Last Modified: 26 Nov 2025 15:45
URI: https://tesi.luiss.it/id/eprint/44097
