A structural and statistical approach to code similarity detection
Austeri, Alessandro Maria (A.Y. 2024/2025) A structural and statistical approach to code similarity detection. Bachelor's thesis in Introduction to computer programming, Luiss Guido Carli, supervisor Alessio Martino, 44 pp. [Bachelor's Degree Thesis]
Abstract/Index
Background and motivation. Problem statement. Research questions and objectives. Proposed approach and contributions. Overview of code similarity and clone detection. Structural approaches. Statistical and NLP-inspired methods. Hybrid methods. Evaluation metrics and benchmarks. Structural-statistical pipeline. Motivations and high‐level design. File gathering and preprocessing. Parsing to abstract syntax trees. Identifier anonymization. Extracting structural features: root-to-leaf paths. Constructing the bag-of-paths. TF–IDF vectorization. Cosine similarity computation. Downstream analyses: clustering, thresholding and visualization. Parameter sensitivity and tuning. Performance.
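The pipeline steps named in the index (parsing to abstract syntax trees, identifier anonymization, root-to-leaf paths, bag-of-paths, TF-IDF, cosine similarity) can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes Python source files and the standard-library `ast` module, it drops identifiers by recording only node type names as a simple stand-in for anonymization, and it uses one common smoothed IDF variant. All function names are hypothetical.

```python
import ast
import math
from collections import Counter


def root_to_leaf_paths(source):
    """Return every root-to-leaf path of the AST as a string of node type names.

    Only node types are recorded, so identifier names never enter a path
    (an assumed, simplified form of identifier anonymization).
    """
    paths = []

    def walk(node, prefix):
        prefix = prefix + (type(node).__name__,)
        children = list(ast.iter_child_nodes(node))
        if not children:
            paths.append("/".join(prefix))
        for child in children:
            walk(child, prefix)

    walk(ast.parse(source), ())
    return paths


def tfidf_vectors(bags):
    """Turn bags of paths into sparse TF-IDF vectors (dicts keyed by path)."""
    n = len(bags)
    counts = [Counter(bag) for bag in bags]
    # Document frequency: in how many files each path occurs.
    df = Counter(path for c in counts for path in c)
    # Smoothed IDF (an assumed variant; other weightings are possible).
    idf = {p: math.log((1 + n) / (1 + df[p])) + 1.0 for p in df}
    return [{p: tf * idf[p] for p, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(p, 0.0) for p, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(sources):
    """Pairwise cosine similarity of the TF-IDF bag-of-paths vectors."""
    vectors = tfidf_vectors([root_to_leaf_paths(s) for s in sources])
    return [[cosine(u, v) for v in vectors] for u in vectors]
```

Under these assumptions, two files that differ only in identifier names produce identical bags of paths and therefore a similarity of 1, while files with no shared paths score 0. A threshold or clustering step, as listed under the downstream analyses, would then operate on this matrix.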
References
Bibliography: pp. 36-38.
| Field | Value |
|---|---|
| Thesis Type: | Bachelor's Degree Thesis |
| Institution: | Luiss Guido Carli |
| Degree Program: | Bachelor's Degree Programs > Bachelor's Degree Program in Management and Computer Science, English language (L-18) |
| Chair: | Introduction to computer programming |
| Thesis Supervisor: | Martino, Alessio |
| Academic Year: | 2024/2025 |
| Session: | Summer |
| Deposited by: | Alessandro Perfetti |
| Date Deposited: | 13 Nov 2025 14:32 |
| Last Modified: | 13 Nov 2025 14:32 |
| URI: | https://tesi.luiss.it/id/eprint/43834 |