A Data-Driven Approach to the Approximate Nearest Neighbor Problem

Καλαβάς, Ανδρέας

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19219

Title:	A Data-Driven Approach to the Approximate Nearest Neighbor Problem
Authors:	Καλαβάς, Ανδρέας; Φωτάκης, Δημήτριος
Keywords:	optimization; approximation; nearest neighbor; data structures; data-driven algorithms; computational geometry
Issue date:	9-Ιου-2024
Abstract:	The nearest neighbor search (NNS) problem and its variants have occupied researchers for the past fifty years. The problem arises in applications such as data compression, data mining, and machine learning. Although numerous solutions have been proposed, few offer theoretical guarantees while also optimizing the structure for the input data. The difficulty is that adapting the structure to a specific dataset can expose it to adversarial queries, leading to suboptimal performance. In this thesis, we propose a new model for the approximate near neighbor problem (the decision version of the nearest neighbor problem) that aims to balance theoretical guarantees with adaptability to the dataset. Our approach stores the input point set in a binary tree optimized for performance on a fixed dataset and query distribution. A query is processed by traversing the tree from the root to one or more leaves; at each internal node, a separator stored at that node determines whether the query follows one child or both. We also present methods for identifying these separators optimally. The core idea is to extract useful information from the point set to improve the structure, but to halt this extraction once it becomes potentially harmful. At that point, we switch to an existing technique that offers theoretical guarantees. This strategy lets us exploit the efficiency of our model while avoiding the elements that could degrade performance, so the structure remains data-driven while retaining theoretical guarantees. Finally, we conduct experiments to demonstrate that our algorithm adapts to a dataset while preserving its theoretical guarantees. Specifically, we evaluate our model on the MNIST dataset by running queries against model instances built on samples of different sizes, and we compare the results with those of linear search.
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19219
Appears in collections:	Diploma Theses (Διπλωματικές Εργασίες - Theses)
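
The query procedure described in the abstract (a binary tree whose separators send a query to one child or to both) can be illustrated with a minimal Python sketch. This is not the thesis's construction. It makes several assumptions: separators are axis-aligned thresholds chosen at the median (the thesis optimizes them for the dataset and query distribution), distances are Euclidean, and each leaf falls back to a plain linear scan as a stand-in for the existing technique with theoretical guarantees. The names `Node`, `build`, and `near_neighbor` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    # Leaves store points; internal nodes store a separator (dim, threshold).
    points: Optional[List[Point]] = None
    dim: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], leaf_size: int = 2, depth: int = 0) -> Node:
    """Build a tree with median, axis-aligned separators (an assumption)."""
    if len(points) <= max(leaf_size, 1):
        return Node(points=list(points))
    dim = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    # Left points satisfy p[dim] <= threshold, right points p[dim] >= threshold.
    threshold = ordered[mid - 1][dim]
    return Node(
        dim=dim,
        threshold=threshold,
        left=build(ordered[:mid], leaf_size, depth + 1),
        right=build(ordered[mid:], leaf_size, depth + 1),
    )


def near_neighbor(node: Node, q: Point, r: float) -> Optional[Point]:
    """Return some stored point within distance r of q, or None."""
    if node.points is not None:
        # Leaf: linear scan, standing in for a fallback with guarantees.
        for p in node.points:
            if math.dist(p, q) <= r:
                return p
        return None
    # A point within r of q differs from q by at most r in every coordinate,
    # so a child can be skipped only if q is more than r past the separator.
    if q[node.dim] <= node.threshold + r:
        hit = near_neighbor(node.left, q, r)
        if hit is not None:
            return hit
    if q[node.dim] >= node.threshold - r:
        return near_neighbor(node.right, q, r)
    return None
```

With this rule, a query far from a separator descends into a single child, while a query within `r` of it visits both. This is the trade-off the separators are meant to control: the fewer queries that fall near a separator, the fewer leaves are visited.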

Files in this item:

File	Description	Size	Format
Διπλωματική_Ανδρέας_Καλαβάς.pdf		1.36 MB	Adobe PDF

All items on this site are protected by copyright.