Title: Algorithm Design for Reliable Machine Learning
Authors: Kalavasis, Alvertos
Fotakis, Dimitrios
Keywords: Reliable Machine Learning
Theoretical Machine Learning
Issue Date: 14-Jun-2023
Abstract: In this thesis we theoretically study questions in the area of Reliable Machine Learning, with the goal of designing algorithms that are robust to bias and noise (Robust Machine Learning) and satisfy societal desiderata such as privacy and reproducibility (Responsible Machine Learning).

In the area of Robust Machine Learning, we design computationally efficient algorithms for problems in Truncated Statistics, Censored Statistics, and Robust Statistics. In particular, we provide the first efficient methods for learning truncated distributions in discrete settings and for perfect sampling from truncated data. Next, we study the fundamental problem of learning from partial/coarse labels. Our main algorithmic result is that essentially any problem learnable from fine-grained labels can also be learned efficiently when the coarse data are sufficiently informative. We obtain this result through a generic reduction for answering Statistical Queries (SQ) over fine-grained labels given only coarse labels. We also study Gaussian mean estimation from coarse data, a central problem in Censored Statistics. Finally, we consider the problem of learning linear sorting functions in the presence of bounded noise, which generalizes the problem of learning halfspaces with Massart noise.

In the area of Responsible Machine Learning, we study replicability as an algorithmic property and introduce the notion of replicable policies in the context of stochastic bandits, one of the canonical problems in interactive learning. We show that replicable policies not only exist but also achieve almost the same optimal (non-replicable) regret bounds in terms of the time horizon. Lastly, we establish information-theoretic equivalences between notions of algorithmic stability such as replicability and approximate differential privacy. We do so by focusing on the following question: when two different parties use the same learning rule on their own data, how can we test whether the distributions of the two outcomes are similar? We study the similarity of outcomes of learning rules through the lens of the Total Variation (TV) distance between distributions. We say that a learning rule is TV indistinguishable if the expected TV distance between the posterior distributions of its outputs, executed on two training data sets drawn independently from the same distribution, is small. We first investigate the learnability of hypothesis classes using TV indistinguishable learners; our main results are information-theoretic equivalences between TV indistinguishability and existing algorithmic stability notions such as replicability and approximate differential privacy.
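For readers unfamiliar with the SQ model invoked above, the reduction can be sketched as follows; the notation (query \phi, tolerance \tau) is the standard SQ formulation and is an assumption of this sketch, not quoted from the thesis. An SQ oracle for a distribution D over labeled examples accepts a bounded query \phi : X \times Y \to [0,1] and a tolerance \tau > 0, and returns any value v satisfying

\[ \bigl| \, v - \mathbb{E}_{(x,y) \sim D}[\phi(x,y)] \, \bigr| \le \tau . \]

The generic reduction simulates such an oracle for the fine-grained label distribution while observing only coarsely labeled samples; consequently, any problem solvable with SQ access to fine-grained labels remains efficiently learnable when the coarse labels are sufficiently informative.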
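Similarly, a minimal mathematical sketch of the two stability notions compared at the end of the abstract (the replicability definition follows the standard formulation with shared internal randomness, an assumption here; the TV indistinguishability definition restates the abstract): a learning rule A is \rho-replicable if, for training sets S, S' drawn i.i.d. from the same distribution and a shared random string r,

\[ \Pr_{S, S', r}\bigl[ A(S; r) = A(S'; r) \bigr] \ge 1 - \rho , \]

and A is \rho-TV indistinguishable if

\[ \mathbb{E}_{S, S'}\bigl[ d_{\mathrm{TV}}\bigl( A(S), A(S') \bigr) \bigr] \le \rho , \]

where A(S) denotes the (posterior) distribution of A's output on input S. The main result in this direction is that these notions, together with approximate differential privacy, are information-theoretically equivalent.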
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18709
Appears in Collections: Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:
File            Size     Format
phd_thesis.pdf  2.43 MB  Adobe PDF

