Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19726
Title: Early Exit techniques for Auto Compressing Neural Networks
Authors: Τσέλιγκας, Γεώργιος-Στυλιανός
Ποταμιάνος Αλέξανδρος
Keywords: Dynamic Networks
Early Exit
Auto Compressor Networks
Issue date: 1-Ιου-2025
Abstract: Contemporary neural networks have achieved state-of-the-art performance in vision and language tasks by growing in scale. Their growing scale, combined with their static, one-size-fits-all inference, has led to a line of research on dynamic neural networks, which can adapt the size and/or structure of their computation on a per-sample basis. This way, full network compute can be allocated to hard, non-paradigmatic examples, while easy, paradigmatic ones can use fewer network resources. One of the most natural ways to make a network dynamic is to implement early exit methods, which accelerate inference by trading some performance for speed. We apply the idea of early exiting and build on previous work on Auto Compressor Networks (ACNs). ACNs remove residual connections and replace them with so-called long connections, which directly connect each intermediate layer to the network output. This direct connectivity allows ACNs to compress information into the earlier layers and makes them good candidates for early exit methodologies. In this thesis we implement a variety of early exit techniques on ACNs. We try approaches based on intermediate layer logits, on intermediate layer embedding distances, and on trainable early exit decision heads, and evaluate them on image and language tasks. We achieve substantial inference speedups with minimal (if any) performance degradation relative to the full network. We compare our early exit results on BERT with popular techniques from the literature and show that early exit on ACNs achieves a much better performance-speedup tradeoff. Specifically, our methods achieve speedups of 3-4x, in contrast to the 1.5-2x found in the literature, with comparable or better performance.
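The logit-based family of early exit rules mentioned in the abstract can be illustrated with a minimal sketch. It assumes a maximum-softmax-probability threshold as the exit criterion, which is a common choice in the early exit literature and not necessarily the rule used in the thesis. The names `early_exit_predict`, `average_speedup` and the default `threshold` value are hypothetical, and speedup is counted in layer evaluations only.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    layer_logits: Iterable[Sequence[float]],
    threshold: float = 0.9,  # assumed confidence threshold, not from the thesis
) -> Tuple[int, int]:
    """Return (predicted_class, layers_used).

    Layers are consumed lazily, so evaluation stops at the first layer whose
    top softmax probability reaches `threshold`. If no layer is confident
    enough, the prediction of the last layer is returned.
    """
    predicted, layers_used = -1, 0
    for logits in layer_logits:
        layers_used += 1
        probs = softmax(logits)
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if probs[predicted] >= threshold:
            break
    if layers_used == 0:
        raise ValueError("layer_logits must contain at least one layer")
    return predicted, layers_used


def average_speedup(layers_used: Sequence[int], total_layers: int) -> float:
    """Speedup in layer evaluations: full depth divided by mean exit depth."""
    return total_layers * len(layers_used) / sum(layers_used)


# Toy per-layer logits for a 3-layer, 3-class model (made-up numbers).
easy_sample = [[0.1, 3.5, 0.2], [0.0, 5.0, 0.1], [0.0, 6.0, 0.0]]
hard_sample = [[0.2, 0.1, 0.0], [0.5, 0.4, 0.1], [2.0, 0.1, 0.0]]

easy_class, easy_depth = early_exit_predict(easy_sample)
hard_class, hard_depth = early_exit_predict(hard_sample)
print(easy_class, easy_depth, hard_class, hard_depth)
print(average_speedup([easy_depth, hard_depth], total_layers=3))
```

Passing a generator as `layer_logits` keeps the sketch faithful to the idea of saving compute: layers after the exit point are never evaluated. Raising `threshold` moves the operating point toward full-network accuracy, lowering it moves toward higher speedup.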
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19726
Appears in collections: Διπλωματικές Εργασίες - Theses

Files in this item:
File: Early_exit_and_speculative_decoding_techniques_for_Auto_Compressing_Neural_Networks-5.pdf
Size: 2.38 MB
Format: Adobe PDF


All items on this site are protected by copyright.