Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19396
Title: | Domain Generalization in Robust Vision Transformers for Semantic Segmentation in Autonomous Driving |
Authors: | Τζόκας, Γιώργος Βουλόδημος Αθανάσιος |
Keywords: | Neural Networks; Deep Learning; Image Segmentation; Domain Generalization; Autonomous Vehicles; Real-Time Semantic Segmentation |
Issue date: | 25-Oct-2024 |
Abstract: | Recent advancements in artificial intelligence have led to the widespread use of deep learning across various applications. Computer vision, in particular, has greatly benefited from these developments, with highly efficient models now employed in real-time scenarios. One notable application is semantic segmentation for autonomous driving, which gives self-driving vehicles a detailed understanding of their surroundings and allows them to make informed decisions in real time. For these applications, it is crucial that models maintain high accuracy across diverse environmental conditions while operating in real time. The goal of this thesis was to develop a model that harnesses the robustness of transformer encoders while improving efficiency compared to state-of-the-art generalization architectures. To demonstrate that the model maintains accuracy, we conducted a generalization experiment comparing the model against robust models and a real-time architecture. Additionally, we performed two experiments in different domains to show that the capabilities of these models extend beyond autonomous driving. The experiments showed that although transformers are robust and unaffected by domain shifts, they are far from being a viable solution for real-time operation. In our case, using an efficient decoder accelerated inference without sacrificing accuracy. However, this small reduction in inference time is not enough to achieve real-time segmentation or speeds comparable to those of convolutional models. In conclusion, efforts should be made to reduce the computational burden of transformer models, as they appear to be the main source of the increase in inference time compared to convolutional architectures. |
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19396 |
Appears in collections: | Διπλωματικές Εργασίες - Theses |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
thesis_Tzokas_Georgios.pdf | | 9.61 MB | Adobe PDF | View/Open |
All items on this site are protected by copyright.