Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19396
Full metadata record
DC Field | Value | Language
dc.contributor.author | Τζόκας, Γιώργος | -
dc.date.accessioned | 2024-11-08T07:55:40Z | -
dc.date.available | 2024-11-08T07:55:40Z | -
dc.date.issued | 2024-10-25 | -
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19396 | -
dc.description.abstract | Recent advances in artificial intelligence have led to the widespread use of deep learning across many applications. Computer vision in particular has benefited from these developments, with highly efficient models now deployed in real-time scenarios. One notable application is semantic segmentation for autonomous driving, which gives self-driving vehicles a detailed understanding of their surroundings so that they can make informed decisions in real time. For these applications, models must maintain high accuracy across diverse environmental conditions while operating in real time. The goal of this thesis was to develop a model that harnesses the robustness of transformer encoders while being more efficient than state-of-the-art generalization architectures. To demonstrate that the model maintains accuracy, we conducted a generalization experiment comparing it against robust models and a real-time architecture. Additionally, we performed two experiments in different knowledge domains to show that the capabilities of these models extend beyond autonomous driving. The experiments showed that although transformers are robust and unaffected by domain shifts, they are far from being a viable solution for real-time operation. In our case, using an efficient decoder allowed us to speed up inference without sacrificing accuracy. However, this small reduction in inference time is not enough to achieve real-time segmentation or speeds comparable to those of convolutional models. In conclusion, efforts should be made to reduce the computational burden of transformer models, as they appear to be the main source of the increase in inference times compared to convolutional architectures. | en_US
dc.language | en | en_US
dc.subject | Neural Networks | en_US
dc.subject | Deep Learning | en_US
dc.subject | Image Segmentation | en_US
dc.subject | Domain Generalization | en_US
dc.subject | Autonomous Vehicles | en_US
dc.subject | Real-Time Semantic Segmentation | en_US
dc.title | Domain Generalization in Robust Vision Transformers for Semantic Segmentation in Autonomous Driving | en_US
dc.description.pages | 91 | en_US
dc.contributor.supervisor | Βουλόδημος Αθανάσιος | en_US
dc.department | Division of Electromagnetic Applications, Electro-optics and Electronic Materials | en_US
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
thesis_Tzokas_Georgios.pdf | | 9.61 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.