Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19396
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Τζόκας, Γιώργος | - |
dc.date.accessioned | 2024-11-08T07:55:40Z | - |
dc.date.available | 2024-11-08T07:55:40Z | - |
dc.date.issued | 2024-10-25 | - |
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19396 | - |
dc.description.abstract | Recent advancements in artificial intelligence have led to the widespread use of deep learning across various applications. Computer vision, in particular, has greatly benefited from these developments, with highly efficient models now employed in real-time scenarios. One notable application is semantic segmentation for autonomous driving, which enables self-driving vehicles to achieve a detailed understanding of their surroundings, allowing them to make informed decisions in real time. For these applications, it is crucial for models to maintain high accuracy across diverse environmental conditions while operating in real time. The goal of this thesis was to develop a model that harnesses the robustness of transformer encoders while improving efficiency compared to state-of-the-art generalization architectures. To demonstrate that the model maintains accuracy, we conducted a generalization experiment comparing the model against robust models and a real-time architecture. Additionally, we performed two experiments in different knowledge domains to show that the capabilities of these models extend beyond autonomous driving. The experiments showed that although transformers are robust and unaffected by domain shifts, they are far from being a viable solution for real-time operation. In our case, by using an efficient decoder we managed to speed up inference without sacrificing accuracy. However, this small reduction in inference time is not enough to achieve real-time segmentation or speeds comparable to those of convolutional models. In conclusion, efforts should be made to reduce the computational burden of transformer models, as they appear to be the main source of the increase in inference time compared to convolutional architectures. | en_US |
dc.language | en | en_US |
dc.subject | Neural Networks | en_US |
dc.subject | Deep Learning | en_US |
dc.subject | Image Segmentation | en_US |
dc.subject | Domain Generalization | en_US |
dc.subject | Autonomous Vehicles | en_US |
dc.subject | Real-Time Semantic Segmentation | en_US |
dc.title | Domain Generalization in Robust Vision Transformers for Semantic Segmentation in Autonomous Driving | en_US |
dc.description.pages | 91 | en_US |
dc.contributor.supervisor | Βουλόδημος Αθανάσιος | en_US |
dc.department | Division of Electromagnetic Applications, Electro-Optics and Electronic Materials | en_US |
Appears in Collections: | Διπλωματικές Εργασίες - Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
thesis_Tzokas_Georgios.pdf | | 9.61 MB | Adobe PDF | View/Open |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.