Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19886
Title: Efficient Incomplete Multimodal-Diffused Emotion Recognition
Authors: Ασπρογέρακας, Ιωάννης
Ποταμιάνος Αλέξανδρος
Keywords: Diffusion Models
Multimodal Emotion Recognition
Stochastic Differential Equations
Multimodal Deep Learning
Deep Generative Modeling
Issue date: 24-Oct-2025
Abstract: Multimodal Emotion Recognition (MER) aims to model human affect by integrating complementary signals from language, vision, and audio. While deep learning methods have achieved impressive results through cross-modal fusion, most assume that all modalities are available during both training and inference, a condition rarely met in real-world deployments, where occlusions, noise, or sensor failures frequently cause missing modalities. Addressing this problem requires robust imputation strategies that can recover missing signals without sacrificing efficiency. In this work, we explore the design space of diffusion models for missing-modality imputation, building upon and extending the IMDER framework. We propose a decoupled two-stage training scheme in which modality-specific diffusion models are pre-trained independently and then integrated into the MER pipeline. This design avoids the instability of end-to-end IMDER training, where untrained diffusion models initially degrade classifier performance. In addition, we systematically compare stochastic differential equation (SDE) formulations, specifically Variance Preserving (VP) and Variance Exploding (VE) processes, evaluate alternative conditioning mechanisms with transformer-based backbones, and investigate multiple sampling strategies to balance efficiency and accuracy. Extensive experiments on CMU-MOSI and CMU-MOSEI demonstrate consistent improvements across both fixed and random missing protocols. Our quality-focused configuration achieves gains of up to +2% F1 and +1.5% ACC2 over IMDER while delivering 5× faster inference. Our speed-optimized configuration maintains competitive performance (+1% ACC2, +0.5% F1) while achieving 15× faster inference, making it suitable for real-time MER applications.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19886
Appears in collections: Διπλωματικές Εργασίες - Theses

Files in this item:
File: ioannisasprogerakas_thesis.pdf
Size: 28.44 MB
Format: Adobe PDF


All items on this site are protected by copyright.