Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17971
Full metadata record
dc.contributor.author: Πίκουλης, Ιωάννης (Pikoulis, Ioannis)
dc.date.accessioned: 2021-06-30T12:01:40Z
dc.date.available: 2021-06-30T12:01:40Z
dc.date.issued: 2021-06-24
dc.identifier.uri: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17971
dc.description.abstract (en_US):

Visual emotion recognition constitutes a major subject in the interdisciplinary field of Computer Vision, concerned with identifying human emotion on a categorical (discrete) and/or dimensional (continuous) level, as depicted in still images or video sequences. A review of the related literature reveals that the majority of past efforts in visual emotion recognition have been limited to the analysis of facial expressions, while some studies have either incorporated information related to body pose or have attempted to perform emotion recognition solely on the basis of body movements and gestures. While some of these approaches perform well in controlled environments, they fail to interpret real-world scenarios in which unpredictable social settings can render one or more of the aforementioned sources of affective information inaccessible. However, evidence from psychology-related studies suggests that visual context, in addition to facial expression and body pose, provides important cues for the perception of people's emotions.

In this work, we aim at reinforcing the concept of context-based visual emotion recognition. To this end, we conduct extensive experiments on two newly assembled and challenging databases, namely the EMOTions In Context (EMOTIC) database and the Body Language Dataset (BoLD), tackling both the image-based and the video-based versions of the problem. More specifically, we:

• Extend already successful baseline architectures by incorporating multiple input streams that encode bodily, facial, contextual and scene-related features, thus enhancing our models' understanding of visual context and emotion in general.
• Directly infuse scene classification scores and attributes as additional features into the emotion recognition process, where they function in a complementary manner with respect to all other sources of affective information. To the best of our knowledge, our approach is the first to do so.
• Exploit the categorical emotion label dependencies that reside within the datasets, through Graph Convolutional Networks (GCNs) and the addition of a metric-learning-inspired loss based on GloVe word embeddings.
• Achieve competitive results on EMOTIC and significant improvements over state-of-the-art techniques on BoLD.

A large portion of our contributions [86] was submitted to the 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG), with the authors being Ioannis Pikoulis, Panagiotis Paraskevas Filntisis and Petros Maragos.
dc.language (en_US): en
dc.subject (en_US): emotion recognition; deep neural networks; body; face; pose; visual-semantic context; CNN; GCN; TSN; ST-GCN; network ensemble
dc.title (en_US): Context-Based Visual Emotion Recognition Using Deep Neural Networks
dc.description.pages (en_US): 164
dc.contributor.supervisor (en_US): Μαραγκός Πέτρος (Maragos, Petros)
dc.department (en_US): Τομέας Σημάτων, Ελέγχου και Ρομποτικής (Division of Signals, Control and Robotics)
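
The abstract above outlines a multi-stream architecture in which scene classification scores are infused as additional features alongside body, face and context streams. Below is a minimal, hypothetical PyTorch sketch of that fusion idea; the module names, feature dimensions (2048-dimensional backbone features, 365 scene classes as in Places365, 26 EMOTIC emotion categories) are editorial assumptions rather than the thesis implementation, and the GCN label-dependency module and GloVe-based loss are omitted.

import torch
import torch.nn as nn

class MultiStreamEmotionNet(nn.Module):
    """Hypothetical sketch: fuses body, face and context features with
    pre-computed scene classification scores before a shared classifier."""

    def __init__(self, num_emotions=26, feat_dim=512, num_scene_classes=365):
        super().__init__()
        # One lightweight projection per visual stream; a real system would
        # place a pretrained CNN backbone (e.g. a ResNet) in front of each.
        self.body = nn.Sequential(nn.Linear(2048, feat_dim), nn.ReLU())
        self.face = nn.Sequential(nn.Linear(2048, feat_dim), nn.ReLU())
        self.context = nn.Sequential(nn.Linear(2048, feat_dim), nn.ReLU())
        # Scene classification scores are infused directly as extra features,
        # complementing the other sources of affective information.
        self.scene = nn.Sequential(nn.Linear(num_scene_classes, feat_dim), nn.ReLU())
        self.classifier = nn.Linear(4 * feat_dim, num_emotions)

    def forward(self, body_feat, face_feat, context_feat, scene_scores):
        fused = torch.cat([
            self.body(body_feat),
            self.face(face_feat),
            self.context(context_feat),
            self.scene(scene_scores),
        ], dim=-1)
        return self.classifier(fused)  # per-category emotion logits

# Toy forward pass with random tensors standing in for backbone outputs.
model = MultiStreamEmotionNet()
logits = model(torch.randn(4, 2048), torch.randn(4, 2048),
               torch.randn(4, 2048), torch.randn(4, 365))
print(logits.shape)  # torch.Size([4, 26])

Late fusion by concatenation is only one plausible reading of "multiple input streams"; the point of the sketch is that the scene scores enter as a feature vector in their own right rather than being folded into the visual context stream.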
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
el15198_thesis_final.pdf | Διπλωματική Εργασία - Κύριο Έγγραφο (Diploma Thesis - Main Document) | 22.94 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.