Title: Graph Neural Networks with External Knowledge for Visual Dialog
Authors: Καλογερόπουλος, Ιωάννης
Ποταμιάνος Αλέξανδρος
Keywords: Deep Learning
Natural Language Processing
Graph Neural Networks
Visual Dialog
Issue Date: 21-Jul-2022
Abstract: In this Diploma Thesis, we study the effectiveness of Graph Neural Networks on the task of Visual Dialog. We experiment along two axes. First, we study various fusion methods. Many Machine Learning problems require combining different types of information extracted from various sources. The fusion method used to combine the different modalities is a fundamental design choice of the model and a crucial factor in achieving better results. We experimented with several sets of methods and selected the best one for our model. Second, we introduce External Knowledge. The task of Visual Dialog does not by itself require the use of external knowledge. Nevertheless, introducing external knowledge has proven effective in many Machine Learning tasks, especially in the field of Natural Language Processing. As a result, it has drawn considerable research interest in recent years and has been applied to a wide variety of similar tasks. Hence, we introduce external knowledge into our approach and experiment with several ways of exploiting the extra information. In our experiments, we adapt the fusion methods of our baseline and use them to fuse the three modalities of our model. We further experiment with the encoding of the External Knowledge. Specifically, we examine the use of one or multiple relation types of the knowledge graph, as well as different methods of aggregating the external information. From these experiments, we draw conclusions about the impact of introducing External Knowledge to our model. Specifically, two different methods surpass the implemented baseline, from which we conclude that External Knowledge is beneficial for overall performance. Moreover, we demonstrate this impact using two types of decoders. The consistency of the results across both decoders highlights the impact of the different encoders. Finally, our results indicate that the simplest models, with fewer parameters, performed better at encoding the External Knowledge Graph.
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File: thesis_kalogeropoulos.pdf
Size: 10.06 MB
Format: Adobe PDF

Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.