Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18337
Title: A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking
Authors: Καπελώνης, Ελευθέριος
Ποταμιάνος, Αλέξανδρος
Keywords: machine learning
deep learning
natural language processing
BERT
dialogue systems
dialogue state tracking
multi-task learning
Issue Date: 14-Jun-2022
Abstract: Dialogue systems often employ a Dialogue State Tracking (DST) component to successfully complete conversations. DST aims to track the user goal over the course of a dialogue, and it is particularly challenging in multi-domain scenarios. Schema-guided DST is a new approach in which the schema, i.e. a list of the supported slots and intents along with natural language descriptions, is provided for each dialogue service. Recent state-of-the-art DST implementations rely on schemata of diverse services to improve model robustness and handle zero-shot generalization to new domains. However, such methods typically require multiple large-scale transformer models and long input sequences to perform well. In this diploma thesis, we first introduce the basics of machine learning, deep learning, natural language processing and dialogue systems, focusing on DST. We then propose a single multi-task BERT-based model that jointly solves the three DST tasks of intent prediction, requested slot prediction and slot filling. Moreover, we propose an efficient and parsimonious encoding of the dialogue history and service schemata that is shown to further improve performance. We encode only the last two utterances, a compact schema representation and the previously predicted dialogue state. The preceding system utterance is represented as its underlying system actions, which significantly benefits accuracy. For the slot filling task, we additionally incorporate slot carryover mechanisms that search previous dialogue utterances and states to retrieve values when necessary. A number of classification heads, which take as input various parts of the BERT-encoded sequence, are jointly trained to perform the tasks. Evaluation on the SGD dataset shows that our approach outperforms the baseline SGP-DST by a large margin and performs well compared to the state of the art, while being significantly more computationally efficient. Extensive ablation studies are performed to examine the factors contributing to the success of our model.
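Illustrative note: the compact input encoding described in the abstract (system actions for the preceding system turn, the last user utterance, a compact schema and the previously predicted state) might be assembled roughly as in the sketch below. This is not the thesis implementation. The names `Turn` and `build_encoder_input`, the action notation `REQUEST(city)`, the `slot=value` state format and the segment order are all assumptions made for illustration; only the `[CLS]` and `[SEP]` markers follow the standard BERT convention.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One dialogue turn (hypothetical structure, not from the thesis)."""
    system_actions: list[str]   # preceding system utterance as actions
    user_utterance: str         # last user utterance as plain text


def build_encoder_input(
    turn: Turn,
    intent_names: list[str],
    slot_names: list[str],
    previous_state: dict[str, str],
) -> str:
    """Join the compact segments into one BERT-style input string.

    Assumed segment order: system actions, user utterance, intents,
    slots, previously predicted state. The real ordering and formatting
    used in the thesis may differ.
    """
    state_text = " ".join(
        f"{slot}={value}" for slot, value in sorted(previous_state.items())
    )
    segments = [
        " ".join(turn.system_actions),
        turn.user_utterance,
        " ".join(intent_names),
        " ".join(slot_names),
        state_text,
    ]
    return "[CLS] " + " [SEP] ".join(segments) + " [SEP]"


# Hypothetical example turn for a restaurant service.
example = build_encoder_input(
    Turn(["REQUEST(city)"], "I want to eat in Athens"),
    intent_names=["FindRestaurant"],
    slot_names=["city", "cuisine"],
    previous_state={"cuisine": "greek"},
)
```

In a setup like this, the classification heads mentioned in the abstract would read the encoder outputs at different positions of the sequence (for example at the `[CLS]` position or at individual slot tokens); which positions the thesis actually uses is not stated in this record.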
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18337
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File: thesisKapelonis.pdf
Size: 2.69 MB
Format: Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.