A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking

Καπελώνης, Ελευθέριος

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18337

Title:	A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking
Authors:	Καπελώνης, Ελευθέριος; Ποταμιάνος, Αλέξανδρος
Keywords:	machine learning; deep learning; natural language processing; BERT; dialogue systems; dialogue state tracking; multi-task learning
Issue date:	14-Ιου-2022
Abstract:	Dialogue systems often employ a Dialogue State Tracking (DST) component to successfully complete conversations. DST aims to track the user's goal over the course of a dialogue, and it is particularly challenging in multi-domain scenarios. Schema-guided DST is a new approach in which a schema, i.e. a list of the supported slots and intents along with natural language descriptions, is provided for each dialogue service. Recent state-of-the-art DST implementations rely on schemata of diverse services to improve model robustness and handle zero-shot generalization to new domains; however, such methods typically require multiple large-scale transformer models and long input sequences to perform well. In this diploma thesis, we first introduce the basics of machine learning, deep learning, natural language processing and dialogue systems, focusing on DST. We then propose a single multi-task BERT-based model that jointly solves the three DST tasks of intent prediction, requested slot prediction and slot filling. Moreover, we propose an efficient and parsimonious encoding of the dialogue history and service schemata that is shown to further improve performance. We encode only the last two utterances, a compact schema representation and the previously predicted dialogue state. The preceding system utterance is represented by its underlying system actions, which significantly benefits accuracy. For the slot filling task, we additionally incorporate slot carryover mechanisms that search previous dialogue utterances and states to retrieve values when necessary. A number of classification heads, which take as input various parts of the BERT-encoded sequence, are jointly trained to perform the tasks. Evaluation on the SGD dataset shows that our approach outperforms the baseline SGP-DST by a large margin and performs well compared to the state of the art, while being significantly more computationally efficient. Extensive ablation studies are performed to examine the factors contributing to the success of our model.
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18337
Appears in collections:	Diploma Theses (Διπλωματικές Εργασίες - Theses)
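
The compact input encoding and the slot carryover search described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Turn` class, its field names, the segment order, the `[SEP]`-separated layout and the `slot=value` state format are all assumptions made for illustration, and the carryover function is a simplified stand-in for the mechanisms the abstract mentions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Turn:
    """One dialogue turn. All field names are hypothetical."""
    system_actions: list[str]   # preceding system utterance as actions
    user_utterance: str         # current user utterance
    # slot -> value offered or confirmed by the system in this turn
    system_slot_values: dict[str, str] = field(default_factory=dict)


def encode_input(turn, schema_slots, schema_intents, prev_state):
    """Assemble one short text sequence for a BERT-style encoder.

    Only the last two utterances, a compact schema and the previous
    state are included. The exact layout is an assumption.
    """
    actions = " ".join(turn.system_actions)
    schema = " ".join(schema_intents + schema_slots)
    state = " ".join(f"{s}={v}" for s, v in sorted(prev_state.items()))
    return (
        f"[CLS] {actions} [SEP] {turn.user_utterance} "
        f"[SEP] {schema} [SEP] {state} [SEP]"
    )


def carry_over(slot, history, prev_states):
    """Look for a value of `slot` in earlier turns, then earlier states.

    Both lists are searched newest first; returns None if nothing is found.
    """
    for turn in reversed(history):
        if slot in turn.system_slot_values:
            return turn.system_slot_values[slot]
    for state in reversed(prev_states):
        if slot in state:
            return state[slot]
    return None


turn = Turn(["REQUEST(city)"], "I want to eat in San Jose")
sequence = encode_input(
    turn, ["city", "cuisine"], ["FindRestaurants"], {"cuisine": "Italian"}
)
```

In a full model, a sequence like `sequence` would be tokenized and passed through BERT, with separate classification heads reading different parts of the encoded output for intent prediction, requested slot prediction and slot filling.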

Files in this item:

File	Description	Size	Format
thesisKapelonis.pdf		2.69 MB	Adobe PDF
