Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17295
Title: Transfer Learning and Attention-based Conditioning Methods for Natural Language Processing
Authors: Μαργατίνα, Αικατερίνη
Ποταμιάνος, Αλέξανδρος
Keywords: deep learning
natural language processing
attention mechanism
transfer learning
machine learning
sentiment analysis
emotion recognition
language modelling
Issue Date: 5-Jul-2019
Abstract: In this work, we investigate methods to augment the inductive bias of deep neural models for natural language processing tasks. Our goal is to improve the performance of recurrent neural networks on a family of sentiment analysis tasks. Specifically, our research includes: (1) transferring knowledge from pretrained models in order to leverage different domains and tasks, and (2) integrating prior information from human experts into deep neural architectures.

First, we propose a method for utilizing a pretrained sentiment analysis classification model to reduce the test error rate on an emotion recognition classification task. Transfer learning from pretrained classifiers exploits the representation learned in one supervised setting with plenty of data to obtain competitive results on a related task where only a smaller dataset is available. We aim to leverage the learned representation of the pretrained sentiment model to tackle the emotion classification task.

Next, we utilize pretrained representations from language models to address the same emotion classification task. In this case, the learning algorithm uses information obtained during the unsupervised phase to perform better in the supervised learning stage. Specifically, pretrained word representations captured by language models are useful because they encode contextual information and model syntax and semantics. We propose a three-step transfer learning method that includes pretraining a language model, fine-tuning its weights on the target task and transferring the model to a classifier to leverage these representations. We show a 10% improvement over the WASSA 2018 emotion recognition baseline, achieving an F1-score of 70.3% and ranking in the top-3 positions of the shared task.

Finally, we experiment with feature-wise conditioning methods to integrate prior knowledge into deep neural networks. We propose the integration of lexicon features into the self-attention mechanism of RNN-based architectures. This form of conditioning on the attention distribution reinforces the contribution of the most salient words for the task at hand. We introduce three methods, namely attentional concatenation, feature-based gating and affine transformation. Experiments on six benchmark datasets show the effectiveness of our methods, with attentional feature-based gating yielding consistent performance improvements across tasks. Our approach is implemented as a simple add-on module for RNN-based models with minimal computational overhead and can be adapted to any deep neural architecture.

Overall, our work is divided into two main research areas: the first covers transfer learning methods for pretrained representations applied to implicit emotion recognition, while the second covers attention-based conditioning methods for integrating external knowledge into recurrent neural networks. Both lines of work culminated in research papers, [25] and [83] respectively.
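The three conditioning methods named above can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch module for a self-attention layer over RNN hidden states conditioned on per-token lexicon features; the class, argument and variable names are assumptions made for this example and do not come from the thesis code.

import torch
import torch.nn as nn

class ConditionedSelfAttention(nn.Module):
    """Self-attention over RNN hidden states h, conditioned on per-token
    lexicon features c via concatenation, gating, or an affine transform.
    Illustrative sketch only, not the author's released implementation."""

    def __init__(self, hidden_dim, feat_dim, method="gating"):
        super().__init__()
        self.method = method
        if method == "concat":
            # attentional concatenation: score the concatenated vector [h; c]
            self.score = nn.Linear(hidden_dim + feat_dim, 1)
        else:
            self.score = nn.Linear(hidden_dim, 1)
        if method == "gating":
            # feature-based gating: features produce a sigmoid gate on h
            self.gate = nn.Linear(feat_dim, hidden_dim)
        if method == "affine":
            # affine transformation: features produce a scale and shift of h
            self.gamma = nn.Linear(feat_dim, hidden_dim)
            self.beta = nn.Linear(feat_dim, hidden_dim)

    def forward(self, h, c):
        # h: (batch, seq_len, hidden_dim) RNN hidden states
        # c: (batch, seq_len, feat_dim) lexicon features per token
        if self.method == "concat":
            e = self.score(torch.cat([h, c], dim=-1))
        elif self.method == "gating":
            e = self.score(torch.sigmoid(self.gate(c)) * h)
        elif self.method == "affine":
            e = self.score(self.gamma(c) * h + self.beta(c))
        else:
            e = self.score(h)                       # plain self-attention
        a = torch.softmax(e.squeeze(-1), dim=1)     # attention distribution
        return (a.unsqueeze(-1) * h).sum(dim=1), a  # sentence vector, weights

For instance, the outputs of a BiLSTM encoder with shape (batch, seq_len, hidden_dim), together with per-token lexicon feature vectors of shape (batch, seq_len, feat_dim), would yield a lexicon-aware sentence representation for the downstream emotion or sentiment classifier.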
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17295
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in this item:
File: Eng_Thesis_Kate.pdf    Size: 3.99 MB    Format: Adobe PDF

