Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17783
Title: | Unsupervised Domain Adaptation for Natural Language Processing |
Authors: | Καρούζος, Κωνσταντίνος; Ποταμιάνος Αλέξανδρος |
Keywords: | Natural Language Processing; Unsupervised Domain Adaptation; Domain Adaptation; Sentiment Analysis; Language Modeling; Unsupervised Learning; Multitask Learning; Machine Learning |
Issue Date: | 2-Nov-2020 |
Abstract: | This diploma dissertation studies unsupervised domain adaptation for natural language processing applications, specifically for the problem of sentiment analysis. In the domain adaptation problem, data come from two distributions, a source domain and a target domain, while labels are available only for the source domain. The aim is to learn, using data from both domains, a model that generalizes well to examples from the target domain. In this dissertation we first study the theoretical background of machine learning at the level of model architectures, training algorithms and learning techniques. We then review developments in natural language processing, covering word vectors, language models and finally pretrained language models and BERT (Bidirectional Encoder Representations from Transformers). The literature has proposed a variety of approaches to the domain adaptation problem. These fall into three main categories: those that first learn the common features (pivots) between domains, those that develop models through domain adversarial training, and data-based approaches, which usually either learn pseudo-labels for the target domain or use pretrained language models. In the present work we propose a new approach to domain adaptation, based on BERT. It consists of two steps. The first step continues pretraining through masked language modeling on data from the target domain. In the final fine-tuning step we learn the task on labeled source data while keeping an auxiliary masked language modeling objective on unlabeled target data. The experimental part of this work comprises a set of comparative experiments between the proposed method and a set of previous methods and baselines. The experiments are conducted on the multi-domain (books, movies, electronics, kitchenware) Amazon reviews sentiment analysis dataset. The results show a significant improvement in accuracy for the proposed method over the previous state of the art. The work also includes an analysis of the results and a visualization of the features that are extracted and used for classification in each case. Finally, we discuss the limitations of the dominant approach, domain adversarial training, based on the relevant theory of learning from different domains and our experimental observations. |
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17783 |
Appears in Collections: | Διπλωματικές Εργασίες - Theses |
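As an illustration of the fine-tuning step described in the abstract, the following minimal Python sketch shows how a supervised loss on labeled source data might be combined with an auxiliary masked language modeling loss on unlabeled target data. It is not taken from the thesis: the function names (`mask_tokens`, `cross_entropy`, `joint_loss`), the convex weighting with `lam`, and the simplified masking rule are all assumptions made for illustration, and the probabilities in the demo are made-up stand-ins for model outputs.

```python
import math
import random

MASK = "[MASK]"  # placeholder token, named after BERT's mask token


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Replace a random subset of tokens with MASK; return (inputs, labels).

    labels[i] holds the original token where a mask was applied and None
    elsewhere, so only masked positions would contribute to the MLM loss.
    Simplification (assumption): every selected token becomes MASK.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels


def cross_entropy(probs, target):
    """Negative log-likelihood of `target` under a dict of probabilities."""
    return -math.log(probs[target])


def joint_loss(task_loss, mlm_loss, lam=0.5):
    """Weighted sum of the source-task loss and the target-domain MLM loss.

    The convex weighting and the default value of `lam` are assumptions.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * task_loss + (1.0 - lam) * mlm_loss


if __name__ == "__main__":
    rng = random.Random(0)
    # Unlabeled target-domain review: mask some tokens for the MLM objective.
    inputs, labels = mask_tokens("this blender is great".split(), 0.5, rng)
    print(inputs, labels)
    # Hypothetical model outputs for one source example and one masked token.
    task = cross_entropy({"pos": 0.8, "neg": 0.2}, "pos")
    mlm = cross_entropy({"blender": 0.25, "toaster": 0.75}, "blender")
    print(joint_loss(task, mlm, lam=0.5))
```

In a real setup both losses would come from a shared BERT encoder with a classification head and a language modeling head, so that gradients from the target-domain objective shape the same representation the classifier uses.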
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ckarouzos_thesis_uda_nlp.pdf | | 2.65 MB | Adobe PDF | View/Open |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.