Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19234
Title: | Explaining Multimodal Music Emotion and Genre Recognition |
Authors: | Σωτήρου, Θεόδωρος; Στάμου, Γιώργος |
Keywords: | Music Information Retrieval; Deep Learning; Multimodality; Music Genre Classification; Local Explanations; Multimodal Explainability |
Issue Date: | 17-Ιου-2024 |
Abstract: | Music Information Retrieval (MIR) is a field of research concerned with the extraction and analysis of information from music. Among other tasks, it includes music regression and classification, in particular mood detection and genre recognition. Alongside the growth of artificial intelligence (AI), MIR has seen significant advances, including the availability of extensive datasets, the integration of new technologies and multimodal approaches, and the development and application of advanced explainability methods. In this thesis, we explain multimodal models for music emotion and genre classification. First, we look for available datasets that provide multimodal and multi-task capabilities. We choose Music4All [54], which offers lyrics and audio as well as emotion and genre metadata for each song, and proceed by analysing, refining and slightly augmenting this dataset. We then use pretrained transformer architectures, namely the Robustly Optimized BERT Pretraining Approach (RoBERTa) and the Audio Spectrogram Transformer (AST), to classify songs into 9 distinct emotion and genre categories using their lyrics, their audio, and a combination of the two. Next, we look for methods to explain each model and propose a way to generate multimodal explanations from lyrics and audio using LIME [51] and its audio implementation audioLIME [25]. Finally, we generate global aggregates [35] of LIME explanations, providing insights into the models' performance and the models' ability to detect themes and elements distinct to each class. |
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19234 |
Appears in Collections: | Διπλωματικές Εργασίες - Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Diploma_Sotirou_Final.pdf | | 4.61 MB | Adobe PDF | View/Open |
All items on this site are protected by copyright.