Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19537
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ανδρώνη, Άρτεμις | -
dc.date.accessioned | 2025-03-15T12:09:45Z | -
dc.date.available | 2025-03-15T12:09:45Z | -
dc.date.issued | 2025-03-07 | -
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19537 | -
dc.description.abstract | Relapse prediction in severe mental health conditions such as bipolar disorder and schizophrenia spectrum disorders (SSD) remains a pressing challenge, often necessitating costly hospitalizations and causing significant disruptions in patients' lives. Recent advances in digital phenotyping offer the potential to monitor behavioral and physiological patterns continuously, thus enabling earlier intervention. This thesis builds upon the e-Prevention project, an integrated system designed to support patients with mental health disorders through machine learning-based relapse detection, by expanding its audio database and developing new models that fuse speech and biometric signals. First, we expand the original audio database to include additional patients and relapse cases. We then re-evaluate the Convolutional Autoencoder (CAE) and Convolutional Variational Autoencoder (CVAE) models developed during the e-Prevention project on this expanded dataset, confirming that the larger dataset improves anomaly detection in speech. Subsequently, we introduce LSTM-based autoencoders (LSTMAE, LSTMVAE) to capture temporal dependencies in speech signals, finding that the LSTMAE further enhances predictive performance, whereas the CVAE remains the strongest variational approach. To examine the benefits of multimodal fusion, we align audio recordings from clinical interviews with biometric data (heart rate variability, accelerometer, and gyroscope signals) collected from smartwatches. We design joint autoencoder frameworks that include biometric and audio branches and combine the learned representations of each modality's encoder into a unified latent space, which yields improved relapse detection compared to unimodal approaches. Experimental results indicate that personalized (patient-specific) models tend to outperform global models, highlighting the importance of tailoring these models to individual patients. Furthermore, ablation experiments through branch disabling validate the contribution of each modality, demonstrating that the joint models effectively leverage both audio and biometric data for improved relapse detection. In summary, this work demonstrates how integrating audio and biometric data through advanced autoencoder architectures can enhance the early detection of relapse in bipolar disorder and SSD, contributing to efforts aimed at more timely clinical interventions and personalized care for patients with mental health conditions. | en_US
dc.language | en | en_US
dc.subject | Anomaly Detection | en_US
dc.subject | Autoencoder Architectures | en_US
dc.subject | Mental Health Disorders | en_US
dc.subject | Digital Phenotyping | en_US
dc.subject | Machine Learning | en_US
dc.subject | Multimodal Fusion | en_US
dc.subject | Spontaneous Speech | en_US
dc.subject | Biometric Markers | en_US
dc.title | Analysis of Audio Signals and Biometric Markers for Supporting Patients with Mental Health Disorders | en_US
dc.description.pages | 140 | en_US
dc.contributor.supervisor | Μαραγκός Πέτρος | en_US
dc.department | Τομέας Σημάτων, Ελέγχου και Ρομποτικής (Division of Signals, Control and Robotics) | en_US
Appears in Collections:Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
thesis-androni-artemis.pdf | Final Thesis PDF | 8.05 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.