|Title:||Interpretable Deep Learning Models for Visual Data Analysis|
|Abstract:||The first part of this dissertation focuses on developing Deep Neural Networks for applications in several domains, such as Facial Expression Recognition, Medical Image Analysis and Autonomous Driving. One of the domains we focused on was Computer-Aided Diagnosis. We built Neural Networks with both Convolutional and Recurrent components to properly capture the relationships in the volumetric input data. As an extension to this model, we modified its training procedure so that it can adapt to new data without forgetting its previous training. The technique involves clustering the highest-level features extracted from the CNN. Each cluster is associated with a class; a sample is then classified according to which cluster lies closest to that sample's features. Finally, by giving an expert interpretation to each cluster, the model's predictions can be explained to a degree, giving the model a level of Interpretability. Through the various applications we examined, we ascertained the importance of Interpretability in Neural Networks. A phenomenon has been observed in the literature whereby an increase in Interpretability leads to decreased performance (or Fidelity, as it is referred to). This is commonly known as the Fidelity-Interpretability Tradeoff. We began by defining the concepts mentioned above, as well as metrics for their evaluation; more specifically, we defined two metrics: the Fidelity-Interpretability Ratio and the Fidelity-Interpretability Index. Our first study involved an extension of the classic Class Activation Mapping technique so that it provides more Interpretable maps. The idea was to train a complementary network that learns to produce high-resolution maps. 
By combining these with their original lower-resolution counterparts, we can generate very fine and sharp maps of the object that the CNN is “looking at” when it makes a prediction. A second, lower-level approach to NN Interpretability involved fundamentally rethinking the way these models work. Our goal was to train a network that could achieve the highest level of Interpretability without sacrificing any of its Fidelity. To accomplish this, we designed an architecture comprised of two parts: a “hider”, whose goal is to mask part of the input, and a “seeker”, which tries to classify the masked input. These two models are trained in a collaborative fashion on two objectives. The first is minimizing a standard classification loss, while the second is maximizing the percentage of the input that is masked.|
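The nearest-cluster classification rule described in the abstract can be illustrated with a minimal sketch. All names here (`nearest_cluster_classify`, the use of Euclidean distance) are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def nearest_cluster_classify(features, centroids, cluster_labels):
    """Classify each sample by the class of its nearest cluster centroid.

    features:       (n_samples, d) highest-level CNN feature vectors
    centroids:      (n_clusters, d) cluster centres
    cluster_labels: (n_clusters,) class associated with each cluster
    (Euclidean distance is an assumption; the thesis may use another metric.)
    """
    # distance from every sample to every centroid, via broadcasting
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)   # index of the closest cluster per sample
    return cluster_labels[nearest]   # predicted class = that cluster's class
```

Because each prediction is traceable to a specific, expert-interpreted cluster, the assignment itself doubles as an explanation.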
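One plausible way to combine a coarse Class Activation Map with a complementary high-resolution map is an elementwise product after upsampling; the abstract does not specify the exact combination rule, so the sketch below is purely illustrative:

```python
import numpy as np

def combine_maps(coarse_cam, fine_map):
    """Gate a high-resolution map by an upsampled low-resolution CAM.

    The elementwise-product combination is an assumption for illustration;
    the thesis does not state the exact fusion rule.
    """
    h, w = fine_map.shape
    # nearest-neighbour upsample the coarse CAM to the fine resolution
    ys = np.arange(h) * coarse_cam.shape[0] // h
    xs = np.arange(w) * coarse_cam.shape[1] // w
    upsampled = coarse_cam[np.ix_(ys, xs)]
    combined = upsampled * fine_map            # sharp detail, coarse localisation
    return combined / (combined.max() + 1e-8)  # normalise to [0, 1]
```

The product keeps the fine map's sharp edges while suppressing regions the coarse CAM considers irrelevant.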
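The two objectives of the hider/seeker training can be written as a single joint loss. The sketch below is a minimal numpy formulation under stated assumptions: `hider_seeker_loss`, the soft mask convention (1 = keep), and the `mask_weight` trade-off coefficient are all hypothetical names, not the thesis's actual formulation:

```python
import numpy as np

def hider_seeker_loss(mask, logits, y, mask_weight=0.1):
    """Joint objective for the collaborative hider/seeker pair (sketch).

    mask:   hider's per-element mask in [0, 1], where 1 = keep, 0 = hide
    logits: seeker's class scores for the masked input, shape (n, n_classes)
    y:      integer class labels, shape (n,)
    """
    # objective 1: standard cross-entropy classification loss
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    cls_loss = -log_probs[np.arange(len(y)), y].mean()
    # objective 2: maximise the hidden fraction, so subtract it from the loss
    hidden_frac = 1.0 - mask.mean()
    return cls_loss - mask_weight * hidden_frac
```

Minimising this quantity trains both parts collaboratively: the seeker must stay accurate while the hider is rewarded for hiding as much of the input as possible.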
|Appears in Collections:||Doctoral Dissertations - Ph.D. Theses|
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.