Development of Interpretable Machine Learning Models to Support Diabetes Management

Αθανασίου, Μαρία

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18671

Title:	Development of Interpretable Machine Learning Models to Support Diabetes Management
Authors:	Αθανασίου, Μαρία Νικήτα Κωνσταντίνα
Keywords:	Diabetes Mellitus; Glucose; Insulin; Artificial Pancreas; Prandial Bolus; Machine Learning; Risk Prediction; Interpretability; Unbalanced Data; Ensemble Learning; Cardiovascular Risk; Diabetic Ketoacidosis; Hospitalization Risk
Issue Date:	14-Feb-2023
Abstract:	This thesis addresses the design, development, and evaluation of interpretable machine learning models to support decision making in health. The proposed methods leverage heterogeneous data together with human-centered Artificial Intelligence (AI) technologies. They address issues such as the unbalanced nature of the available data and the need to produce interpretable decisions, with the goal of developing novel methodological frameworks that enable reliable decision support systems in health. Considering the epidemiological model of Diabetes Mellitus (DM) and the range of clinical use cases it entails, the metabolic disorder of DM is selected for the models’ development and evaluation. More specifically, data from Electronic Health Records (EHR), laboratory measurements, and glucose-insulin records are used to develop interpretable risk prediction models that support healthcare professionals in making informed decisions about the health status of people with DM, as well as computational systems that help people with DM achieve optimal glycemic control. The first part of the thesis focuses on the development of interpretable prediction models for (i) the risk of Cardiovascular Disease (CVD) incidence in patients with Type 2 Diabetes Mellitus (T2DM) and (ii) the risk of hospitalization and re-hospitalization due to Diabetic Ketoacidosis (DKA) or Hyperglycemia with Ketosis (HK) in patients with Type 1 Diabetes Mellitus (T1DM). To handle the unbalanced nature of the datasets, an ensemble learning strategy is adopted in which multiple individual models are generated and their decisions are combined to calculate the final risk scores. Explanations of the models’ decisions are produced using the SHapley Additive exPlanations (SHAP) method and the Local Interpretable Model-agnostic Explanations (LIME) method.
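As an illustration of the ensemble idea described above, the following is a minimal sketch and not the thesis's implementation. It assumes one common way of handling class imbalance: the majority class is split into disjoint chunks, each chunk is paired with all minority cases to train one model, and the final risk score is the average of the individual probabilities. The base learner here is a small logistic regression chosen only to keep the example dependency-free (the abstract names SOM, HWNNs, XGBoost, and LSTM as the learners actually used). The function names, the single toy feature, and the convention that label 1 is the minority class are all hypothetical.

```python
import math
import random

def predict_one(model, x):
    """Return the logistic model's probability of the positive class."""
    w, b = model
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit a small logistic regression with batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        errors = [predict_one((w, b), xi) - yi for xi, yi in zip(X, y)]
        for j in range(len(w)):
            w[j] -= lr * sum(e * xi[j] for e, xi in zip(errors, X)) / len(X)
        b -= lr * sum(errors) / len(X)
    return w, b

def train_balanced_ensemble(X, y, seed=0):
    """Train one model per disjoint majority-class chunk plus all minority cases.

    Assumption: label 1 is the (non-empty) minority class, label 0 the majority.
    """
    rng = random.Random(seed)
    minority = [i for i, yi in enumerate(y) if yi == 1]
    majority = [i for i, yi in enumerate(y) if yi == 0]
    rng.shuffle(majority)
    n_models = max(1, len(majority) // len(minority))
    models = []
    for k in range(n_models):
        idx = majority[k::n_models] + minority  # balanced training subset
        models.append(train_logistic([X[i] for i in idx], [y[i] for i in idx]))
    return models

def ensemble_risk(models, x):
    """Combine the individual decisions by averaging their probabilities."""
    return sum(predict_one(m, x) for m in models) / len(models)

# Toy data (hypothetical): 40 low-risk cases (label 0), 8 high-risk cases (label 1).
X = [[(i % 10) / 10.0] for i in range(40)] + [[2.0 + 0.1 * i] for i in range(8)]
y = [0] * 40 + [1] * 8
models = train_balanced_ensemble(X, y)
print(len(models), ensemble_risk(models, [2.5]), ensemble_risk(models, [0.2]))
```

Each individual model sees a balanced subset, so no single model is dominated by the majority class, while averaging lets every majority case contribute to the final score.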
The development and evaluation of computational models that assess CVD risk in patients with T2DM are based on data collected from a 5-year follow-up of 560 individuals with T2DM at the Hippokration General Hospital of Athens. The predictive power of Self-Organizing Maps (SOM) and Hybrid Wavelet Neural Networks (HWNNs), together with various combination schemes, is first investigated for building different ensemble models. The proposed ensemble learning strategy is subsequently deployed together with the XGBoost algorithm and the Tree SHAP interpretability method to develop an interpretable risk prediction model for CVD incidence in patients with T2DM. For the interpretable model assessing the risk of hospitalization and re-hospitalization due to DKA or HK in youth with T1DM, data collected from a two-year follow-up of 127 T1DM patients at the “Agia Sofia” Children’s Hospital, within the framework of the “SWEET” Initiative, are used for development and evaluation. Frequently identified risk factors for recurrent hospital admissions due to DKA or HK compose the model’s input space. Long Short-Term Memory (LSTM) neural networks, which handle sequential data efficiently, are used to build the ensemble model, while the LIME method is deployed to generate explanations of the ensemble model’s decisions. The models’ predictive performance is assessed in terms of discrimination and calibration. An explanatory analysis is also carried out to provide evidence of the proposed methods’ ability to capture the influence of risk factors and the effects of their underlying interactions.
Regulating the postprandial glucose response after meal ingestion is an arduous task in achieving optimal glycemic control. To address this challenge, the second part of the thesis proposes personalized systems for automated meal detection and for the estimation of prandial insulin boluses in people with T1DM who use Continuous Glucose Monitoring Systems (CGMS) and Continuous Subcutaneous Insulin Infusion Pumps (CSIIP). Data generated from the in silico patients of the UVA Padova T1DM Simulator are used for the development and evaluation of the proposed systems. The personalized computational models for detecting meal disturbances in people with T1DM are built with an ensemble learning strategy and LSTM, chosen for their ability to handle time-series data efficiently. Glucose measurements provided by the CGMS, as well as information about the ingested meals, compose the models’ input space. The models are assessed in terms of their discrimination ability and speed of detection, as well as their ability to handle the inter-subject variability among patients with T1DM.
A personalized insulin bolus recommendation system for people with T1DM is subsequently presented. The system aims to handle meal disturbances effectively through a personalized approach that adjusts to each patient’s specific parameters and needs, with the aim of maintaining postprandial blood glucose levels within the normal range. Its development relies on the combined use of Case-Based Reasoning (CBR) and SOM. With CBR, the solution to a new problem (i.e., a new meal) is based on the solutions (i.e., prandial insulin boluses) of similar past problems (i.e., past meals). SOM are deployed to cluster individual meal cases and, for each query case, to identify and retrieve similar cases for the calculation of an optimal prandial insulin bolus. The system is assessed in terms of its ability to handle meal disturbances effectively, as well as inter- and intra-subject variability.
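The retrieve-and-reuse step of CBR described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the system developed in the thesis: retrieval here is a plain k-nearest-neighbour search over the whole case base rather than SOM-based clustering, and the reused solution is a distance-weighted average of the retrieved boluses. The `MealCase` fields, the scaling constants, and all numeric values are hypothetical toy choices with no clinical meaning.

```python
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class MealCase:
    carbs_g: float                    # hypothetical feature: meal carbohydrates (g)
    glucose_mgdl: float               # hypothetical feature: pre-meal glucose (mg/dL)
    bolus_u: Optional[float] = None   # stored solution; None for a new query case

def distance(a, b, scales=(50.0, 100.0)):
    """Scaled Euclidean distance between two problem descriptions."""
    dc = (a.carbs_g - b.carbs_g) / scales[0]
    dg = (a.glucose_mgdl - b.glucose_mgdl) / scales[1]
    return math.hypot(dc, dg)

def recommend_bolus(case_base, query, k=3):
    """Retrieve the k most similar past meals and reuse their boluses."""
    nearest = sorted(case_base, key=lambda c: distance(c, query))[:k]
    # Closer cases get larger weights; the small constant avoids division by zero.
    weights = [1.0 / (distance(c, query) + 1e-6) for c in nearest]
    return sum(w * c.bolus_u for w, c in zip(weights, nearest)) / sum(weights)

# Toy case base (hypothetical values, for illustration only).
case_base = [
    MealCase(30.0, 110.0, 3.0),
    MealCase(60.0, 120.0, 6.0),
    MealCase(65.0, 150.0, 7.0),
    MealCase(90.0, 140.0, 9.5),
]
query = MealCase(62.0, 125.0)
print(recommend_bolus(case_base, query, k=3))
```

Clustering the case base first, as the abstract describes with SOM, would restrict the search to the cluster that the query case falls into instead of scanning every stored meal.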
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18671
Appears in Collections:	Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:

File	Description	Size	Format
PhD thesis athanasiou vfff_lib.pdf		10.42 MB	Adobe PDF
