Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19694
Title: | Machine-Learning-based prediction of human SERT protein ligand affinity using molecular docking and interaction analysis |
Authors: | Papanagnou, Dimitrios; Ματσόπουλος Γιώργος |
Keywords: | Serotonin Transporter (SERT); Molecular Docking; Supervised Classification; Machine Learning; Binding Affinity |
Issue Date: | 25-Jun-2025 |
Abstract: | This thesis presents a multidisciplinary computational approach for classifying potential inhibitors of the human serotonin transporter (SERT) into three categories: strong binders, moderate binders, and non-binders. SERT is the primary target of many antidepressants. The pipeline integrates molecular docking, molecular descriptor analysis, residue-level interaction profiling, and a supervised machine learning model in order to extract and clarify ligand-SERT interactions. A total of 74 compounds, with and without known pharmacological action, were studied. Most belong to broad antidepressant categories such as SSRIs, SNRIs, and TCAs; the rest belong to unrelated categories and are considered non-binders. Ligands were assigned to the three classes based on the available inhibition constant (Ki) values for human SERT from authoritative pharmacological sources. First, molecular docking was performed with “AutoDock Vina” and “Chimera” to generate the top ten binding poses for each ligand. The docking protocol was validated by comparing the predicted binding conformation of the known SSRI “Paroxetine” with the reference crystallographic structure from the Protein Data Bank (PDB: 5I6X), which yielded a nearly perfect alignment. A custom Python script then selected the top five of the ten poses by ranking them on binding affinity and root-mean-square deviation (RMSD). Detailed molecular and residue-level features were extracted with “BIOVIA Discovery Studio”, including “Surface_Area”, geometric angles, and distance-based features. Statistical analyses examined the correlation of each feature with the target variable (the class label) and checked for potential multicollinearity among the features. 
Notably, for strong binders, hydrophobic residues such as “ALA_173”, “ILE_172”, and “PHE_341” were found to be critical. In addition, distinct distributions were observed for “Polar_Surface_Area” and for angular features such as “ANGLE_HAY” and “GAMMA”. Several machine learning algorithms were trained, including Random Forest, XGBoost, LightGBM, Logistic Regression, SVM, and a Voting Classifier. Nested cross-validation was used to reduce the risk of overfitting; however, performance was moderate because the descriptor distributions of the moderate class overlap with those of the adjacent classes. Tree-based models performed best and also supported interpretation of model decisions through SHAP summary plots and partial dependence plots. These plots highlighted the most predictive features for the “STRONG BINDING” and “MODERATE BINDING” classes and confirmed that moderate binders confused the model. Given the mixed success of the models, the assumptions and limitations under which the thesis was conducted are outlined. The most important are the limited sample size, the static docking simulations, and the custom script that selected the five best poses. For future work, the study suggests incorporating a wider range of molecular dynamics simulations, docking against additional targets involved in antidepressant activity, such as NET and DAT, and using molecular fingerprints that capture atomic-level interactions. These additions are expected to improve classification accuracy and the validity of the results. This thesis lays a foundation for an innovative approach to detecting potential antidepressant drugs with the aid of several computational tools, but it requires considerable optimization before it can be considered reliable. |
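The pose-selection step mentioned in the abstract (keeping five of the ten docked poses per ligand) could look roughly like the sketch below. The abstract does not state the exact ranking rule, so this is only an assumption: poses are ordered by binding affinity (more negative first) and ties are broken by lower RMSD. The `Pose` class, the `select_top_poses` function, and all numeric values are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Pose:
    mode: int        # pose index as reported by the docking run
    affinity: float  # predicted binding affinity in kcal/mol (more negative = stronger)
    rmsd: float      # RMSD in angstroms relative to a reference pose


def select_top_poses(poses: List[Pose], k: int = 5) -> List[Pose]:
    """Return the k best poses under an ASSUMED rule:
    sort by affinity (ascending), then by RMSD (ascending) to break ties."""
    ranked = sorted(poses, key=lambda p: (p.affinity, p.rmsd))
    return ranked[:k]


# Illustrative values only; these are NOT results from the thesis.
example_poses = [
    Pose(1, -9.8, 0.0),
    Pose(2, -9.5, 1.9),
    Pose(3, -9.5, 1.2),
    Pose(4, -9.1, 3.4),
    Pose(5, -8.9, 2.7),
    Pose(6, -8.9, 5.1),
    Pose(7, -8.6, 4.0),
    Pose(8, -8.4, 6.3),
    Pose(9, -8.1, 2.2),
    Pose(10, -7.9, 7.5),
]

selected = select_top_poses(example_poses)
print([p.mode for p in selected])
```

A lexicographic sort is only one way to combine the two criteria. A weighted score or an RMSD cutoff would be equally plausible readings of the abstract, and the choice affects which poses feed the downstream feature extraction.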
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19694 |
Appears in Collections: | Μεταπτυχιακές Εργασίες - M.Sc. Theses |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Papanagnou_Dimitrios_25_June_2025.pdf | Diploma Thesis | 7.69 MB | Adobe PDF |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.