Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19792
Full metadata record (DC Field: Value [Language])
dc.contributor.author: Provatas, Nikodimos
dc.date.accessioned: 2025-10-14T08:19:46Z
dc.date.available: 2025-10-14T08:19:46Z
dc.date.issued: 2025-07-08
dc.identifier.uri: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19792
dc.description.abstract: Deep learning has transformed numerous fields by leveraging vast datasets and complex neural architectures, but the computational demands of modern models often exceed single-node capabilities, prompting distributed training solutions. This thesis investigates asynchronous training under the parameter server paradigm, focusing on enhancing both performance and stability. First, a thorough comparative analysis demonstrates that specialized distributed architectures deliver substantially higher throughput than general-purpose data-processing frameworks at large scales. A systematic literature review then identifies consistency control and the mitigation of stale gradients as pivotal challenges in asynchronous setups. To address these, a hybrid Strategy-Switch approach is introduced: training begins with synchronous communication to locate a promising solution region and then transitions to asynchronous updates according to an empirically derived switching criterion, achieving both rapid convergence and high model accuracy. Building on these insights, offline data-sharding techniques are proposed that preemptively balance sample distributions across workers, thereby reducing gradient variance and improving training consistency. Experimental results show that the proposed data-distribution strategies decrease variability in training and validation metrics by up to eightfold and twofold, respectively, compared to random assignment. Collectively, these contributions advance asynchronous distributed deep learning by offering concrete methods to reconcile speed and stability, supporting more scalable and reliable large-scale neural network training. [en_US] (An illustrative sketch of the data-sharding idea follows the metadata record below.)
dc.language: en [en_US]
dc.subject: parameter server [en_US]
dc.subject: εξυπηρετητής παραμέτρων [en_US]
dc.subject: deep learning [en_US]
dc.subject: βαθιά μηχανική μάθηση [en_US]
dc.subject: distributed learning [en_US]
dc.subject: κατανεμημένη εκπαίδευση [en_US]
dc.subject: asynchronous learning [en_US]
dc.subject: ασύγχρονη εκπαίδευση [en_US]
dc.subject: data management [en_US]
dc.subject: διαχείριση δεδομένων [en_US]
dc.subject: big data [en_US]
dc.subject: "μεγάλα" δεδομένα [en_US]
dc.title: Study and optimization of distributed deep learning under the parameter server architecture [en_US]
dc.description.pages: 188 [en_US]
dc.contributor.supervisor: Κοζύρης Νεκτάριος (Nectarios Koziris) [en_US]
dc.department: Τομέας Τεχνολογίας Πληροφορικής και Υπολογιστών (Division of Computer Science) [en_US]
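To make the offline data-sharding idea from the abstract concrete, the following is a minimal sketch of one way to pre-balance label distributions across workers before training. The function name stratified_shards, the per-class round-robin assignment, and the synthetic labels are illustrative assumptions; they are not the exact sharding algorithms or the Strategy-Switch criterion developed in the thesis itself.

```python
import random
from collections import defaultdict

def stratified_shards(labels, num_workers, seed=0):
    """Assign sample indices to workers so each worker's shard roughly
    mirrors the global label distribution (illustrative sketch only)."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class round-robin across workers, so no worker is starved
    # of any class and per-worker gradients stay closer to the global one.
    shards = [[] for _ in range(num_workers)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for offset, idx in enumerate(indices):
            shards[offset % num_workers].append(idx)

    # Shuffle within each shard so batches are not ordered by class.
    for shard in shards:
        rng.shuffle(shard)
    return shards

# Example with synthetic labels: 10 classes, 10 000 samples, 4 workers.
labels = random.Random(1).choices(range(10), k=10_000)
shards = stratified_shards(labels, num_workers=4)
print([len(s) for s in shards])  # roughly equal shard sizes
```

In such a scheme, each worker sees approximately the same class mix, which is the kind of pre-balancing the abstract credits with reducing gradient variance relative to random assignment.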
Appears in Collections: Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:
File: phd_thesis_final_Sep (3).pdf (9.13 MB, Adobe PDF)


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.