Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19554
Title: Co-scheduling algorithms for HPC applications
Authors: Κελλάρη, Μυρσίνη
Γκούμας, Γεώργιος
Keywords: High Performance Computing (HPC)
co-scheduling
co-scheduling algorithms
simulation
performance metrics
Issue Date: 7-Mar-2025
Abstract: This thesis explores the development and evaluation of co-scheduling algorithms for High-Performance Computing (HPC) systems, aiming to optimize resource utilization while maintaining high system performance and user satisfaction. The growing demand for computational power in fields such as scientific research, artificial intelligence, and big data analytics has made HPC systems essential. However, these systems often suffer from underutilization of resources, leading to increased energy consumption and operational costs. Traditional scheduling algorithms, such as First Come First Serve (FCFS) and EASY, cannot provide a solution. To address these challenges, co-scheduling is proposed as a solution. Co-scheduling allows multiple jobs to share computational nodes, reducing resource contention and improving system efficiency. This is particularly beneficial when co-allocated jobs have different resource demands, such as memory-intensive and compute-intensive tasks, which can lead to improved system performance. However, co-scheduling also introduces challenges, such as inter-job interference and fairness issues, which must be carefully managed. The research introduces several co-scheduling algorithms, including EASY Co-schedule, Largest Area First Co-schedule (LAF-Co), Popularity, Shortest Job First Co-schedule (SJF-Co), Longest Job First Co-schedule (LJF-Co), Filler, and Two Factors. These algorithms are evaluated using the Efficient Lightweight Scheduling Estimator (ELiSE), a Python-based simulator that enables controlled testing of scheduling policies. The evaluation is based on key metrics such as makespan speedup (system performance) and mean job slowdown (user satisfaction). Experimental results demonstrate that co-scheduling algorithms, particularly SJF-Filler (a Two Factors variant), achieve significant improvements in makespan speedup and mean job speedup, while maintaining low mean slowdown values.
These algorithms effectively balance system performance and user satisfaction, making them promising candidates for real-world HPC systems. However, co-scheduling can lead to increased execution times for individual jobs, highlighting the trade-off between system efficiency and user experience. The findings suggest that co-scheduling can enhance the performance and efficiency of HPC systems, but careful management is required to ensure fairness and user satisfaction. Future work includes testing the algorithms on real HPC systems, exploring alternative colocation strategies, and integrating machine learning techniques to further optimize scheduling decisions.
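The two evaluation metrics named in the abstract can be illustrated with a short sketch. It assumes the definitions commonly used in scheduling studies: makespan speedup as the baseline makespan divided by the candidate makespan, and job slowdown as turnaround time divided by the job's runtime when it runs alone. The thesis may define them differently. `JobRecord` and the function names are hypothetical and are not part of ELiSE.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class JobRecord:
    """One finished job in a simulated schedule (hypothetical structure)."""
    submit: float        # time the job entered the queue
    finish: float        # time the job completed
    solo_runtime: float  # assumed reference: runtime with exclusive node access


def makespan(jobs):
    # Time from the first submission to the last completion.
    return max(j.finish for j in jobs) - min(j.submit for j in jobs)


def makespan_speedup(baseline, candidate):
    # Values above 1 mean the candidate schedule finishes the workload sooner.
    return makespan(baseline) / makespan(candidate)


def slowdown(job):
    # Turnaround time (waiting plus running) relative to the solo runtime.
    return (job.finish - job.submit) / job.solo_runtime


def mean_slowdown(jobs):
    return mean(slowdown(j) for j in jobs)


# Two jobs submitted at t=0, each needing 10 time units alone on one node.
# Exclusive allocation: they run back to back.
baseline = [JobRecord(0, 10, 10), JobRecord(0, 20, 10)]
# Co-scheduled on the same node: both start at once, but interference
# stretches each run to 16 time units (an invented figure).
cosched = [JobRecord(0, 16, 10), JobRecord(0, 16, 10)]

print(makespan_speedup(baseline, cosched))
print(mean_slowdown(baseline), mean_slowdown(cosched))
```

In this toy workload the co-scheduled run finishes the whole batch sooner, while the first job's own turnaround grows from 10 to 16 time units. That is the trade-off between system efficiency and individual job execution time described in the abstract.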
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19554
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File: Kellari_Myrsini_Thesis_final.pdf
Description: Diploma Thesis file
Size: 2.77 MB
Format: Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.