Title: Resource Aware GPU Scheduling in Kubernetes Infrastructure
Abstract: Nowadays, an ever-increasing number of Artificial Intelligence (AI) and Machine Learning (ML) workloads are pushed to and executed on the Cloud. To serve and manage these huge computational demands, data center operators and cloud providers have provisioned GPU resources at the scale of thousands of nodes. Since GPUs are relatively new to the cloud stack, support for efficient GPU management is lacking: state-of-the-art schedulers and orchestrators treat GPUs only as a generic resource constraint, ignoring their unique characteristics and application properties. In addition, users tend to request more GPU resources than they actually need, leading to resource under-utilization. In this thesis, we design a resource-aware GPU scheduling system able to efficiently colocate applications arriving at a data center on the same card. We integrate our solution with Kubernetes, one of the most widely used cloud orchestration frameworks today. We show that our scheduler achieves better quality of service (QoS) and higher resource utilization than state-of-the-art schedulers for a variety of representative ML cloud workloads.
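For context on the resource model the abstract refers to: stock Kubernetes exposes GPUs only as opaque, integer-counted extended resources (e.g. `nvidia.com/gpu` registered by the NVIDIA device plugin), so the default scheduler sees a card count but nothing about GPU memory or compute sharing. A minimal illustrative pod spec follows; the pod name and container image are placeholders, not artifacts from the thesis:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-job              # placeholder name
spec:
  containers:
  - name: trainer
    image: example.com/trainer:latest   # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1            # GPUs are requested as whole, opaque units;
                                     # the default scheduler matches only this count,
                                     # not per-card memory or utilization
```

Note that extended resources such as `nvidia.com/gpu` may only appear under `limits` and cannot be fractional, which is why colocating multiple applications on one card requires scheduling logic beyond the default Kubernetes resource model.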
Appears in Collections: Διπλωματικές Εργασίες - Theses
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.