Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18175
Title: Deep Reinforcement Learning for Tail-Latency Regulation in Co-located Applications through Cooperative Core and Cache Allocation
Authors: Κιμωνίδης, Αλέξανδος
Σούντρης Δημήτριος
Keywords: cloud computing, resource management, deep reinforcement learning, scheduling, performance monitoring counters
Issue Date: 25-Oct-2021
Abstract: The number of workloads running in the cloud keeps growing. To serve and manage this demand efficiently, data center operators and cloud providers have embraced workload co-location and multi-tenancy as first-class system design concerns. Current state-of-the-art resource managers place applications on the available pool of resources using standard metrics such as CPU or memory usage. As a result, they fail to achieve adequate resource utilization. In this thesis, we design a resource manager that uses deep reinforcement learning for its policy and relies on performance monitoring counters, a more detailed metric that can characterize a machine's current state. We demonstrate the impact of stressing different server resources and the need for a scheduler that considers the right metrics. We integrate our solution with OpenAI Gym, one of the most widely used toolkits for developing and comparing reinforcement learning algorithms, and show that it achieves higher resource utilization than the default scheduler and other state-of-the-art schedulers.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18175
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File: Diploma_Thesis_Kimonides__Version_1227.pdf
Size: 5.1 MB
Format: Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.