Deep Reinforcement Learning for Tail-Latency Regulation in Co-located Applications through Cooperative Core and Cache Allocation

Κιμωνίδης, Αλέξανδος

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18175

Title:	Deep Reinforcement Learning for Tail-Latency Regulation in Co-located Applications through Cooperative Core and Cache Allocation
Authors:	Κιμωνίδης, Αλέξανδος; Σούντρης, Δημήτριος
Keywords:	cloud computing, resource management, deep reinforcement learning, scheduling, performance monitoring counters
Issue Date:	25-Oct-2021
Abstract:	The number of workloads run in the cloud is growing continuously. Data center operators and cloud providers have embraced workload co-location and multi-tenancy as first-class system design concerns to efficiently service and manage these massive computing needs. Current state-of-the-art resource managers place applications on the available pool of resources using standard metrics such as CPU or memory usage and, as a result, fail to achieve adequate resource utilization. In this thesis, we design a resource manager that leverages deep reinforcement learning for its policy and uses performance monitoring counters, a richer metric that can characterize a machine's current state. We showcase the impact of applying stress on different server resources and the need for a better scheduler that considers the correct metrics. We integrate our solution with OpenAI Gym, one of the most widely used toolkits for developing and comparing reinforcement learning algorithms, and we show that we can achieve higher resource usage compared to the default scheduler as well as other state-of-the-art schedulers.
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18175
Appears in Collections:	Διπλωματικές Εργασίες - Theses

Files in This Item:

File	Description	Size	Format
Diploma_Thesis_Kimonides__Version_1227.pdf		5.1 MB	Adobe PDF
