Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17662
Title: Design and implementation of an intelligent mechanism capable of sharing resources, in multicore systems, using Deep Reinforcement Learning
Authors: Mandilaras, Nikiforos
Κοζύρης Νεκτάριος
Keywords: Multiprocessors, Shared cache, LLC, Cache partitioning, coexecution, Intel RDT, Reinforcement Learning, Neural Nets, Deep Reinforcement Learning, DQN
Issue Date: 31-Aug-2020
Abstract: The average utilization of servers in modern data centers is extremely low, not exceeding 50%. The reason is the Service-Level Agreements (SLAs) that providers sign with their customers: to honor those agreements, providers prefer to execute services in isolation. The need for isolation arises from competition for shared resources, such as the last-level cache (LLC). Contention among co-executed applications degrades the performance of the services and makes it hard to guarantee their agreed level of performance. To deal with such situations, modern processors now integrate technologies that support both usage monitoring and partitioning of shared resources. In this thesis, we combine these technologies with deep reinforcement learning methods to implement an intelligent mechanism for partitioning the last-level cache of a multicore system. The goal is to maintain the performance of a latency-critical service when it is co-executed with other applications, while also increasing the utilization of system resources. Reinforcement learning automates the pursuit of such goals, using agents that explore a state space and exploit the knowledge they gather from the environment to make appropriate decisions. We evaluate our mechanism in co-executions of the Memcached service with machine learning workloads. We demonstrate that the mechanism can consistently protect the performance of the critical service and at the same time increase the throughput of low-priority applications. Finally, we show that the training of neural networks offers opportunities to generalize the acquired knowledge and use it in new applications.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17662
Appears in Collections: Μεταπτυχιακές Εργασίες - M.Sc. Theses

Files in This Item:
File: thesis_resource_allocation_reinforcement_learning_nikiforos_mandilaras.pdf
Size: 2.5 MB
Format: Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.