Elastic Resource Management in Microservices Architectures: A Hybrid Approach Combining Reinforcement Learning, Supervised Learning and Critical Path Extraction

Τσικριτέας, Παναγιώτης

National Technical University of Athens

School of Electrical and Computer Engineering

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19913

Title:	Elastic Resource Management in Microservices Architectures: A Hybrid Approach Combining Reinforcement Learning, Supervised Learning and Critical Path Extraction
Authors:	Τσικριτέας, Παναγιώτης; Κοζύρης, Νεκτάριος
Keywords:	Kubernetes; Resource Allocation; Elasticity; Cloud Computing; Machine Learning; Reinforcement Learning; DeathStarBench; FIRM; CRISP; Microservices Architectures
Issue Date:	7-Nov-2025
Abstract:	The increasing computational demands and scalability limitations of monolithic architectures have driven the adoption of microservices-based architectures, in which applications are composed of loosely coupled, independently deployable services. To manage that complexity automatically, platforms like Kubernetes have become the standard in both industry and academia, owing to their resilience, scalability and versatile support for orchestrating microservices, especially since the introduction of scalers such as the Horizontal Pod Autoscaler (HPA). However, these scalers rely on simplistic threshold-based heuristics that do not respond well to the complex workload patterns microservices-based systems encounter. To address this limitation, this thesis proposes a resource management pipeline that combines supervised learning with probabilistic calibration, used to identify the critical components in the system's trace data, and reinforcement learning, used to allocate Kubernetes resources effectively. The work uses CRISP to extract the system's constantly changing critical paths, which guide the decision-making of the reinforcement learning agent. To assess the value of this extracted information, the thesis compares three reinforcement learning agents: Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO) and Synchronous Advantage Actor-Critic (A2C), with the PPO agent trained for two different numbers of episodes. The agents are evaluated on their ability to manage Kubernetes resources efficiently, using end-to-end latency at the 50th, 95th and 99th percentiles and the average number of Pods deployed in the cluster, and each agent's limitations are highlighted on the basis of the same metrics. The evaluation results demonstrate that long-term training is necessary in such complex systems for an agent to learn to allocate resources optimally.
Notably, even with limited training, the agents achieved strong performance relative to one another and to the Kubernetes HPA (KHPA) baseline, showcasing the importance of such agents in Kubernetes clusters.
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19913
Appears in Collections:	Διπλωματικές Εργασίες - Theses

Files in This Item:

File	Description	Size	Format
diploma_thesis_panagiotis_tsikriteas.pdf		5.22 MB	Adobe PDF
