Title: Hardware and Software Co-design for Efficient Memory Access
Authors: Αλβέρτη, Χλόη
Γκούμας, Γεώργιος
Keywords: virtual memory, file systems, memory management, persistent memory, operating systems, computer architecture
Issue Date: 5-Dec-2022
Abstract: Virtual memory is a crucial computing abstraction that has stood the test of time. The level of indirection it introduces facilitates programming, e.g., it creates the illusion that physical memory is vast, linear, and private to each application, and it enables access to I/O devices through the memory space; it also assists agile resource management. However, these fundamental properties do not come for free. Virtual memory requires that the Operating System (OS) maintain the memory abstraction that each process perceives, the virtual address space indirection, and map it to actual physical resources. It also requires that every CPU memory access go through a translation step. Neither mechanism is cheap and, if anything, their costs are becoming more and more pronounced.

Four trends stress the performance of virtual memory today: (i) the meteoric rise in memory demands and capacities, (ii) the seismic shift of users from enterprise data centers to the cloud, (iii) the rapid evolution of high-performance storage devices, and (iv) the increasing heterogeneity in both the compute and storage landscape of data-center systems. The first two considerably raise the efficiency bar for address translation; the latter two urge us to revisit the semantics of the legacy virtual memory interfaces and, consequently, their design. This thesis contributes in both directions.

The exponential growth of global data and the corresponding increase in the memory demands of workloads led virtual memory's dominant implementation, paging, to hit the Address Translation Wall almost a decade ago. In this thesis we show that even though vendors have since tripled the translation hardware budget, e.g., by incorporating larger TLBs and MMU caches and better huge page support, memory-intensive workloads can still spend up to 30% of their execution time in address translation, especially when they run in virtualized environments. To deal with paging's poor performance scaling, this thesis proposes synergistic software and hardware mechanisms that create and exploit linearity in mappings. We propose Contiguity-Aware (CA) paging, a novel memory management technique that enhances the OS page fault handler with hints to allocate target pages so that vast contiguous virtual-to-physical mappings form per process. CA paging is applicable to both native and nested paging, and it retains all the lightweight memory management techniques of a modern OS, e.g., demand paging and Copy-On-Write, while avoiding any memory reservation or pre-allocation. We implement our proposal in stock Linux and make it publicly available. On the hardware side, to harvest the generated contiguity, we propose Speculative Offset Address Translation (SpOT), a micro-architectural engine that exploits the underlying linearity of mappings to predict address translations on the TLB miss path. While most state-of-the-art hardware proposals fail to support virtualization, due to the architectural complexity of tracking and caching arbitrarily sized mappings in two-dimensional execution, SpOT is directly and transparently applicable to both native and virtualized environments because it works entirely at the micro-architectural level.
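To make the prediction idea concrete, the following is a minimal sketch (ours, not code from the thesis) of an offset-based translation predictor: once CA paging has created a contiguous mapping, the difference PA - VA is a single constant across the whole region, so a small table of per-region offsets can guess the translation on a TLB miss while the regular page walk verifies it off the critical path. The region granularity, table size, and all identifiers are illustrative assumptions.

    /* Sketch of offset prediction on the TLB miss path; parameters are
     * illustrative, not the thesis design. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define REGION_SHIFT 30              /* assume 1 GiB prediction regions */
    #define TABLE_SIZE   64

    typedef struct {
        bool     valid;
        uint64_t region;                 /* VA >> REGION_SHIFT */
        int64_t  offset;                 /* PA - VA observed for this region */
    } spot_entry_t;

    static spot_entry_t table[TABLE_SIZE];

    /* Predict the physical address on a TLB miss; false if no entry. */
    static bool spot_predict(uint64_t va, uint64_t *pa_pred)
    {
        uint64_t region = va >> REGION_SHIFT;
        spot_entry_t *e = &table[region % TABLE_SIZE];
        if (e->valid && e->region == region) {
            *pa_pred = (uint64_t)((int64_t)va + e->offset);
            return true;                 /* speculate; the page walk verifies */
        }
        return false;
    }

    /* Train on the verified translation once the page walk completes. */
    static void spot_train(uint64_t va, uint64_t pa)
    {
        uint64_t region = va >> REGION_SHIFT;
        spot_entry_t *e = &table[region % TABLE_SIZE];
        e->valid  = true;
        e->region = region;
        e->offset = (int64_t)pa - (int64_t)va;
    }

    int main(void)
    {
        uint64_t pa;
        spot_train(0x7f1234560000ULL, 0x0004567890000ULL); /* verified walk */
        if (spot_predict(0x7f1234567000ULL, &pa))           /* same region */
            printf("predicted PA = 0x%llx\n", (unsigned long long)pa);
        return 0;
    }

The predicted address is only consumed speculatively; if the verifying page walk disagrees, the core squashes and replays, which is why the mechanism remains invisible at the architectural level and carries over transparently to nested paging.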
Combined with CA paging, SpOT reduces the translation overheads of nested paging from ∼16.5% to ∼0.9% on average for memory-intensive workloads, in a design that trades architectural complexity for strong security guarantees.

Apart from being a physical memory abstraction, virtual memory is also an important interface towards I/O devices; file mappings allow applications to access persistent data via memory dereferences. However, high-performance storage has evolved significantly over the past decade, and today's devices offer single-digit or even sub-microsecond latencies, exposing the kernel's software I/O stack as a prohibitively expensive data path. In this thesis, we study the case of persistent memory (PMem) and the direct access (DAX) file interface. With PMem and DAX, virtual memory can map storage locations directly into user space, enabling persistent data access via CPU load/store instructions and forming the shortest existing path to storage. Yet our study finds that virtual memory operations often throttle direct access performance, failing to deliver what the underlying hardware can provide. We break down all sources of overhead in using memory as a file interface, and we study how the expensive mechanisms of virtual memory are affected by the new fast storage technology, or whether they even become obsolete. Based on our analysis, we propose a new interface for fast and scalable direct access to persistent data. DaxVM is a POSIX-relaxed file mapping interface for persistent memory, implemented by redesigning virtual memory operations and extending PMem-aware file systems, with all changes driven by the unique characteristics of direct access. DaxVM supports (i) O(1) memory-mapping operations via persistent page tables integrated into the file system's inode metadata, (ii) lazy invalidation of the TLBs, (iii) scalable address space management for ephemeral mappings, (iv) elimination of kernel-space durability management when user space is in charge, and (v) asynchronous storage block pre-zeroing by the file system to accelerate DAX append operations. We implement DaxVM in stock Linux, on top of the ext4-DAX and NOVA file systems, and make it publicly available. We evaluate it on a real system equipped with Intel Optane. For multi-threaded workloads that process many small files over short intervals, e.g., Apache, DaxVM improves the performance of standard mmap by up to 4.9x. It also reverses the trend that favors read for such setups, outperforming it by up to 1.5x. DaxVM further increases system availability, providing fast boot times for PMem databases, and sustains high throughput even when they run on fragmented file system images.

Overall, this thesis revisits today's virtual memory design and proposes hardware and software techniques that extend it to (i) scale better with ever-increasing memory capacities, through efficient address translation, and (ii) form a dedicated file mapping interface that pushes performance to the limits of what the underlying hardware can provide for direct access to persistent data.
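For context on the baseline that DaxVM improves, the sketch below shows the standard Linux DAX mapping path rather than DaxVM itself: a file on a DAX-mounted file system is mapped with MAP_SHARED_VALIDATE | MAP_SYNC (available since Linux 4.15), written with plain stores, and made durable from user space with cache-line write-backs and a fence. The mount point and file name are illustrative; the x86 flush intrinsic requires compiling with -mclflushopt.

    /* Standard Linux DAX path: direct user-space access to persistent data.
     * Assumes /mnt/pmem is a DAX-capable mount; names are illustrative. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <immintrin.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define CACHELINE 64

    static void persist(const void *addr, size_t len)
    {
        uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
        for (; p < (uintptr_t)addr + len; p += CACHELINE)
            _mm_clflushopt((void *)p);  /* write back each dirty cache line */
        _mm_sfence();                   /* order flushes before continuing */
    }

    int main(void)
    {
        int fd = open("/mnt/pmem/data", O_RDWR);
        if (fd < 0)
            return 1;

        /* MAP_SYNC (with MAP_SHARED_VALIDATE) guarantees a direct mapping
         * whose file metadata is always durable, so no fsync/msync is
         * needed; CPU caches are flushed explicitly in persist() below. */
        char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                         MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (buf == MAP_FAILED)
            return 1;

        memcpy(buf, "hello, pmem", 12); /* persistent data via plain stores */
        persist(buf, 12);               /* user space manages durability */

        munmap(buf, 4096);
        close(fd);
        return 0;
    }

Note that even on this shortest path, every mmap/munmap pair still pays for kernel page-table construction, teardown, and TLB shootdowns; these are precisely the costs that DaxVM's persistent page tables, lazy TLB invalidation, and scalable address space management are designed to remove.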
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18662
Appears in Collections: Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:
File: ThesisFinalTinos.pdf (6.54 MB, Adobe PDF)


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.