Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19578
Title: | Microarchitectural Extension of CGRA Accelerator for Efficient LLM Code Mapping |
Authors: | Kefallinos, Dionysios; Ξύδης, Σωτήριος |
Keywords: | CGRA; Large Language Models; Microarchitectural extension; Edge acceleration |
Issue Date: | 14-Mar-2025 |
Abstract: | In recent years, the computational demands of Large Language Models (LLMs) have steadily increased, driven by their expanding range of applications and the growth of their parameter counts. A key emerging trend is the shift of inference workloads closer to the user, leveraging edge devices and specialized agents. In this work, we explore the R-Blocks CGRA accelerator as a potential platform for running such workloads efficiently. Our contributions are twofold: first, we extend the microarchitecture and compilation toolchain (OpenASIP) of R-Blocks to support floating-point arithmetic, which is necessary for efficient LLM inference; second, we implement and benchmark LLM workloads on the reconfigurable hardware, investigating various architectural choices and parallelization strategies. Finally, we evaluate our design in a 22 nm FD-SOI ASIC implementation, providing insights into its performance, energy efficiency, and area footprint, and assessing the viability of our approach for edge-based LLM inference. |
URI: | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19578 |
Appears in Collections: | Diploma Theses (Διπλωματικές Εργασίες) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Dion_Thesis.pdf | Corrected | 3.92 MB | Adobe PDF |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.