Microarchitectural Extension of CGRA Accelerator for Efficient LLM Code Mapping

Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19578

Title:	Microarchitectural Extension of CGRA Accelerator for Efficient LLM Code Mapping
Authors:	Kefallinos, Dionysios; Ξύδης Σωτήριος
Keywords:	CGRA; Large Language Models; Microarchitectural extension; Edge acceleration
Issue Date:	14-Mar-2025
Abstract:	In recent years, the computational demands of Large Language Models (LLMs) have been steadily increasing, driven by their expanding range of applications and their growing parameter counts. A key emerging trend is the shift of inference workloads closer to the user, leveraging edge devices and specialized agents. In this work, we explore the R-Blocks CGRA accelerator as a potential platform for running such workloads efficiently. Our contributions are twofold: first, we extend the microarchitecture and compilation toolchain (OpenASIP) of R-Blocks to support floating-point arithmetic, which is necessary for efficient LLM inference; second, we implement and benchmark LLM workloads on the reconfigurable hardware, investigating various architectural choices and parallelization strategies. Finally, we evaluate our design in a 22 nm FD-SOI ASIC implementation, providing insights into its performance, energy efficiency, and area footprint, and assessing the viability of our approach for edge-based LLM inference.
URI:	http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19578
Appears in Collections:	Diploma Theses

Files in This Item:

File	Description	Size	Format
Dion_Thesis.pdf	Corrected	3.92 MB	Adobe PDF