Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19793
Title: LUMAX: A LUT-Based Mixed-Precision Accelerator for LLM Inference on the Edge
Authors: Ιωάννου, Κωνσταντίνος
Σούντρης Δημήτριος
Keywords: LUT
Mixed-Precision
Accelerator
GEMM
Low-bit LLM
Issue Date: 29-Sep-2025
Abstract: In recent years, the rapid growth of large language models (LLMs) has increased demand for efficient inference on both datacenter and edge platforms. While quantization reduces computation and memory costs, mixed-precision operations, where activations remain in higher precision while weights are quantized to lower bitwidths, remain inefficient on general-purpose hardware. Lookup Table (LUT)-based methods offer a promising alternative, yet achieving an optimal balance of memory usage, flexibility, and workload adaptability remains challenging. We propose LUMAX, a fully integrated LUT-based mixed-precision GeMM accelerator for energy-efficient LLM inference. LUMAX features a reconfigurable hardware design that efficiently supports different activation and weight bitwidths. To reduce LUT overhead, we employ a quarter-size LUT (¼-LUT) with efficient indexing and data packaging, minimizing storage and data transfer. LUMAX has been implemented as a tightly coupled RocketChip Co-processor (RoCC), enabling seamless integration with RISC-V cores. By extending key ideas from recent LUT-based designs and combining them with full processor integration and reconfigurable hardware, LUMAX provides a flexible, power-efficient accelerator for quantized LLM inference, blending hardware adaptability, software usability, and architectural efficiency. Evaluation results show that LUMAX, prototyped on a ZCU106 FPGA, reduces FPGA LUT and DSP usage by up to 33% and 96%, respectively, achieves 79% fewer cycles, and delivers up to 4.7× speedup on LLaMA2, with up to 70% improved energy efficiency over prior GeMM accelerators such as Gemmini.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19793
Appears in Collections: Diploma Theses
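
The abstract refers to LUT-based mixed-precision GeMM without showing the underlying idea. The following is a minimal sketch of the generic technique only, not of LUMAX itself: it assumes unsigned low-bit weights, integer activations, and a group size of 4, and it does not reproduce the thesis's ¼-LUT indexing, data packaging, or RoCC interface. The names `G`, `build_lut`, and `lut_dot` are hypothetical.

```python
G = 4  # activations per lookup group (illustrative choice, not from the thesis)


def build_lut(acts):
    """Return all 2**G partial sums of one group of G activations.

    Entry idx holds the sum of acts[i] for every bit i that is set in idx.
    """
    assert len(acts) == G
    table = [0] * (1 << G)
    for idx in range(1, 1 << G):
        low = idx & -idx  # lowest set bit of idx
        # Reuse the entry without that bit and add the matching activation.
        table[idx] = table[idx ^ low] + acts[low.bit_length() - 1]
    return table


def lut_dot(acts, weights, wbits):
    """Dot product of integer activations with unsigned wbits-bit weights.

    Each weight bit-plane of a group selects one table entry, so the
    multiplications are replaced by lookups, shifts, and additions.
    """
    assert len(acts) == len(weights) and len(acts) % G == 0
    total = 0
    for start in range(0, len(acts), G):
        table = build_lut(acts[start:start + G])
        group = weights[start:start + G]
        for b in range(wbits):
            idx = 0
            for i, w in enumerate(group):
                idx |= ((w >> b) & 1) << i  # gather bit b of each weight
            total += table[idx] << b  # weight the lookup by 2**b
    return total


acts = [3, -1, 4, 1, -5, 9, 2, -6]
weights = [1, 0, 3, 2, 2, 1, 0, 3]  # 2-bit unsigned weights
result = lut_dot(acts, weights, wbits=2)
reference = sum(a * w for a, w in zip(acts, weights))
print(result == reference)
```

In a matrix product, the table for an activation group would be built once and reused across every weight row, which is where the savings over per-element multiplication come from; the table size of 2**G entries per group is the storage cost that reduced-size LUT schemes aim to cut.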

Files in This Item:
File: kostis_ioannou_thesis.pdf
Size: 13.74 MB
Format: Adobe PDF

