Title: From Circuits to SoC Processors: Arithmetic Approximation Techniques & Embedded Computing Methodologies for DSP Acceleration
Authors: Λέων, Βασίλειος
Πεκμεστζή Κιαμάλ
Keywords: Approximate Computing
Arithmetic Circuits
Hardware Accelerators
Heterogeneous Computing
Embedded Systems
Digital Signal Processing
Computer Vision
Convolutional Neural Networks
Issue Date: 10-Oct-2022
Abstract: The recent end of Dennard scaling and the slowdown of Moore's Law have signaled a new era for computing systems. Power efficiency has become a critical factor for both cloud and edge computing. Concurrently, the rapid growth of compute-intensive applications from the Digital Signal Processing (DSP) and Artificial Intelligence (AI) domains strains the resources of computing systems. As a result, the computing industry is forced to find alternative design approaches and computing platforms that sustain increased power efficiency while providing sufficient performance. Among the examined solutions, Approximate Computing, Hardware Acceleration, and Heterogeneous Computing have gained great momentum. In this Dissertation, we introduce design solutions and methodologies, built on top of these computing paradigms, for the development of energy-efficient DSP and AI accelerators. In particular, we adopt the promising paradigm of Approximate Computing and apply new approximation techniques in the design of arithmetic circuits. Based on our methodology, these arithmetic approximation techniques are then combined with hardware design techniques to implement approximate ASIC- and FPGA-based DSP and AI accelerators. Moreover, we propose methodologies for the efficient mapping of DSP/AI kernels onto distinctive embedded devices, such as the new space-grade FPGAs and the heterogeneous VPUs. On the one hand, we cope with the decreased flexibility of the space-grade technology and the technical challenges that arise in new FPGA tools and devices. On the other hand, we unlock the full potential of heterogeneity by overcoming the increased hardware complexity and exploiting all the diverse processors and memories. In more detail, the proposed arithmetic approximation techniques involve bit-level optimizations, inexact operand encodings, and skipping of computations, and they are applied in both fixed- and floating-point arithmetic.
To enlarge the design space and extract the most efficient solutions, we also conduct an extensive exploration of combinations of the approximation techniques. Moreover, we propose a low-overhead scheme for seamlessly adjusting the approximation degree of our circuits at runtime. In comparison with state-of-the-art designs, the proposed arithmetic circuits feature a very large approximation space, i.e., a wide range of approximation configurations, which makes it possible to maximize the resource gains for a given error constraint. At the accelerator level, we develop a plethora of approximate kernels for 1D/2D signal processing and Convolutional Neural Networks (CNNs). Regarding DSP acceleration on new space-grade FPGAs, we apply our methodology to efficiently map computer vision algorithms onto NanoXplore's radiation-hardened FPGAs. In the end, we achieve balanced resource utilization, which is comparable to that achieved on FPGAs from well-established vendors. Furthermore, the throughput is sufficient, considering the performance requirements of vision-based space applications. In terms of Heterogeneous Computing, we accelerate custom DSP kernels, a sophisticated computer vision pipeline, and a demanding CNN on Intel's Myriad VPUs.
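Illustrative note: the trade-off between approximation degree and error mentioned in the abstract can be sketched in software with a generic, textbook-style approximation, namely truncating the low-order bits of each operand before multiplying. This is not the dissertation's technique; the function names `approx_multiply` and `mean_relative_error` and the parameter `k` (number of truncated bits) are hypothetical and chosen only for this sketch.

```python
def approx_multiply(a: int, b: int, k: int) -> int:
    """Multiply two non-negative integers after zeroing the k
    least-significant bits of each operand (generic truncation,
    assumed here purely for illustration)."""
    mask = ~((1 << k) - 1)  # clears the k low-order bits
    return (a & mask) * (b & mask)


def mean_relative_error(k: int, bits: int = 8) -> float:
    """Average relative error of approx_multiply over all pairs of
    non-zero unsigned operands of the given bit width."""
    total = 0.0
    count = 0
    for a in range(1, 1 << bits):
        for b in range(1, 1 << bits):
            exact = a * b
            total += abs(exact - approx_multiply(a, b, k)) / exact
            count += 1
    return total / count


if __name__ == "__main__":
    # Sweep the approximation degree, as a runtime-configurable
    # circuit might, and report the resulting error.
    for k in range(5):
        print(f"k={k}: mean relative error = {mean_relative_error(k):.4f}")
```

In this sketch a larger `k` stands in for a more aggressive approximation configuration: the error grows with `k`, so for a given error constraint one would pick the largest `k` that still satisfies it.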
Appears in Collections: Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:
File: PhD_LeonV.pdf
Description: Βασίλειος Λέων - Διδακτορική Διατριβή ΕΜΠ (Doctoral Dissertation, NTUA)
Size: 6.47 MB
Format: Adobe PDF

Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.