|Title:||Exploiting Partial Reconfiguration of SoC FPGAs: A Hardware-Software Co-design for Accelerating Cryptographic Systems|
|Abstract:||In recent years, the continued push for the best possible computing performance has led to the realization of heterogeneous computing and heterogeneous platforms. These systems gain performance and energy efficiency by adding dissimilar accelerators as co-processors with specialized processing capabilities to handle specific intensive tasks. Field Programmable Gate Arrays (FPGAs) have gained the interest of system architects due to their rapid prototyping and fast accelerator development capabilities. As their name denotes, FPGAs are programmable "in the field", meaning that their internal logic can be configured after fabrication and modified, if needed, without going through re-fabrication, as is required for common ASICs. Partial Reconfiguration (PR) takes this flexibility one step further by allowing an operating FPGA design to modify a part of itself while the rest of the system continues to function normally, without compromising the integrity of the computation running on those parts of the device that are not being reconfigured. This technique reduces the amount of resources required to implement a given function, with consequent reductions in cost and power consumption; provides flexibility in the algorithms and protocols available to an application; and accelerates computing by enabling a design to respond to new computation requirements much faster. This thesis explores PR technology on FPGAs and applies the knowledge acquired to implement a cryptographic system on a Xilinx Zynq-7000 SoC device. Zynq combines programmable logic and an embedded ARM processor on a single chip, thus forming a system-on-a-chip (SoC), while enabling fast interconnection between them and power efficiency. For the purposes of this thesis we chose four cryptographic modules (AES128, AES192, AES256 and SHA3-512).
First, we made the modifications needed to use the cryptographic modules in the SoC and designed AXI4-Stream compliant interfaces to enable communication between the peripherals and the processor, making compromises to accommodate the different modules’ architectures, the processing system’s limitations and PR’s restrictions. Then, we connected the peripherals to the processing system through an AXI DMA IP in Scatter/Gather mode. Scatter/Gather provided high-speed communication and applied an interrupt coalescing strategy to reduce the number of interrupts occupying the ARM processor, which allowed the processor to handle the peripherals more efficiently. We also applied a decoupling strategy to isolate the reconfigurable modules during PR, so that spurious output signals could not affect the rest of the design. Finally, we evaluated our work and constructed a benchmark to show the acceleration advantages of PR. In this benchmark, the system adapted to computation requirements by reconfiguring idle peripherals into others that were needed, distributing the computational load among them and thereby reducing the total computation time. As a result, we achieved almost full hardware utilization and approached the optimal speedup.|
|Appears in Collections:||Διπλωματικές Εργασίες - Theses|
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.