Please use this identifier to cite or link to this item:
http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19581
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ζέρβα, Μαρία | - |
dc.date.accessioned | 2025-04-01T21:01:31Z | - |
dc.date.available | 2025-04-01T21:01:31Z | - |
dc.date.issued | 2025-03-17 | - |
dc.identifier.uri | http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19581 | - |
dc.description.abstract | Owing to their exceptional computational performance and cost efficiency, GPUs have solidified their status as the premier platform for accelerating general-purpose workloads. Nonetheless, a subset of these workloads continues to exhibit performance stagnation. The previously proposed Light-weight Out-Of-Order GPU (LOOG) execution scheme addresses this issue by augmenting conventional Thread-Level Parallelism with the exploitation of inherent Instruction-Level Parallelism. Although LOOG has been modeled using GPU simulation tools in previous studies, these implementations have suffered from limited accuracy in power consumption and critical path estimations, in addition to slow execution of applications. To overcome these limitations, this thesis proposes integrating LOOG into an RTL GPU framework, specifically Vortex GPU version 2.0, an open-source design that is well-suited for deployment on FPGA platforms. To preserve LOOG’s performance gain in Vortex’s RISC-V-based pipeline, the extension is designed to complement the existing micro-architecture and the operations it supports. Furthermore, a comprehensive investigation of design optimizations and trade-offs is conducted to enhance performance while constraining the overall Area and Power overhead. Prior to the experimental evaluation, 21 Vortex workloads are characterized in detail based on their stalling behavior, enabling the right-sizing of the micro-architecture across a broad design space that is supported by Vortex’s configurability. The results demonstrate an average speedup of up to approximately 23.5%, while maintaining lower Area-Delay and Power-Delay products compared to the in-order Vortex in various configurations. | en_US |
dc.language | en | en_US |
dc.subject | High Performance Computing | en_US |
dc.subject | GPU Micro-Architecture | en_US |
dc.subject | Out-Of-Order Execution | en_US |
dc.subject | RISC-V | en_US |
dc.subject | RTL Design | en_US |
dc.subject | FPGA | en_US |
dc.subject | Hardware Evaluation | en_US |
dc.title | FPGA Design and Analysis of a RISC-V Out-Of-Order GPU | en_US |
dc.description.pages | 117 | en_US |
dc.contributor.supervisor | Ξύδης Σωτήριος | en_US |
dc.department | Division of Computer Science and Technology | en_US |
Appears in Collections: | Diploma Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
maria_zerva_diploma_thesis.pdf | | 2.94 MB | Adobe PDF | View/Open |
Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.