Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18181
Title: Embedded Development of AI-based Computer Vision: Acceleration on Intel Myriad X VPU
Authors: Μηναΐδης, Παναγιώτης
Σούντρης Δημήτριος
Keywords: Heterogeneous Architectures
Embedded Systems
Myriad X
Computer Vision
Convolutional Neural Networks
Pose Estimation
Issue Date: 8-Nov-2021
Abstract: It is estimated that, by 2022, 82% of the packets transferred through the Internet will contain video data. Processing these data in real time is an attractive prospect that can lead to the creation of very interesting systems, commercial or otherwise. Convolutional Neural Networks (CNNs) are an important tool in this direction, as their recent rapid growth has produced impressive solutions to classic computer vision problems. On the other hand, traditional embedded systems cannot meet the increased computational and memory requirements of CNNs. In this environment, an emerging class of microprocessors, the Vision Processing Units (VPUs), is being developed. Myriad X is the latest member of the family of VPUs offered by Intel/Movidius. It is a multicore, heterogeneous computing system with a dedicated hardware accelerator for deep learning applications and high performance per unit of power. However, most modern neural networks are developed with much more powerful processing systems in mind and emphasize accuracy rather than efficiency. This is the case for many networks that attempt to solve the problem of estimating and tracking the pose of a satellite, more commonly known as the "Lost in Space" problem. In this thesis, we studied several resampling methods applied to the input data, in order to determine how they affect the total number of computations and parameters of a CNN, as well as its accuracy. Multiple optimization techniques were used, including exploitation of the on-chip Scratchpad Memory and the SIMD capabilities of the Myriad X VPU, so that this preprocessing stage does not become a bottleneck. The preprocessed data are fed into a CNN named "UrsoNet", which locates the satellite in the input image and estimates its pose.
To measure the power requirements of this application, a custom Power Measurement system is introduced, which can also perform static power management. Finally, a hybrid system is proposed. This system uses the CNN to estimate the initial pose of the satellite and subsequently runs a classic, pipelined CV algorithm that evolves and refines this initial pose in real time. The results are highly encouraging: the execution time required for a single inference is reduced by up to 5 times, provided that proper preprocessing of the input frames is applied, with no noticeable degradation in accuracy. This allows for real-time execution on Myriad X within a tight power envelope. Specifically, we achieve 2.12 - 2.22 FPS, depending on the scale at which the preprocessing takes place, with a mean power consumption of less than 2 Watts. The proposed hybrid system operates with an overhead of about 373.3 - 391.3 ms for the initial estimation and then requires approximately 263 - 388 ms to continue tracking the pose of the satellite, resulting in a throughput of 2.58 - 3.80 FPS.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/18181
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File: Minaidis_Thesis.pdf
Size: 16.6 MB
Format: Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.