Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17173
Title: Design Methodologies for Resource Management of Many-core Embedded Systems
Authors: Tsoutsouras, Vasileios
Σούντρης Δημήτριος
Keywords: Distributed Run-Time Resource Management
Edge computing
Internet of Things (IoT)
IoT Gateway
Issue Date: 6-Jul-2011
Abstract: The current embedded systems landscape comprises a variety of complex computing devices, featuring high-end, architecturally rich processors, heterogeneous devices and many-core systems. Furthermore, new computing architectures have been proposed at the system level, extending the concept of the Internet of Things (IoT) to a multi-layer distributed infrastructure known as Edge (or Fog) computing. This infrastructure stems from the intention to mitigate a number of inefficiencies of the original Cloud-centric deployment of IoT systems, which suffers from dependency on Cloud resources, connectivity issues and unacceptably high bandwidth requirements. In such deployments, the numerous computing nodes involved must cooperate in order to execute the variety of input tasks resulting from the highly dynamic setup of the system, which includes mobile users and unpredictable application execution requests. IoT applications must also be designed with the updated distributed architecture in mind, so that they can take full advantage of it. This dissertation begins by focusing on the requirements and design of embedded applications operating on systems of multiple nodes. The target applications belong to the medical domain, and thus their design requirements include, but are not limited to, performance, since dependability and accuracy of operations are critical in this field. The IoT-oriented applications are also designed in a modular, pipelined manner in order to provide different run-time configuration knobs for the effective operation of the device in a Gateway-based offloading environment.
Automated HW/SW co-design approaches using High-Level Synthesis are employed to provide a version of the developed applications that can use HW accelerators on combined CPU-FPGA Systems-on-Chip, which can significantly decrease the execution latency of the computationally intensive parts of the application. With respect to the design choice of the IoT Gateway, a many-core embedded system with a Network-on-Chip (NoC) topology is considered a promising design alternative to meet the computational and communication requirements resulting from the interaction of the Gateway with numerous IoT nodes. An efficient run-time decision-making mechanism is necessary for the many-core system to yield high-performance operation. Due to the complexity of dynamically mapping many applications on a many-core system, a Distributed Run-Time Resource Management (DRTRM) framework is designed, implemented and evaluated on top of Intel SCC, an actual many-core NoC-based computing platform. Motivated by the highly dynamic IoT environment, an additional analysis is performed to investigate the correlation between the arrival rate of incoming application requests and the effectiveness of DRTRM in allocating the available system resources. The analysis shows that a scenario of fast-arriving, resource-hungry incoming applications can be the breaking point for the effectiveness of DRTRM. Moreover, the enforcement of a relevant run-time mitigation scheme is complicated by the distributed decision making, which requires the consensus of many agents and thus adds to the required decision-making latency. This issue is mitigated by the use of a Voltage and Frequency Scaling regulation policy, which indirectly slows down application admission while requiring the cooperation of only a small subset of the agents of the system. The policy is implemented and evaluated on top of DRTRM, showing that it can relieve the congestion of applications under stressful conditions.
The deep scaling of modern many-core systems, combined with their long operation cycles, increases the probability of errors in their processing elements. Taking this into account, and given the importance of DRTRM for the operation of multiple IoT nodes and applications, SoftRM is introduced: a DRTRM augmented with fault-tolerant features. The design of SoftRM relies on dynamic, workload-aware error mitigation and avoids provisioning spare cores; instead, healthy agents self-organize in order to replace the failed ones. In addition, an error detection mechanism is implemented, which takes advantage of the communication patterns of DRTRM in order to reduce the overhead of error detection on the operation of healthy agents. Last, the concepts of distributed management used in DRTRM are extended to aid the negotiation of resources in Edge computing systems with multiple intermediate IoT Gateways. These distributed nodes use trade-based mechanisms to dynamically optimize the Service Quality offered to their subscribed IoT devices while meeting their run-time constraints. These mechanisms dynamically achieve a more efficient binding of IoT devices to Gateways and thus fully exploit the resources of the latter to aid the operation of the former.
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/17173
Appears in Collections: Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:
File: thesis_tsoutsouras.pdf
Size: 19.25 MB
Format: Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.