Title: Performance Monitoring and Workload Characterization of Big Data and Cloud Based Applications on the Intel SCC Manycore Platform
Authors: Ανδρέας - Λάζαρος Γεωργιάδης
Σούντρης Δημήτριος
Keywords: big data
embedded systems
distributed systems
computer architecture
Issue Date: 29-Apr-2015
Abstract: This Diploma Thesis explores several performance, power consumption and scalability aspects of executing Big Data and Cloud Based workloads on the Intel Single-chip Cloud Computer (SCC) Manycore Platform, which differs from typical cluster topologies in that it integrates 48 cores on a single chip. The applications we study are implemented using the MapReduce framework on top of the Hadoop Distributed File System (HDFS). For the purpose of this analysis we have developed a runtime monitoring infrastructure which utilizes Ganglia, a monitoring tool for large clusters.
Chapter 1 states the importance of studying Cloud Computing and Big Data applications and presents some basic aspects of the concepts this diploma thesis deals with. The chapter concludes with the contribution this thesis attempts to make in the field of scale-out applications and many-core systems.
Chapter 2 describes recent research findings in the related fields of scale-out workloads and of performance and power monitoring of the Intel SCC, which provided the background and inspiration for this diploma thesis.
Chapter 3 describes the architecture of the Intel SCC in detail, emphasizing the aspects of the platform that are crucial for characterizing application behavior.
Chapter 4 presents a detailed analysis of the Hadoop Distributed File System and the MapReduce framework, discussing key implementation aspects and providing guidelines on how to configure an HDFS cluster installation and tune the execution of MapReduce jobs.
Chapter 5 provides a detailed description of the tools that have been used and developed to deploy and launch Hadoop clusters on the Intel SCC. The Runtime Environment setup and the Hadoop cluster installation processes are explained in detail.
Chapter 6 presents the Runtime Monitoring Framework we have developed for the Intel SCC. The Ganglia cluster topology we have configured for the Intel SCC is analyzed, and the process of collecting, storing and visualizing runtime metrics is explained.
Chapter 7 describes the experimental analysis we have conducted for four MapReduce applications running on the Intel SCC. Our investigation focuses on the behavior of those applications for varying input sizes, HDFS cluster topologies and frequency settings for the cluster nodes.
Chapter 8 summarizes the findings of this diploma thesis and presents suggestions for future work.
Appears in Collections: Διπλωματικές Εργασίες - Theses

Files in This Item:
File             Size      Format
DT2015-0108.pdf  50.18 MB  Adobe PDF

Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.