Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/9098
Title: Communication Performance Prediction On Large-scale Systems
Authors: Nikela Papadopoulou
Keywords: performance modeling
predictive modeling
communication time
hpc applications
mpi
supercomputers
clusters
statistical learning
machine learning
Issue Date: 16-Oct-2017
Abstract: On the path to exascale, supercomputers will grow to host hundreds of millions of cores and various complex heterogeneous processing elements. Yet even today, users fail to leverage the existing compute power of large-scale systems, as large classes of typical HPC applications are bound by non-scalable communication phases. The ability to predict the communication time of parallel applications can assist users, compilers, runtime systems and schedulers with decision-making for optimal resource utilization, performance optimizations, power saving and resilience. This thesis presents a methodology for predictive communication modeling of HPC applications. Communication time depends on a complex set of parameters, relevant to the application, the system architecture, the runtime configuration and runtime conditions. To handle this complexity, we follow an empirical modeling approach. We define features that can be extracted from the application, the process mapping and the allocation shape ahead of execution, and deploy a single benchmark to sweep over the parameter space. We then develop predictive models for communication time on three large-scale computing systems, Vilje, Piz Daint and ARIS, using different subsets of our features, statistical and machine-learning methods, and training sets. We compare the predictive performance of our models on various communication patterns and applications, for multiple problem sizes, executions and runtime configurations, ranging from a few dozen to a few thousand cores. Our methodology is successful across all tested communication patterns on all systems and exhibits high prediction accuracy and goodness-of-fit. Our models are applicable just-in-time ahead of the execution of an HPC application and, as we demonstrate in this thesis, their high accuracy makes them suitable for communication-aware decision-making, towards the optimization of resource utilization on large-scale systems.
URI: http://artemis-new.cslab.ece.ntua.gr:8080/jspui/handle/123456789/9098
Appears in Collections: Διδακτορικές Διατριβές - Ph.D. Theses

Files in This Item:
File             Size      Format
PD2017-0030.pdf  11.32 MB  Adobe PDF


All items on this site are protected by copyright.