46 research outputs found

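Evaluation der Leistungsfähigkeit von gemischt-parallelen Programmen in homogenen und heterogenen Umgebungen unter Berücksichtigung effizienter Schedulingstrategien [Evaluation of the Performance of Mixed-Parallel Programs in Homogeneous and Heterogeneous Environments, Taking Efficient Scheduling Strategies into Account]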

Author: Hunold Sascha
Publication date: 01/01/2008

Jedule: A Tool for Visualizing Schedules of Parallel Applications

Authors: Hoffmann Ralf; Hunold Sascha; Suter Frédéric
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/09/2010

Task scheduling is one of the most prominent problems in the era of parallel computing. We find scheduling algorithms in every domain of computer science, e.g., mapping multiprocessor tasks to clusters, mapping jobs to grid resources, or mapping fine-grained tasks to cores of multicore processors. Many tools exist that help understand or debug an application by presenting visual representations of a certain program run, e.g., visualizations of MPI traces. However, developers often want to get a global and abstract view of their schedules first. In this paper we introduce Jedule, a tool dedicated to visualizing schedules of parallel applications. We demonstrate the effectiveness of Jedule by showing how it helped analyze problems in several case studies.

HAL-IN2P3

Crossref

From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling

Authors: Casanova Henri; Hunold Sascha; Suter Frédéric
Publication venue: HAL CCSD
Publication date: 16/05/2011

Simulation is a popular approach for empirically evaluating the performance of algorithms and applications in the parallel computing domain. Most published works present results without quantifying simulation error. In this work we investigate accuracy issues when simulating the execution of parallel applications. This is a broad question, and we focus on a relevant case study: the evaluation of scheduling algorithms for executing mixed-parallel applications on clusters. Most such scheduling algorithms have been evaluated in simulation only. We compare simulations to real-world experiments with a view to identifying which features of a simulator are most critical for simulation accuracy. Our first finding is that simple yet popular analytical simulation models lead to simulation results that cannot be used for soundly comparing scheduling algorithms. We then show that, by contrast, simulation models instantiated based on brute-force measurements of the target execution environment lead to usable results. Finally, we develop empirical simulation models that provide a reasonable compromise between the two previous approaches.

HAL-IN2P3

Crossref

Hal - Université Grenoble Alpes

Efficient Process-to-Node Mapping Algorithms for Stencil Computations

Authors: Hunold Sascha; Lehr Markus; Schulz Christian; Träff Jesper Larsson; von Kirchbach Konrad
Publication date: 20/05/2020

Good process-to-compute-node mappings can be decisive for well-performing HPC applications. A special, important class of process-to-node mapping problems is the problem of mapping processes that communicate in a sparse stencil pattern to Cartesian grids. By thoroughly exploiting the structure inherent in this type of problem, we devise three novel distributed algorithms that are able to handle arbitrary stencil communication patterns effectively. We analyze the expected performance of our algorithms based on an abstract model of inter- and intra-node communication. An extensive experimental evaluation on several HPC machines shows that our algorithms are up to two orders of magnitude faster in running time than a (sequential) high-quality general graph mapping tool, while obtaining similar results in communication performance. Furthermore, our algorithms also achieve significantly better mapping quality compared to previous state-of-the-art Cartesian grid mapping algorithms. This results in up to a threefold performance improvement of an MPI_Neighbor_alltoall exchange operation. Our new algorithms can be used to implement the MPI_Cart_create functionality. Comment: 18 pages, 9 figures.

arXiv.org e-Print Archive

Crossref