34 research outputs found

Task mapping on a dragonfly supercomputer

Author: Coskun Ayse K.
Leung Vitus
Tuncer Ozan
Zhang Yijia
Publication date: 14/09/2017

The dragonfly network topology has recently gained traction in the design of high-performance computing (HPC) systems and has been implemented in large-scale supercomputers. The impact of task mapping, i.e., placement of MPI ranks onto compute cores, on the communication performance of applications on dragonfly networks has not been comprehensively investigated on real large-scale systems. This paper demonstrates that task mapping affects the communication overhead significantly in dragonflies, and that the magnitude of this effect is sensitive to the application, job size, and the OpenMP settings. Among the three task mapping algorithms we study (in-order, random, and recursive coordinate bisection), selecting a suitable task mapper reduces application communication time by up to 47%.

Boston University Institutional Repository (OpenBU)

ALBADross: active learning based anomaly diagnosis for production HPC systems

Author: Aaziz Omar
Aksar Burak
Brandt Jim
Coskun Ayse K.
Kulis Brian
Leung Vitus J.
Schwaller Benjamin
Sencan Efe
Publication venue: IEEE
Publication date: 15/02/2023

2263712 - Sandia National Laboratories. Accepted manuscript

Boston University Institutional Repository (OpenBU)

Parallel job scheduling policies to improve fairness: a case study.

Author: Leung Vitus Joseph
Sabin Gerald (Ohio State University, Columbus, OH)
Sadayappan Ponnuswamy (Ohio State University, Columbus, OH)
Publication venue: Office of Scientific and Technical Information (OSTI)
Publication date: 01/02/2008

Balancing fairness, user performance, and system performance is a critical concern when developing and installing parallel schedulers. Sandia uses a customized scheduler to manage many of its parallel machines. A primary function of the scheduler is to ensure that the machines have good utilization and that users are treated in a 'fair' manner. A separate compute process allocator (CPA) ensures that the jobs on the machines are not too fragmented in order to maximize throughput. Until recently, there has been no established technique to measure the fairness of parallel job schedulers. This paper introduces a 'hybrid' fairness metric that is similar to recently proposed metrics. The metric uses the Sandia version of a 'fairshare' queuing priority as the basis for fairness. The hybrid fairness metric is used to evaluate a Sandia workload. Using these results, multiple scheduling strategies are introduced to improve performance while satisfying user and system performance constraints.

UNT Digital Library

Task mapping for non-contiguous allocations.

Author: Bunde David P.
Ebbers Johnathan
Feer Stefan P.
Leung Vitus Joseph
Price Nicholas W.
Rhodes Zachary D.
Swank Matthew
Publication venue: Office of Scientific and Technical Information (OSTI)
Publication date: 01/02/2013

This paper examines task mapping algorithms for non-contiguously allocated parallel jobs. Several studies have shown that task placement affects job running time for both contiguously and non-contiguously allocated jobs. Traditionally, work on task mapping either uses a very general model where the job has an arbitrary communication pattern or assumes that jobs are allocated contiguously, making them completely isolated from each other. A middle ground between these two cases is the mapping problem for non-contiguous jobs having a specific communication pattern. We propose several task mapping algorithms for jobs with a stencil communication pattern and evaluate them using experiments and simulations. Our strategies improve the running time of a MiniApp by as much as 30% over a baseline strategy. Furthermore, this improvement increases markedly with the job size, demonstrating the importance of task mapping as systems grow toward exascale.

Crossref

UNT Digital Library

Algorithmic support for commodity-based parallel computing systems.

Author: Bender Michael A. (State University of New York, Stony Brook, NY)
Bunde David P. (University of Illinois, Urbana, IL)
Leung Vitus Joseph
Phillips Cynthia Ann
Publication venue: Office of Scientific and Technical Information (OSTI)
Publication date: 01/10/2003

The Computational Plant or Cplant is a commodity-based distributed-memory supercomputer under development at Sandia National Laboratories. Distributed-memory supercomputers run many parallel programs simultaneously. Users submit their programs to a job queue. When a job is scheduled to run, it is assigned to a set of available processors. Job runtime depends not only on the number of processors but also on the particular set of processors assigned to it. Jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs. This report introduces new allocation strategies and performance metrics based on space-filling curves and one-dimensional allocation strategies. These algorithms are general and simple. Preliminary simulations and Cplant experiments indicate that both space-filling curves and one-dimensional packing improve processor locality compared to the sorted free list strategy previously used on Cplant. These new allocation strategies are implemented in Release 2.0 of the Cplant System Software that was phased into the Cplant systems at Sandia by May 2002. Experimental results then demonstrated that the average number of communication hops between the processors allocated to a job strongly correlates with the job's completion time. This report also gives processor-allocation algorithms for minimizing the average number of communication hops between the assigned processors for grid architectures. The associated clustering problem is as follows: Given $n$ points in $\mathbb{R}^d$, find $k$ points that minimize their average pairwise $L_1$ distance. Exact and approximate algorithms are given for these optimization problems. One of these algorithms has been implemented on Cplant and will be included in Cplant System Software, Version 2.1, to be released. In more preliminary work, we suggest improvements to the scheduler separate from the allocator.

UNT Digital Library

Processor allocation on Cplant: Achieving general processor locality using one-dimensional allocation strategies.

Author: Alok Lal
Cynthia A Phillips
David P Bunde
Esther M Arkin
Jeanette R Johnston
Joseph S B Mitchell
Michael A Bender
Steven S Seiden
Vitus J Leung
Publication date: 01/01/2002

The Computational Plant or Cplant is a commodity-based supercomputer under development at Sandia National Laboratories. This paper describes resource-allocation strategies to achieve processor locality for parallel jobs in Cplant and other supercomputers. Users of Cplant and other Sandia supercomputers submit parallel jobs to a job queue. When a job is scheduled to run, it is assigned to a set of processors. To obtain maximum throughput, jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs. This paper introduces new allocation strategies and performance metrics based on space-filling curves and one-dimensional allocation strategies. These algorithms are general and simple. Preliminary simulations and Cplant experiments indicate that both space-filling curves and one-dimensional packing improve processor locality compared to the sorted free list strategy previously used on Cplant. These new allocation strategies are implemented in the new release of the Cplant System Software, Version 2.0, phased into th

CiteSeerX

UNT Digital Library

Trends in Data Locality Abstractions for HPC Systems

Author: Amir Kamil
Anshu Dubey
Bradford L. Chamberlain
Chris J. Newburn
Didem Unat
Emmanuel Jeannot
Frank Hannig
H. Carter Edwards
Hal Finkel
Hatem Ltaief
Jeff Keasler
John Shalf
Karl Fuerlinger
Mark Abraham
Mauro Bianco
Miquel Pericas
Naoya Maruyama
Paul H J Kelly
Romain Cledat
Torsten Hoefler
Vitus Leung
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref