155 research outputs found

Convergence Times of Decentralized Graph Coloring Algorithms

Author: de Supinski Paul B
Publication venue: Dartmouth Digital Commons
Publication date: 01/05/2019

Ordinary graph coloring algorithms are nothing without their calculations, memorizations, and inter-vertex communications. We investigate a class of ultra-simple algorithms which can find (Delta+1)-colorings despite drastic restrictions. For each procedure, conflicted vertices randomly recolor one at a time until the graph coloring is valid. We provide an array of run time bounds for these processes, including an O(n*log(Delta)) bound for a variant we propose, and an O(n*Delta) bound which applies to even the most adversarial scenarios.
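The process described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the author's procedure: the abstract does not state the recoloring rule or how the next conflicted vertex is chosen, so the sketch assumes both are uniform at random, with colors drawn from a palette of Delta+1 values. The names `conflicted` and `decentralized_coloring` are hypothetical.

```python
import random

def conflicted(adj, colors):
    """Return the vertices that share a color with at least one neighbor."""
    return [v for v in adj if any(colors[u] == colors[v] for u in adj[v])]

def decentralized_coloring(adj, seed=0):
    """Recolor one conflicted vertex at a time until the coloring is valid.

    adj maps each vertex to a list of its neighbors (undirected graph).
    Assumption: the vertex and its new color are both picked uniformly
    at random; the palette holds Delta+1 colors.
    """
    rng = random.Random(seed)
    delta = max((len(nbrs) for nbrs in adj.values()), default=0)
    palette = range(delta + 1)
    colors = {v: 0 for v in adj}  # start from a worst case: one color everywhere
    steps = 0
    while True:
        bad = conflicted(adj, colors)
        if not bad:
            return colors, steps
        v = rng.choice(bad)
        colors[v] = rng.choice(palette)
        steps += 1

# A 5-cycle has Delta = 2, so the palette is {0, 1, 2}.
cycle = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring, steps = decentralized_coloring(cycle, seed=1)
```

Each vertex only needs to know whether it is currently in conflict, which is the kind of restriction the abstract alludes to. The step count returned here is one simple way to observe convergence time empirically.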

Dartmouth Digital Commons (Dartmouth College)

A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids

Author: de Supinski B.
Foster I.
Gropp W.
Karonis N. T.
Lusk E.
Publication date: 01/01/2002

The efficient implementation of collective communication operations has received much attention. Initial efforts produced "optimal" trees based on network communication models that assumed equal point-to-point latencies between any two processes. This assumption is violated in most practical settings, however, particularly in heterogeneous systems such as clusters of SMPs and wide-area "computational Grids," with the result that collective operations perform suboptimally. In response, more recent work has focused on creating topology-aware trees for collective operations that minimize communication across slower channels (e.g., a wide-area network). While these efforts have significant communication benefits, they all limit their view of the network to only two layers. We present a strategy based upon a multilayer view of the network. By creating multilevel topology-aware trees we take advantage of communication cost differences at every level in the network. We used this strategy to implement topology-aware versions of several MPI collective operations in MPICH-G2, the Globus Toolkit[tm]-enabled version of the popular MPICH implementation of the MPI standard. Using information about topology provided by MPICH-G2, we construct these multilevel topology-aware trees automatically during execution. We present results demonstrating the advantages of our multilevel approach by comparing it to the default (topology-unaware) implementation provided by MPICH and a topology-aware two-layer implementation. Comment: 16 pages, 8 figures

arXiv.org e-Print Archive

CiteSeerX

Soft Error Vulnerability of Iterative Linear Algebra Methods

Author: Bronevetsky G
de Supinski B
Publication venue: Lawrence Livermore National Laboratory
Publication date: 15/12/2007

Devices become increasingly vulnerable to soft errors as their feature sizes shrink. Previously, soft errors primarily caused problems for space and high-atmospheric computing applications. Modern architectures now use features so small at sufficiently low voltages that soft errors are becoming significant even at terrestrial altitudes. The soft error vulnerability of iterative linear algebra methods, which many scientific applications use, is a critical aspect of the overall application vulnerability. These methods are often considered invulnerable to many soft errors because they converge from an imprecise solution to a precise one. However, we show that iterative methods can be vulnerable to soft errors, with a high rate of silent data corruptions. We quantify this vulnerability, with algorithms generating up to 8.5% erroneous results when subjected to a single bit-flip. Further, we show that detecting soft errors in an iterative method depends on its detailed convergence properties and requires more complex mechanisms than simply checking the residual. Finally, we explore inexpensive techniques to tolerate soft errors in these methods.
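To make the "single bit-flip" fault model concrete, the sketch below flips one bit in the IEEE 754 binary64 representation of a Python float. It illustrates the fault model only and is not the authors' injection framework; the helper name `flip_bit` is hypothetical.

```python
import struct

def flip_bit(x, bit):
    """Return x with one bit of its 64-bit IEEE 754 encoding inverted.

    bit 0 is the least significant mantissa bit, bits 52-62 hold the
    exponent, and bit 63 is the sign.
    """
    if not 0 <= bit <= 63:
        raise ValueError("bit must be in the range 0..63")
    # Reinterpret the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Invert the chosen bit and reinterpret the result as a double.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return flipped

sign_flip = flip_bit(1.0, 63)      # -1.0: the sign bit changes
exponent_flip = flip_bit(1.0, 52)  # 0.5: the lowest exponent bit changes
mantissa_flip = flip_bit(1.0, 0)   # a value just above 1.0
```

A flip in a low mantissa bit barely perturbs the value, while a flip in the exponent or sign changes it substantially, which is one reason a single corrupted value may or may not be visible in a residual check.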

CiteSeerX

Crossref

UNT Digital Library

Soft Error Vulnerability of Iterative Linear Algebra Methods

Author: Bronevetsky G
de Supinski B
Publication venue: Lawrence Livermore National Laboratory
Publication date: 01/01/2008

Devices are increasingly vulnerable to soft errors as their feature sizes shrink. Previously, soft error rates were significant primarily in space and high-atmospheric computing. Modern architectures now use features so small at sufficiently low voltages that soft errors are becoming important even at terrestrial altitudes. Due to their large number of components, supercomputers are particularly susceptible to soft errors. Since many large-scale parallel scientific applications use iterative linear algebra methods, the soft error vulnerability of these methods constitutes a large fraction of the applications' overall vulnerability. Many users consider these methods invulnerable to most soft errors since they converge from an imprecise solution to a precise one. However, we show in this paper that iterative methods are vulnerable to soft errors, exhibiting both silent data corruptions and poor ability to detect errors. Further, we evaluate a variety of soft error detection and tolerance techniques, including checkpointing, linear matrix encodings, and residual tracking techniques.

Crossref

UNT Digital Library

Formal Specification of the OpenMP Memory Model

Author: Bronevetsky G
de Supinski B R
Publication venue: Lawrence Livermore National Laboratory
Publication date: 17/05/2006

OpenMP [1] is an important API for shared memory programming, combining shared memory's potential for performance with a simple programming interface. Unfortunately, OpenMP lacks a critical tool for demonstrating whether programs are correct: a formal memory model. Instead, the current official definition of the OpenMP memory model (the OpenMP 2.5 specification [1]) is in terms of informal prose. As a result, it is impossible to verify OpenMP applications formally since the prose does not provide a formal consistency model that precisely describes how reads and writes on different threads interact. This paper focuses on the formal verification of OpenMP programs through a proposed formal memory model that is derived from the existing prose model [1]. Our formalization provides a two-step process to verify whether an observed OpenMP execution is conformant. In addition to this formalization, our contributions include a discussion of ambiguities in the current prose-based memory model description. Although our formal model may not capture the current informal memory model perfectly, in part due to these ambiguities, our model reflects our understanding of the informal model's intent. We conclude with several examples that may indicate areas of the OpenMP memory model that need further refinement, however it is specified. Our goal is to motivate the OpenMP community to adopt those refinements eventually, ideally through a formal model, in later OpenMP specifications.

UNT Digital Library

Practical Differential Profiling

Author: De Supinski B R
Schulz M
Publication venue: Lawrence Livermore National Laboratory
Publication date: 04/02/2007

Comparing performance profiles from two runs is an essential performance analysis step that users routinely perform. In this work we present eGprof, a tool that facilitates these comparisons through differential profiling inside gprof. We chose this approach, rather than designing a new tool, since gprof is one of the few performance analysis tools accepted and used by a large community of users. eGprof allows users to 'subtract' two performance profiles directly. It also includes callgraph visualization to highlight the differences in graphical form. Along with the design of this tool, we present several case studies that show how eGprof can be used to find and to study the differences of two application executions quickly and hence can aid the user in this most common step in performance analysis. We do this without requiring major changes on the side of the user, the most important factor in guaranteeing the adoption of our tool by code teams.
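The idea of 'subtracting' two profiles can be illustrated with a small sketch. It assumes a flat profile represented as a mapping from function name to seconds of self time; that representation and the function name `subtract_profiles` are assumptions made for illustration and do not reflect eGprof's actual data structures or interface.

```python
def subtract_profiles(base, other):
    """Return per-function time differences (other minus base).

    base and other map function names to seconds of self time. A function
    missing from one profile is treated as taking 0.0 seconds there.
    """
    names = set(base) | set(other)
    return {name: other.get(name, 0.0) - base.get(name, 0.0)
            for name in sorted(names)}

# Hypothetical flat profiles from two runs of the same program.
run_a = {"solve": 1.0, "setup": 2.0}
run_b = {"setup": 3.5, "output": 1.0}
diff = subtract_profiles(run_a, run_b)
# diff == {"output": 1.0, "setup": 1.5, "solve": -1.0}
```

Positive entries mark functions that cost more in the second run, and negative entries mark functions that cost less, so sorting by magnitude points at the largest behavioral differences first.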

UNT Digital Library

Exploitation of Dynamic Communication Patterns through Static Analysis

Author: de Supinski B
Kranzlmueller D
Panas T
Preissl R
Quinlan D
Schulz M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/06/2010

Abstract not provided

Crossref

UNT Digital Library

CLOMP: Accurately Characterizing OpenMP Application Overheads

Author: Bronevetsky G
de Supinski B
Gyllenhaal J
Publication venue: Lawrence Livermore National Laboratory
Publication date: 01/01/2008

Despite its ease of use, OpenMP has failed to gain widespread use on large-scale systems, largely due to its failure to deliver sufficient performance. Our experience indicates that the cost of initiating OpenMP regions is simply too high for the desired OpenMP usage scenario of many applications. In this paper, we introduce CLOMP, a new benchmark to characterize this aspect of OpenMP implementations accurately. CLOMP complements the existing EPCC benchmark suite to provide simple, easy-to-understand measurements of OpenMP overheads in the context of application usage scenarios. Our results for several OpenMP implementations demonstrate that CLOMP identifies the amount of work required to compensate for the overheads observed with EPCC. Further, we show that CLOMP also captures limitations for OpenMP parallelization on NUMA systems.

CiteSeerX

Crossref

Springer - Publisher Connector

eScholarship - University of California

UNT Digital Library

Regression Strategies for Parameter Space Exploration: A Case Study in Semicoarsening Multigrid and R

Author: de Supinski B R
Lee B C
Schulz M
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 28/09/2006

Increasing system and algorithmic complexity, combined with a growing number of tunable application parameters, poses significant challenges for analytical performance modeling. This report outlines a series of robust techniques that enable efficient parameter space exploration based on empirical statistical modeling. In particular, this report applies statistical techniques such as clustering, association, and correlation analyses to understand the parameter space better. Results from these statistical techniques guide the construction of piecewise polynomial regression models. Residual and significance tests ensure the resulting model is unbiased and efficient. We demonstrate these techniques in R, a statistical computing environment, for predicting the performance of semicoarsening multigrid. 50 and 75 percent of predictions achieve error rates of 5.5 and 10.0 percent or less, respectively.

UNT Digital Library

Asynchronous checkpoint migration with MRNet in the Scalable Checkpoint / Restart Library

Author: de Supinski B R
Mohror K
Moody A
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/03/2012

Applications running on today's supercomputers tolerate failures by periodically saving their state in checkpoint files on stable storage, such as a parallel file system. Although this approach is simple, the overhead of writing the checkpoints can be prohibitive, especially for large-scale jobs. In this paper, we present initial results of an enhancement to our Scalable Checkpoint/Restart Library (SCR). We employ MRNet, a tree-based overlay network library, to transfer checkpoints from the compute nodes to the parallel file system asynchronously. This enhancement increases application efficiency by removing the need for an application to block while checkpoints are transferred to the parallel file system. We show that the integration of SCR with MRNet can reduce the time spent in I/O operations by as much as 15x. However, our experiments exposed new scalability issues with our initial implementation. We discuss the sources of the scalability problems and our plans to address them.

Crossref

UNT Digital Library