ReSHAPE: A Framework for Dynamic Resizing and Scheduling of Homogeneous Applications in a Parallel Environment
Applications in science and engineering often require huge computational
resources for solving problems within a reasonable time frame. Parallel
supercomputers provide the computational infrastructure for solving such
problems. A traditional application scheduler running on a parallel cluster
supports only static scheduling, where the number of processors allocated to
an application remains fixed for the lifetime of the job. Because job arrival
times are unpredictable and resource requirements vary, static scheduling can
leave system resources idle, thereby decreasing the overall system throughput.
In this paper we present a prototype framework
called ReSHAPE, which supports dynamic resizing of parallel MPI applications
executed on distributed memory platforms. The framework includes a scheduler
that supports resizing of applications, an API to enable applications to
interact with the scheduler, and a library that makes resizing viable.
Applications executed using the ReSHAPE scheduler framework can expand to take
advantage of additional free processors or shrink to accommodate a
high-priority application, without being suspended. In our research, we have
mainly focused on structured applications that have two-dimensional data arrays
distributed across a two-dimensional processor grid. The resize library
includes algorithms for processor selection and processor mapping. Experimental
results show that the ReSHAPE framework can improve individual job turn-around
time and overall system throughput.

Comment: 15 pages, 10 figures, 5 tables. Submitted to the International Conference on Parallel Processing (ICPP'07).
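A minimal sketch of how an iterative MPI application might interact with a resizing scheduler of this kind, probing at iteration boundaries as the abstract describes. The reshape_resize_point call, its return codes, and the helper routines are hypothetical stand-ins, not ReSHAPE's actual API:

    #include <mpi.h>

    /* Hypothetical stand-ins for the resize API the abstract describes;
     * the real ReSHAPE interface is not shown there, so these names and
     * signatures are illustrative only. */
    enum { RESHAPE_UNCHANGED = 0, RESHAPE_RESIZED = 1 };

    static int reshape_resize_point(MPI_Comm comm, MPI_Comm *newcomm) {
        (void)comm; *newcomm = MPI_COMM_NULL;
        return RESHAPE_UNCHANGED;   /* stub: a real scheduler decides here */
    }

    static void compute_step(MPI_Comm comm) {
        (void)comm;                 /* one iteration of the solver */
    }

    static void redistribute(MPI_Comm oldc, MPI_Comm newc) {
        (void)oldc; (void)newc;
        /* remap the 2-D block distribution onto the new processor grid */
    }

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        MPI_Comm comm = MPI_COMM_WORLD;

        for (int iter = 0; iter < 100; iter++) {
            compute_step(comm);

            /* At an iteration boundary, ask the scheduler whether to
             * expand onto free processors or shrink for a higher
             * priority job. */
            MPI_Comm newcomm;
            if (reshape_resize_point(comm, &newcomm) == RESHAPE_RESIZED) {
                redistribute(comm, newcomm);
                comm = newcomm;
            }
        }
        MPI_Finalize();
        return 0;
    }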
Explicit Parallel Programming: System Description
The implementation of the Explicit Parallel Programming (EPP) system is described. EPP is a prototype implementation of a language for writing parallel programs for shared memory multiprocessors. EPP may be viewed as a coordination language, since it is used to define the sequencing or ordering of various tasks, while the tasks themselves are defined in some other compilable language. The two main components of the EPP system, a compiler and an executive, are described in this report. An appendix contains the grammar defining the EPP language as well as the templates used by the compiler in code generation.
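As a toy illustration of the coordination idea (ordering declared separately from task bodies), the C sketch below is purely illustrative and uses none of EPP's actual syntax; a real executive would dispatch independent tasks to multiple processors rather than run them sequentially:

    #include <stdio.h>

    /* Toy "executive": the ordering of tasks is declared in one place,
     * apart from the task bodies, mirroring the coordination-language
     * idea described in the abstract. Not EPP's actual design. */
    typedef void (*task_fn)(void);

    static void read_input(void)   { puts("read_input"); }
    static void transform(void)    { puts("transform"); }
    static void write_output(void) { puts("write_output"); }

    /* The "coordination" part: each task names its predecessor (-1 = none). */
    struct task { const char *name; task_fn fn; int pred; };
    static struct task plan[] = {
        { "read_input",   read_input,   -1 },
        { "transform",    transform,     0 },
        { "write_output", write_output,  1 },
    };

    int main(void) {
        /* This sequential loop simply honors the declared order; a real
         * executive would run tasks with no dependence in parallel. */
        for (int i = 0; i < 3; i++)
            plan[i].fn();
        return 0;
    }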
Data and Activity Representation for Grid Computing
Computational grids are becoming increasingly popular as an infrastructure for computational science research. The demand for high-level tools and problem solving environments has prompted active research in Grid Computing Environments (GCEs). Many GCEs have been one-off development efforts. More recently, there have been many efforts to define component architectures for constructing important pieces of a GCE. This paper examines another approach, based on a "data-centric" framework for building powerful, context-aware GCEs spanning multiple layers of abstraction. We describe a scheme for representing data and activities in a GCE and outline various tools under development which use this representation.
ScALPEL: A Scalable Adaptive Lightweight Performance Evaluation Library for application performance monitoring
As supercomputers continue to grow in scale and capabilities, it is becoming
increasingly difficult to isolate processor and system level causes of
performance degradation. Over the last several years, a significant number of
performance analysis and monitoring tools have been built or proposed. However,
these tools suffer from several important shortcomings, particularly in
distributed environments. In this paper we present ScALPEL, a Scalable Adaptive
Lightweight Performance Evaluation Library for application performance
monitoring at the functional level. Our approach provides several distinct
advantages. First, ScALPEL is portable across a wide variety of architectures,
and its ability to selectively monitor functions keeps run-time overhead low,
enabling its use for large-scale production applications. Second, it
is run-time configurable, enabling both dynamic selection of functions to
profile as well as events of interest on a per function basis. Third, our
approach is transparent in that it requires no source code modifications.
Finally, ScALPEL is implemented as a pluggable unit by reusing existing
performance monitoring frameworks such as Perfmon and PAPI and extending them
to support both sequential and MPI applications.

Comment: 10 pages, 4 figures, 2 tables.
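The abstract does not show ScALPEL's own interface, so the sketch below uses PAPI directly, one of the frameworks the abstract says ScALPEL builds on, to illustrate the kind of per-function hardware-counter measurement being automated; the measured kernel is arbitrary:

    #include <stdio.h>
    #include <stdlib.h>
    #include <papi.h>

    /* Measure hardware cycle counts around a single function using PAPI.
     * This shows the underlying mechanism, not ScALPEL's interface. */
    static void measured_kernel(void) {
        volatile double s = 0.0;
        for (int i = 1; i < 1000000; i++) s += 1.0 / i;
    }

    int main(void) {
        int evset = PAPI_NULL;
        long long counts[1];

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) exit(1);
        if (PAPI_create_eventset(&evset) != PAPI_OK) exit(1);
        if (PAPI_add_event(evset, PAPI_TOT_CYC) != PAPI_OK) exit(1);

        PAPI_start(evset);          /* start counting around the function */
        measured_kernel();
        PAPI_stop(evset, counts);   /* read and stop the counters */

        printf("total cycles: %lld\n", counts[0]);
        return 0;
    }

A tool like the one described would interpose such start/stop pairs at function entry and exit automatically, with the set of instrumented functions and events selectable at run time.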
Hodiex: A Sixth Order Accurate Method for Solving Elliptic PDEs
This paper describes a method for discretizing general linear two-dimensional elliptic PDEs with variable coefficients, Lu = g, which achieves high orders of accuracy on an extended range of problems. The method can be viewed as an extension of the ELLPACK discretization module HODIE ("High Order Difference Approximation with Identity Expansion"), which achieves high orders of accuracy on a more limited class of problems. We thus call this method HODIEX. An advantage of HODIEX methods, including the one described here, is that they are based on a compact 9-point stencil, which yields linear systems with a smaller bandwidth than if a larger stencil were used to achieve higher accuracy.
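To make the acronym concrete: a HODIE-type scheme on the compact 9-point stencil determines coefficients so that a difference combination of solution values equals a combination of right-hand-side values at auxiliary points. In the notation below (mine, not the paper's), the scheme seeks coefficients $\alpha_i$ and $\beta_j$ satisfying

    \sum_{i=1}^{9} \alpha_i \, u(x_i)
        \;=\; \sum_{j=1}^{J} \beta_j \, (Lu)(\tau_j)
        \;=\; \sum_{j=1}^{J} \beta_j \, g(\tau_j),

where the $x_i$ are the nine stencil points and the $\tau_j$ are auxiliary evaluation points; the coefficients are chosen so the identity is exact for polynomials of high enough degree to yield the target order of accuracy.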
On Parallel ELLPACK for Shared Memory Machines
This report outlines design goals and criteria for a new version of ELLPACK for shared memory multiprocessors. We discuss several specific issues in detail and suggest paths for further work. An example illustrates some of the tradeoffs involved in parallelizing a large and complicated mathematical software package such as ELLPACK. We also raise several important questions that must be addressed as packages such as ELLPACK evolve along with modern high-performance architectures.
Optimization by nonhierarchical asynchronous decomposition
Large-scale optimization problems are tractable only if they are somehow decomposed. Hierarchical decompositions are inappropriate for some types of problems and do not parallelize well. Sobieszczanski-Sobieski has proposed a nonhierarchical decomposition strategy for nonlinear constrained optimization that is naturally parallel. Despite some successes on engineering problems, the algorithm as originally proposed fails on simple two-dimensional quadratic programs. The algorithm is carefully analyzed for quadratic programs, and a number of modifications are suggested to improve its robustness.
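For concreteness, a two-dimensional quadratic program, the problem class on which the analysis is carried out, has the standard form (notation mine, not the paper's):

    \min_{x \in \mathbb{R}^2} \; \tfrac{1}{2} x^{\top} Q x + c^{\top} x
        \quad \text{subject to} \quad A x \le b,

with $Q$ symmetric. The abstract's point is that the original decomposition strategy can fail even on instances this simple.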
Toward Parallel Mathematical Software for Elliptic Partial Differential Equations
Three approaches to parallelizing important components of the mathematical software package ELLPACK are considered: an explicit approach using compiler directives available only on the target machine, an automatic approach using an optimizing and parallelizing precompiler, and a two-level approach based on extensive use of a set of low level computational kernels. The focus is on shared memory architectures. Each approach to parallelization is described in detail, along with a discussion of the effort involved. Performance on a test problem, using up to sixteen processors of a Sequent Symmetry S81, is reported and discussed. Implications for the parallelization of a broad class of mathematical software are drawn.
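As a modern analogue of the first approach (the original work used directives specific to the target Sequent machine, which predate OpenMP), a single explicit directive parallelizes a loop while leaving the sequential code intact; the stencil sweep below is an arbitrary illustration, not code from ELLPACK:

    #include <stdio.h>
    #include <omp.h>

    #define N 512

    int main(void) {
        static double u[N][N], unew[N][N];

        /* One explicit directive distributes rows of the sweep across
         * threads; the loop body is unchanged sequential code. */
        #pragma omp parallel for
        for (int i = 1; i < N - 1; i++)
            for (int j = 1; j < N - 1; j++)
                unew[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                   + u[i][j-1] + u[i][j+1]);

        printf("swept with up to %d threads\n", omp_get_max_threads());
        return 0;
    }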
Shortening Time-to-Discovery with Dynamic Software Updates for Parallel High Performance Applications
Despite using multiple concurrent processors, a typical high performance parallel application is long-running, taking hours or even days to arrive at a solution. To modify a running high performance parallel application, the programmer has to stop the computation, change the code, redeploy, and enqueue the updated version to be scheduled to run, wasting not only the programmer's time but also expensive computing resources. To address these inefficiencies, this article describes how dynamic software updates can be used to modify a parallel application on the fly, saving the programmer's time and using expensive computing resources more productively. The net effect of updating parallel applications dynamically is a reduction in their time-to-discovery, the total time it takes from posing a problem to arriving at a solution. To explore the benefits of dynamic updates for high performance applications, this article takes a two-pronged approach. First, we describe our experience in building and evaluating a system for dynamically updating applications running on a parallel cluster. We then review a large body of literature describing the existing state of the art in dynamic software updates and point out how this research can be applied to high performance applications. Our experimental results indicate that dynamic software updates have the potential to become a powerful tool in reducing time-to-discovery for high performance parallel applications.
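As a sketch of the general mechanism (not the system described in the article), the C program below applies an update at a safe point between iterations using the standard dlopen interface; the library name step_update.so and the exported symbol step are hypothetical:

    #include <stdio.h>
    #include <dlfcn.h>

    /* At a safe point between iterations, swap in a new version of a
     * function from a freshly built shared object. Illustrates one common
     * dynamic-update technique, not the article's specific system. */
    typedef void (*step_fn)(int);

    static void step_v1(int iter) { printf("v1 step %d\n", iter); }

    int main(void) {
        step_fn step = step_v1;
        void *handle = NULL;

        for (int iter = 0; iter < 10; iter++) {
            step(iter);

            /* Safe update point: if an updated library has appeared,
             * load it and redirect the function pointer. */
            if (handle == NULL) {
                handle = dlopen("./step_update.so", RTLD_NOW);
                if (handle != NULL) {
                    step_fn next = (step_fn) dlsym(handle, "step");
                    if (next != NULL)
                        step = next;
                }
            }
        }
        return 0;
    }

Updating an MPI application this way additionally requires coordinating the update point across all processes, which is part of what makes the parallel case harder than the sequential one.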