64 research outputs found
Compilation techniques for irregular problems on parallel machines
Massively parallel computers have ushered in the era of teraflop computing. Even though large and powerful machines are being built, they are used by only a fraction of the computing community. The fundamental reason for this situation is that parallel machines are difficult to program. Development of compilers that automatically parallelize programs will greatly increase the use of these machines.

A large class of scientific problems can be categorized as irregular computations. In this class of computation, the data access patterns are known only at runtime, creating significant difficulties for a parallelizing compiler to generate efficient parallel code. Some compilers with very limited abilities to parallelize simple irregular computations exist, but the methods used by these compilers fail for any non-trivial application code.

This research presents compiler transformation techniques that can be used to effectively parallelize an important class of irregular programs. A central aim of these techniques is to generate code that aggressively prefetches data. Program slicing methods are used as part of the code generation process. In this approach, a program written in a data-parallel language, such as HPF, is transformed so that it can be executed on a distributed memory machine. An efficient compiler runtime support system has been developed that performs data movement and software caching.
Software Support for Irregular and Loosely Synchronous Problems
A large class of scientific and engineering applications may be classified as irregular and loosely synchronous from the perspective of parallel processing. We present a partial classification of such problems. This classification has motivated us to enhance Fortran D to provide language support for irregular, loosely synchronous problems. We present techniques for parallelization of such problems in the context of Fortran D.
Run-time and compile-time support for adaptive irregular problems
In adaptive irregular problems, the data arrays are accessed via indirection arrays, and data access patterns change during the computation. Implementing such problems on distributed memory machines requires support for dynamic data partitioning, efficient preprocessing, and fast data migration. This research presents efficient runtime primitives for such problems. The new set of primitives is part of the CHAOS library; it subsumes the previous PARTI library, which targeted only static irregular problems. To demonstrate the efficacy of the runtime support, two real adaptive irregular applications have been parallelized using CHAOS primitives: a molecular dynamics code (CHARMM) and a particle-in-cell code (DSMC). The paper also proposes extensions to Fortran D that allow compilers to generate more efficient code for adaptive problems. These language extensions have been implemented in the Syracuse Fortran 90D/HPF prototype compiler. The performance of the compiler-parallelized codes is compared with the hand-parallelized versions.
Run-time and Compile-time Support for Adaptive Irregular Problems
(Also cross-referenced as UMIACS-TR-94-55)
Supporting Irregular Distributions Using Data-Parallel Languages
Languages such as Fortran D provide irregular distribution schemes that can efficiently support irregular problems. Irregular distributions can also be emulated in HPF. Compilers can incorporate runtime procedures to automatically support these distributions.
Semiannual final report, 1 October 1991 - 31 March 1992
A summary of research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, numerical analysis, and computer science during the period 1 October 1991 through 31 March 1992 is presented.
Adaptive Runtime Support for Direct Simulation Monte Carlo Methods on Distributed Memory Architectures
In highly adaptive irregular problems, such as many Particle-In-Cell (PIC) codes and Direct Simulation Monte Carlo (DSMC) codes, data access patterns may vary from time step to time step. This fluctuation may hinder efficient utilization of distributed memory parallel computers because of the resulting overhead for data redistribution and dynamic load balancing. To efficiently parallelize such adaptive irregular problems on distributed memory parallel computers, several issues such as effective methods for domain partitioning and fast data transportation must be addressed. This paper presents efficient runtime support methods for such problems. A simple one-dimensional domain partitioning method is implemented and compared with unstructured mesh partitioners such as recursive coordinate bisection and recursive inertial bisection. A remapping decision policy has been investigated for dynamic load balancing on 3-dimensional DSMC codes. Performance results are presented.
(Also cross-referenced as UMIACS-TR-95-27)