Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program
We describe the parallel implementation of our generalized stellar atmosphere
and NLTE radiative transfer computer program PHOENIX. We discuss the parallel
algorithms we have developed for radiative transfer, spectral line opacity, and
NLTE opacity and rate calculations. Our implementation uses a MIMD design based
on a relatively small number of MPI library calls. We report the results of
test calculations on a number of different parallel computers and discuss the
results of scalability tests.
Comment: To appear in ApJ, 1997, vol 483. LaTeX, 34 pages, 3 Figures, uses
AASTeX macros and styles natbib.sty and psfig.sty
Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program. II: Wavelength Parallelization
We describe an important addition to the parallel implementation of our
generalized NLTE stellar atmosphere and radiative transfer computer program
PHOENIX. In a previous paper in this series we described data and task parallel
algorithms we have developed for radiative transfer, spectral line opacity, and
NLTE opacity and rate calculations. These algorithms divided the work spatially
or by spectral lines, that is, by distributing the radial zones, individual
spectral lines, or characteristic rays among different processors, and employed,
in addition, task parallelism for logically independent functions (such as
atomic and molecular line opacities). For finite, monotonic velocity fields,
the radiative transfer equation is an initial value problem in wavelength, and
hence each wavelength point depends upon the previous one. However, for
sophisticated NLTE models of both static and moving atmospheres needed to
accurately describe, e.g., novae and supernovae, the number of wavelength
points is very large (200,000--300,000) and hence parallelization over
wavelength can lead both to considerable speedup in calculation time and the
ability to make use of the aggregate memory available on massively parallel
supercomputers. Here, we describe an implementation of a pipelined design for
the wavelength parallelization of PHOENIX, where the necessary data from the
processor working on a previous wavelength point is sent to the processor
working on the succeeding wavelength point as soon as it is known. Our
implementation uses a MIMD design based on a relatively small number of
standard MPI library calls and is fully portable between serial and parallel
computers.
Comment: AAS-TeX, 15 pages, full text with figures available at
ftp://calvin.physast.uga.edu/pub/preprints/Wavelength-Parallel.ps.gz ApJ, in
press
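
The pipelined design described in this abstract can be illustrated with a minimal sketch. It is not the PHOENIX implementation: Python threads and queues stand in for MPI processes and messages, and `upstream_step`, `local_work`, and the round-robin assignment of wavelength points to workers are assumptions made for illustration only. The point it shows is the communication pattern: each worker forwards the data its successor needs as soon as that data is known, then finishes its own remaining work.

```python
import queue
import threading


def upstream_step(prev_state, k):
    # Hypothetical stand-in for the part of wavelength point k
    # that depends on the result at wavelength point k - 1.
    return 0.5 * prev_state + k


def local_work(state, k):
    # Hypothetical stand-in for work that needs only point k's own state.
    return state * state


def serial(n_points, initial=1.0):
    """Reference: walk through the wavelength points one after another."""
    results, state = [], initial
    for k in range(n_points):
        state = upstream_step(state, k)
        results.append(local_work(state, k))
    return results


def pipelined(n_points, n_workers, initial=1.0):
    """Worker r handles points r, r + n_workers, r + 2 * n_workers, ..."""
    # inbox[r] plays the role of the receive side of worker r.
    inbox = [queue.Queue() for _ in range(n_workers)]
    results = [None] * n_points

    def worker(rank):
        for k in range(rank, n_points, n_workers):
            prev_state = inbox[rank].get()  # block until point k - 1 is known
            state = upstream_step(prev_state, k)
            if k + 1 < n_points:
                # Forward the dependency immediately, before the local work.
                inbox[(rank + 1) % n_workers].put(state)
            results[k] = local_work(state, k)

    inbox[0].put(initial)  # seed the first wavelength point
    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(pipelined(8, 3) == serial(8))
```

Because of Python's global interpreter lock, this sketch does not speed anything up; it only demonstrates that forwarding the dependency early preserves the serial result while letting the local work of neighboring points overlap.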
Scalable Parallel Computers for Real-Time Signal Processing
We assess the state-of-the-art technology in massively parallel processors (MPPs) and their variations in different architectural platforms. Architectural and programming issues are identified in using MPPs for time-critical applications such as adaptive radar signal processing. We review the enabling technologies. These include high-performance CPU chips and system interconnects, distributed memory architectures, and various latency hiding mechanisms. We characterize the concept of scalability in three areas: resources, applications, and technology. Scalable performance attributes are analytically defined. Then we compare MPPs with symmetric multiprocessors (SMPs) and clusters of workstations (COWs). The purpose is to reveal their capabilities, limits, and effectiveness in signal processing. We evaluate the IBM SP2 at MHPCC, the Intel Paragon at SDSC, the Cray T3D at Cray Eagan Center, and the Cray T3E and ASCI TeraFLOP system proposed by Intel. On the software and programming side, we evaluate existing parallel programming environments, including the models, languages, compilers, software tools, and operating systems. Some guidelines for program parallelization are provided. We examine data-parallel, shared-variable, message-passing, and implicit programming models. Communication functions and their performance overhead are discussed. Available software tools and communication libraries are also introduced.
S-Store: Streaming Meets Transaction Processing
Stream processing addresses the needs of real-time applications. Transaction
processing addresses the coordination and safety of short atomic computations.
Heretofore, these two modes of operation existed in separate, stove-piped
systems. In this work, we attempt to fuse the two computational paradigms in a
single system called S-Store. In this way, S-Store can simultaneously
accommodate OLTP and streaming applications. We present a simple transaction
model for streams that integrates seamlessly with a traditional OLTP system. We
chose to build S-Store as an extension of H-Store, an open-source, in-memory,
distributed OLTP database system. By implementing S-Store in this way, we can
make use of the transaction processing facilities that H-Store already
supports, and we can concentrate on the additional implementation features that
are needed to support streaming. Similar implementations could be done using
other main-memory OLTP platforms. We show that we can actually achieve higher
throughput for streaming workloads in S-Store than an equivalent deployment in
H-Store alone. We also show how this can be achieved within H-Store with the
addition of a modest amount of new functionality. Furthermore, we compare
S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm,
and show how S-Store matches and sometimes exceeds their performance while
providing stronger transactional guarantees.
Scalable Parallel Numerical CSP Solver
We present a parallel solver for numerical constraint satisfaction problems
(NCSPs) that can scale on a number of cores. Our proposed method runs worker
solvers on the available cores while the workers cooperate to distribute and
balance the search space. In the experiments, we attained up to
119-fold speedup using 256 cores of a parallel computer.
Comment: The final publication is available at Springer
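
To put the reported figure in context, the short sketch below converts a speedup and core count into parallel efficiency and into the Karp-Flatt experimentally determined serial fraction. Neither metric is mentioned in the abstract; they are standard definitions applied here to the reported numbers (119-fold speedup on 256 cores), and the function names are illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup / cores


def karp_flatt_serial_fraction(speedup, cores):
    """Karp-Flatt metric: e = (1/S - 1/p) / (1 - 1/p), defined for p > 1."""
    if cores <= 1:
        raise ValueError("cores must be greater than 1")
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    s, p = 119.0, 256  # figures reported in the abstract
    print(f"efficiency: {parallel_efficiency(s, p):.3f}")
    print(f"Karp-Flatt serial fraction: {karp_flatt_serial_fraction(s, p):.4f}")
```

An efficiency a little under one half at 256 cores corresponds to a small effective serial fraction, which is consistent with the abstract's claim that the solver scales, although the abstract itself does not report these derived values.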
MatlabMPI
The true costs of high performance computing are currently dominated by
software. Addressing these costs requires shifting to high productivity
languages such as Matlab. MatlabMPI is a Matlab implementation of the Message
Passing Interface (MPI) standard and allows any Matlab program to exploit
multiple processors. MatlabMPI currently implements the basic six functions
that are the core of the MPI point-to-point communications standard. The key
technical innovation of MatlabMPI is that it implements the widely used MPI
``look and feel'' on top of standard Matlab file I/O, resulting in an extremely
compact (~250 lines of code) and ``pure'' implementation which runs anywhere
Matlab runs, and on any heterogeneous combination of computers. The performance
has been tested on both shared and distributed memory parallel computers (e.g.
Sun, SGI, HP, IBM, Linux and MacOSX). MatlabMPI can match the bandwidth of C
based MPI at large message sizes. A test image filtering application using
MatlabMPI achieved a speedup of ~300 using 304 CPUs and ~15% of the theoretical
peak (450 Gigaflops) on an IBM SP2 at the Maui High Performance Computing
Center. In addition, this entire parallel benchmark application was implemented
in 70 software-lines-of-code, illustrating the high productivity of this
approach. MatlabMPI is available for download on the web
(www.ll.mit.edu/MatlabMPI).
Comment: Download software from http://www.ll.mit.edu/MatlabMPI, 12 pages
including 7 color figures; submitted to the Journal of Parallel and
Distributed Computing
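
The idea of layering MPI-style point-to-point messaging on ordinary file I/O can be sketched briefly. The code below is not MatlabMPI and does not reproduce its file layout: it is a Python illustration, and the names `file_send` and `file_recv`, the file naming scheme, and the use of a separate lock file to signal that a message is complete are all assumptions made for this example. The sender writes the payload to a buffer file in a shared directory and only then creates the lock file; the receiver polls for the lock file before reading.

```python
import os
import pickle
import tempfile
import time


def _paths(comm_dir, source, dest, tag):
    # Hypothetical naming scheme: one buffer file and one lock file per message.
    stem = os.path.join(comm_dir, f"p{source}_p{dest}_t{tag}")
    return stem + "_buffer.pkl", stem + "_lock"


def file_send(comm_dir, source, dest, tag, data):
    """Write the message, then create the lock file to mark it complete."""
    buffer_path, lock_path = _paths(comm_dir, source, dest, tag)
    with open(buffer_path, "wb") as f:
        pickle.dump(data, f)
    # The lock file is created only after the buffer is fully written,
    # so a receiver never reads a partial message.
    with open(lock_path, "w"):
        pass


def file_recv(comm_dir, source, dest, tag, poll=0.01, timeout=5.0):
    """Poll for the lock file, then read and clean up the message."""
    buffer_path, lock_path = _paths(comm_dir, source, dest, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(lock_path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message from {source} with tag {tag}")
        time.sleep(poll)
    with open(buffer_path, "rb") as f:
        data = pickle.load(f)
    os.remove(lock_path)
    os.remove(buffer_path)
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as comm_dir:
        file_send(comm_dir, source=0, dest=1, tag=7, data={"rows": [1, 2, 3]})
        print(file_recv(comm_dir, source=0, dest=1, tag=7))
```

In a real deployment the directory would sit on a file system visible to every participating machine, which is what lets such a scheme run on heterogeneous combinations of computers, at the cost of file-system latency for small messages.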