Managing Communication Latency-Hiding at Runtime for Parallel Programming Languages and Libraries
This work introduces a runtime model for managing communication with support
for latency-hiding. The model enables non-computer science researchers to
exploit communication latency-hiding techniques seamlessly. For compiled
languages, it is often possible to create efficient schedules for
communication, but this is not the case for interpreted languages. By
maintaining data dependencies between scheduled operations, it is possible to
aggressively initiate communication and lazily evaluate tasks to allow maximal
time for the communication to finish before entering a wait state. We implement
a heuristic of this model in DistNumPy, an auto-parallelizing version of
numerical Python that allows sequential NumPy programs to run on distributed
memory architectures. Furthermore, we present performance comparisons for eight
benchmarks with and without automatic latency-hiding. The results show that
our model reduces the time spent waiting for communication by as much as 27
times, from a maximum of 54% to only 2% of the total execution time, in a
stencil application. Comment: PREPRIN
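The scheduling idea in this abstract (track data dependencies, initiate communication as early as possible, and defer computation so that independent work overlaps the transfers) can be illustrated with a small virtual-clock simulation. This is only a sketch under stated assumptions, not the DistNumPy implementation: the `Op` record, the `simulate` function, the operation names, and the cost values are all hypothetical and chosen for illustration. The baseline considers operations strictly in program order; the latency-hiding mode may pick any operation whose dependencies are satisfied.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """One scheduled operation (hypothetical record, for illustration only)."""
    name: str
    kind: str          # "comm" (non-blocking transfer) or "compute"
    cost: float        # transfer latency or compute time, arbitrary units
    deps: tuple = ()   # names of operations that must complete first


def simulate(ops, hide_latency):
    """Return (total_time, wait_time) for one process on a virtual clock."""
    finish = {}                 # op name -> virtual completion time
    clock = waited = 0.0
    pending = list(ops)         # kept in program order

    def startable(op):
        # An op may start once all its dependencies have completed.
        return all(finish.get(d, float("inf")) <= clock for d in op.deps)

    while pending:
        # Baseline: look only at the next op. Latency-hiding: look at all.
        window = pending if hide_latency else pending[:1]
        runnable = [op for op in window if startable(op)]
        # Prefer communication so transfers are initiated as early as possible
        # (the sort is stable, so program order is kept within each kind).
        runnable.sort(key=lambda op: op.kind != "comm")
        if runnable:
            op = runnable[0]
            pending.remove(op)
            if op.kind == "comm":
                finish[op.name] = clock + op.cost   # starts now, runs in background
            else:
                clock += op.cost                    # computation occupies the process
                finish[op.name] = clock
            continue
        # Nothing can run: block until the earliest in-flight transfer completes.
        in_flight = [t for t in finish.values() if t > clock]
        if not in_flight:
            raise ValueError("unsatisfiable dependency")
        waited += min(in_flight) - clock
        clock = min(in_flight)

    # Account for transfers still in flight after the last computation.
    end = max(finish.values(), default=clock)
    waited += max(0.0, end - clock)
    return max(end, clock), waited


# A stencil-like step: the boundary update needs the halo, the interior does not.
ops = [
    Op("halo_exchange", "comm", 4.0),
    Op("update_boundary", "compute", 2.0, deps=("halo_exchange",)),
    Op("update_interior", "compute", 6.0),
]
print(simulate(ops, hide_latency=False))  # (12.0, 4.0)
print(simulate(ops, hide_latency=True))   # (8.0, 0.0)
```

In this toy example the interior update is reordered ahead of the boundary update, so the transfer finishes while useful work is being done and the wait time drops to zero. The numbers are artifacts of the assumed costs and say nothing about the benchmarks reported in the paper.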
Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program
We describe the parallel implementation of our generalized stellar atmosphere
and NLTE radiative transfer computer program PHOENIX. We discuss the parallel
algorithms we have developed for radiative transfer, spectral line opacity, and
NLTE opacity and rate calculations. Our implementation uses a MIMD design based
on a relatively small number of MPI library calls. We report the results of
test calculations on a number of different parallel computers and discuss the
results of scalability tests. Comment: To appear in ApJ, 1997, vol 483. LaTeX, 34 pages, 3 Figures, uses
AASTeX macros and styles natbib.sty, and psfig.st
Longitudinal Phase Space Tomography with Space Charge
Tomography is now a very broad topic with a wealth of algorithms for the reconstruction of both qualitative and quantitative images. In an extension to the domain of particle accelerators, one of the simplest algorithms has been modified to take into account the non-linearity of large-amplitude synchrotron motion. This permits the accurate reconstruction of longitudinal phase space density from one-dimensional bunch profile data. The method is a hybrid one which incorporates particle tracking. Hitherto, a very simple tracking algorithm has been employed because only a brief span of measured profile data is required to build a snapshot of phase space. This is one of the strengths of the method, as tracking for relatively few turns relaxes the precision to which input machine parameters need to be known. The recent addition of longitudinal space charge considerations as an optional refinement of the code is described. Simplicity suggested an approach based on the derivative of bunch shape, with the properties of the vacuum chamber parametrized by a single value of distributed reactive impedance and by a geometrical coupling coefficient. This is sufficient to model the dominant collective effects in machines of low to moderate energy. In contrast to simulation codes, binning is not an issue since the profiles to be differentiated are measured ones. The program is written in Fortran 90 with High-Performance Fortran (HPF) extensions for parallel processing. A major effort has been made to identify and remove execution bottlenecks, for example by reducing floating-point calculations and recoding slow intrinsic functions. A pointer-like mechanism which avoids the problems associated with pointers and parallel processing has been implemented. This is required to handle the large, sparse matrices that the algorithm employs. Results obtained with and without the inclusion of space charge are presented and compared for proton beams in the CERN PS Booster.
Comparisons of execution times on different platforms are presented, and the chosen solution for our application program, which uses a dual-processor PC for the number crunching, is described.
Tomographic Measurements of Longitudinal Phase Space Density
Tomography, the reconstruction of a two-dimensional image from a series of its one-dimensional projections, is now a very broad topic with a wealth of algorithms for the reconstruction of both qualitative and quantitative images. One of the simplest algorithms has been modified to take into account the non-linearity of large-amplitude synchrotron motion in a particle accelerator. This permits the accurate reconstruction of longitudinal phase space density from one-dimensional bunch profile data. The algorithm was developed in Mathematica™ in order to exploit the extensive built-in functions and graphics. Subsequently, it has been recoded in Fortran 90 with the aim of reducing the execution time by at least a factor of one hundred. The choice of Fortran 90 was governed by the desire ultimately to exploit parallel architectures, but sequential compilation and execution have already largely yielded the required gain in speed. The use of the method to produce longitudinal phase space plots and animated sequences of the evolution of phase space density, and to estimate accelerator parameters, is presented. More generally, the new algorithm constitutes an extension of computerized tomography which caters for non-rigid bodies whose projections cannot be measured simultaneously.
Experimental and Numerical Study of a Mobile Reversible Air Conditioning-Heat Pump System
Electric vehicles (EVs) suffer from range anxiety, and traditional resistive heating consumes a large amount of electric energy, substantially reducing EV driving range. A mobile reversible air conditioning-heat pump (AC/HP) system is an energy-efficient way of heating the EV cabin. In this paper, an AC/HP system was built based on the Nissan Leaf system configuration and studied experimentally. This system consists of three heat exchangers, an open-shaft compressor, two expansion valves, and two flow control valves. Heating performance of the system under various operating conditions was extensively investigated. Controlling subcooling was found to be a beneficial way of obtaining higher energy efficiency. Refrigerant charge imbalance when switching modes was found to be a challenge, and was studied both experimentally and numerically. Careful positioning of the expansion valves and sizing of the liquid lines in both modes are essential for avoiding large charge imbalance. Component-wise, the outdoor heat exchanger holds much more charge in AC mode than in HP mode. A steady-state simulation model of the components and the system was developed and reasonably validated against experimental data. Options for improving the system based on model predictions are provided and discussed.