Instrumentation, performance visualization, and debugging tools for multiprocessors
The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging and tuning parallel programs become intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. In addition to the investigation of new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.
A GPU Implementation for Two-Dimensional Shallow Water Modeling
In this paper, we present a GPU implementation of a two-dimensional shallow
water model. Water simulations are useful for modeling floods, river/reservoir
behavior, and dam break scenarios. Our GPU implementation shows vast
performance improvements over the original Fortran implementation. By taking
advantage of the GPU, researchers and engineers will be able to study water
systems more efficiently and in greater detail. Comment: 9 pages, 1 figure
Phenomenology Tools on Cloud Infrastructures using OpenStack
We present a new environment for computations in particle physics
phenomenology employing recent developments in cloud computing. In this
environment, users can create and manage "virtual" machines on which the
phenomenology codes/tools can be deployed easily in an automated way. We
analyze the performance of this environment based on "virtual" machines versus
the utilization of "real" physical hardware. In this way we provide a
qualitative result for the influence of the host operating system on the
performance of a representative set of applications for phenomenology
calculations. Comment: 25 pages, 12 figures; information on memory usage included, as well
as minor modifications. Version to appear in EPJ
DART-MPI: An MPI-based Implementation of a PGAS Runtime System
A Partitioned Global Address Space (PGAS) approach treats a distributed
system as if the memory were shared on a global level. Given such a global view
on memory, the user may program applications much as on shared-memory
systems. This greatly simplifies the task of developing parallel applications,
because no explicit communication has to be specified in the program for data
exchange between different computing nodes. In this paper we present DART, a
runtime environment that implements the PGAS paradigm on large-scale
high-performance computing clusters. A specific feature of our implementation
is the use of one-sided communication of the Message Passing Interface (MPI)
version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated
the performance of the implementation with several low-level kernels in order
to determine overheads and limitations in comparison to the underlying MPI-3. Comment: 11 pages, International Conference on Partitioned Global Address
Space Programming Models (PGAS14)