20,339 research outputs found

    Instrumentation, performance visualization, and debugging tools for multiprocessors

    Get PDF
    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging, and tuning parallel programs becomes intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs

    A GPU Implementation for Two-Dimensional Shallow Water Modeling

    Full text link
    In this paper, we present a GPU implementation of a two-dimensional shallow water model. Water simulations are useful for modeling floods, river/reservoir behavior, and dam break scenarios. Our GPU implementation shows vast performance improvements over the original Fortran implementation. By taking advantage of the GPU, researchers and engineers will be able to study water systems more efficiently and in greater detail.Comment: 9 pages, 1 figur

    Acceleration of a Full-scale Industrial CFD Application with OP2

    Get PDF

    Phenomenology Tools on Cloud Infrastructures using OpenStack

    Get PDF
    We present a new environment for computations in particle physics phenomenology employing recent developments in cloud computing. On this environment users can create and manage "virtual" machines on which the phenomenology codes/tools can be deployed easily in an automated way. We analyze the performance of this environment based on "virtual" machines versus the utilization of "real" physical hardware. In this way we provide a qualitative result for the influence of the host operating system on the performance of a representative set of applications for phenomenology calculations.Comment: 25 pages, 12 figures; information on memory usage included, as well as minor modifications. Version to appear in EPJ

    DART-MPI: An MPI-based Implementation of a PGAS Runtime System

    Full text link
    A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This greatly simplifies the tasks of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment, which implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of one-sided communication of the Message Passing Interface (MPI) version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14
    • …
    corecore