547 research outputs found

    From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

    Full text link
    Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the efficient use of large-scale CPU/GPU systems for complex applications without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms. Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming
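    To give a flavor of the kind of inner-loop code such a framework ultimately emits for one discretized field, the sketch below hand-writes a fourth-order central finite-difference kernel parallelized with OpenMP. This is a minimal illustration only; the field name, grid size, and loop structure are assumptions for the example and are not taken from Chemora's actual generated output.

        #include <cmath>
        #include <cstdio>
        #include <vector>

        // Hypothetical generated kernel: fourth-order central-difference
        // approximation of d2u/dx2 on the interior of a 1D grid, using the
        // standard stencil (-1, 16, -30, 16, -1) / (12 dx^2).
        static void d2_dx2_4th(const std::vector<double>& u,
                               std::vector<double>& d2u, double dx) {
            const double idx2 = 1.0 / (12.0 * dx * dx);
            const long n = static_cast<long>(u.size());
        #pragma omp parallel for
            for (long i = 2; i < n - 2; ++i)
                d2u[i] = (-u[i - 2] + 16.0 * u[i - 1] - 30.0 * u[i]
                          + 16.0 * u[i + 1] - u[i + 2]) * idx2;
        }

        int main() {
            const double pi = std::acos(-1.0);
            const long n = 1 << 20;                 // illustrative grid size
            const double dx = 1.0 / (n - 1);
            std::vector<double> u(n), d2u(n, 0.0);
            for (long i = 0; i < n; ++i)            // smooth test field: sin(2*pi*x)
                u[i] = std::sin(2.0 * pi * i * dx);
            d2_dx2_4th(u, d2u, dx);
            // For u = sin(2*pi*x), the exact second derivative is -(2*pi)^2 * u.
            std::printf("d2u[n/4] = %.4f, exact = %.4f\n",
                        d2u[n / 4], -4.0 * pi * pi * u[n / 4]);
            return 0;
        }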

    Heterogeneous hierarchical workflow composition

    Get PDF
    Workflow systems promise scientists an automated end-to-end path from hypothesis to discovery. However, expecting any single workflow system to deliver such a wide range of capabilities is impractical. A more practical solution is to compose the end-to-end workflow from more than one system. With this goal in mind, the integration of task-based and in situ workflows is explored, where the result is a hierarchical heterogeneous workflow composed of subworkflows, with different levels of the hierarchy using different programming, execution, and data models. Materials science use cases demonstrate the advantages of such heterogeneous hierarchical workflow composition. This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, and by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2014-SGR-1051).
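    The composition pattern itself can be sketched in a few lines: an outer, task-based workflow in which one coarse task is a simulation that drives an inner in situ pipeline coupled to its time loop. The sketch below is purely illustrative; all names, types, and the callback-based coupling are assumptions for the example and do not reflect the workflow systems or APIs used in the paper.

        #include <cstdio>
        #include <functional>
        #include <vector>

        // Illustrative in situ subworkflow: analysis stages invoked on
        // in-memory data after every simulation step, without touching disk.
        using InSituStage = std::function<void(int step, const std::vector<double>&)>;

        // One coarse task of the outer, task-based workflow: a time-stepping
        // simulation that drives an inner in situ pipeline with a different
        // execution model (tightly coupled, per-step invocation).
        static void simulate(int steps, const std::vector<InSituStage>& pipeline) {
            std::vector<double> field(1024, 0.0);
            for (int s = 0; s < steps; ++s) {
                for (double& v : field) v += 1.0;      // stand-in for real physics
                for (const auto& stage : pipeline)     // in situ subworkflow
                    stage(s, field);
            }
        }

        int main() {
            // Inner level: in situ analysis coupled to the running simulation.
            std::vector<InSituStage> inSitu = {
                [](int s, const std::vector<double>& f) {
                    double sum = 0.0;
                    for (double v : f) sum += v;
                    std::printf("step %d: mean = %f\n", s, sum / f.size());
                }};

            // Outer level: a task-based workflow expressed as an ordered task list.
            std::vector<std::function<void()>> outerWorkflow = {
                [&] { simulate(3, inSitu); },                   // simulation task
                [&] { std::printf("post-processing task\n"); }  // downstream task
            };
            for (auto& task : outerWorkflow) task();
            return 0;
        }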

    PetaFlow: a global computing-networking-visualisation unit with social impact

    Get PDF
    The PetaFlow application aims to contribute to the use of high-performance computational resources for the benefit of society. To this end, the emergence of adequate information and communication technologies with respect to high-performance computing, networking, and visualisation, and their mutual awareness, is required. The developed technology and algorithms are presented and applied to a real global peta-scale data-intensive scientific problem with social and medical importance, i.e. human upper airflow modelling.

    Parallelization and integration of fire simulations in the Uintah PSE

    Get PDF
    A physics-based stand-alone serial code for fire simulations is integrated into a unified computational framework to couple with other disciplines and to achieve massively parallel computation. Uintah, the computational framework used, is a component-based visual problem-solving environment developed at the University of Utah. It provides the framework for large-scale parallelization of different applications. The integration of the legacy fire code in Uintah is built on three principles: 1) develop different reusable physics-based components that can be used interchangeably and interact with other components, 2) reuse the legacy stand-alone fire code (written in Fortran) as much as possible, and 3) use components developed by third parties, specifically non-linear and linear solvers designed for solving complex-flow problems. A helium buoyant plume is simulated using the Nirvana machine at Los Alamos National Laboratory. Linear scalability is achieved up to 128 processors. Issues related to scaling beyond 128 processors are also discussed.
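    Reusing a legacy Fortran code behind a component interface (principles 1 and 2 above) typically comes down to binding the Fortran routine into C++ and hiding it behind an abstract class that other components program against. The sketch below illustrates that general technique only; the routine name firestep, its argument list, and the FireModel interface are hypothetical and are not Uintah's actual API. The stub body stands in for the separately compiled Fortran object so the example is self-contained.

        #include <cstdio>
        #include <vector>

        // In a real build, "firestep" would be a legacy Fortran subroutine
        // compiled separately; with typical name mangling it is visible to
        // C/C++ as "firestep_", taking every argument by pointer. This stub
        // stands in for that Fortran object here.
        extern "C" void firestep_(const int* ncells, double* temperature, const double* dt) {
            for (int i = 0; i < *ncells; ++i)
                temperature[i] += 10.0 * (*dt);   // placeholder for the real chemistry
        }

        // Reusable component interface: the rest of the framework sees only this,
        // so the legacy implementation can be swapped for another one.
        class FireModel {
        public:
            virtual ~FireModel() = default;
            virtual void advance(std::vector<double>& temperature, double dt) = 0;
        };

        // Adapter component that delegates to the legacy code behind the interface.
        class LegacyFireModel : public FireModel {
        public:
            void advance(std::vector<double>& temperature, double dt) override {
                const int ncells = static_cast<int>(temperature.size());
                firestep_(&ncells, temperature.data(), &dt);
            }
        };

        int main() {
            std::vector<double> temperature(8, 300.0);
            LegacyFireModel fire;
            fire.advance(temperature, 0.01);
            std::printf("T[0] after one step: %.2f K\n", temperature[0]);
            return 0;
        }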

    Design and optimization of a portable LQCD Monte Carlo code using OpenACC

    Full text link
    The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPU processors, supporting a wide class of applications but delivering moderate computing performance, to many-core GPUs, exploiting aggressive data-parallelism and delivering higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing, where code changes are very frequent, making it tedious and error-prone to keep different code versions aligned. In this work we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. OpenACC abstracts parallel programming to a descriptive level, relieving programmers from specifying how codes should be mapped onto the target architecture. We describe the implementation of a code fully written in OpenACC, and show that we are able to target several different architectures, including state-of-the-art traditional CPUs and GPUs, with the same code. We also measure performance, evaluating the computing efficiency of our OpenACC code on several architectures, comparing with GPU-specific implementations and showing that a good level of performance-portability can be reached. Comment: 26 pages, 2 png figures, preprint of an article submitted for consideration in International Journal of Modern Physics
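    The descriptive, directive-based style can be seen in a generic kernel like the one below: the pragmas state the available parallelism and the required data movement, and the compiler chooses how to map the loop onto a GPU or a multi-core CPU; without an OpenACC compiler the pragmas are simply ignored and the loop runs serially. This is a minimal sketch of the programming model only, not code from the LQCD application described in the paper.

        #include <cstdio>
        #include <vector>

        // Generic axpy-style update of the kind found inside solver or
        // molecular-dynamics steps: y = a*x + y. The directive describes the
        // parallelism and data movement; the mapping is left to the compiler.
        static void axpy(double a, const double* x, double* y, long n) {
        #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
            for (long i = 0; i < n; ++i)
                y[i] = a * x[i] + y[i];
        }

        int main() {
            const long n = 1 << 20;
            std::vector<double> x(n, 2.0), y(n, 1.0);
            axpy(0.5, x.data(), y.data(), n);
            std::printf("y[0] = %.1f (expected 2.0)\n", y[0]);
            return 0;
        }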

    CAVE 3D: Software Extensions for Scientific Visualization of Large-scale Models

    Get PDF
    Numerical analysis of large-scale and multidisciplinary problems on high-performance computer systems is one of the main computational challenges of the 21st century. The amount of data processed in complex systems analyses approaches the peta- and exascale. The technical possibility of real-time visualization, post-processing, and analysis of large-scale models is extremely important for carrying out comprehensive numerical studies. Powerful visualization is going to play an important role in the future of large-scale models. In this paper, we describe several software extensions, developed by our team for 3D virtual environment systems such as CAVEs and Powerwalls, aimed at improving visualization performance for large-scale models. These extensions include an algorithm for real-time generation of isosurfaces on large meshes and a visualization system designed for massively parallel computing environments. In addition, we describe an augmented reality system developed by the Stuttgart-based part of our team.
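    Real-time isosurface generation is feasible on large meshes largely because the per-cell work is independent. The sketch below only illustrates that property with the first stage of a marching-cubes-style pass (classifying which grid cells the isosurface crosses), parallelized with OpenMP; it is an assumption-laden illustration on a regular grid, not the algorithm developed in the paper.

        #include <cstdio>
        #include <vector>

        // Classify which cells of a regular scalar grid are crossed by the
        // isovalue "iso": a cell is crossed when its eight corner values do not
        // all lie on the same side of iso. Each cell is tested independently,
        // which is what makes this stage embarrassingly parallel.
        static long countCrossedCells(const std::vector<float>& f,
                                      int nx, int ny, int nz, float iso) {
            long crossed = 0;
            auto at = [&](int i, int j, int k) { return f[(static_cast<size_t>(k) * ny + j) * nx + i]; };
        #pragma omp parallel for reduction(+ : crossed) collapse(2)
            for (int k = 0; k < nz - 1; ++k)
                for (int j = 0; j < ny - 1; ++j)
                    for (int i = 0; i < nx - 1; ++i) {
                        bool above = false, below = false;
                        for (int c = 0; c < 8; ++c) {
                            float v = at(i + (c & 1), j + ((c >> 1) & 1), k + ((c >> 2) & 1));
                            (v >= iso ? above : below) = true;
                        }
                        if (above && below) ++crossed;
                    }
            return crossed;
        }

        int main() {
            const int nx = 64, ny = 64, nz = 64;
            std::vector<float> f(static_cast<size_t>(nx) * ny * nz);
            for (int k = 0; k < nz; ++k)            // synthetic field: f = k (a ramp in z)
                for (int j = 0; j < ny; ++j)
                    for (int i = 0; i < nx; ++i)
                        f[(static_cast<size_t>(k) * ny + j) * nx + i] = static_cast<float>(k);
            std::printf("cells crossed by iso = 31.5: %ld\n",
                        countCrossedCells(f, nx, ny, nz, 31.5f));
            return 0;
        }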