
    Parallel multithreading algorithms for self-gravity computation in ESyS-Particle

    This thesis describes the design, implementation, and evaluation of efficient algorithms for self-gravity simulations in astronomical agglomerates. Due to the intrinsic complexity of modeling interactions between particles, agglomerates are studied using computational simulations. Self-gravity affects every particle in an agglomerate, which can be composed of millions of particles, so realistic simulations are computationally expensive. This thesis presents three parallel multithreading algorithms for self-gravity calculation, including a method that updates the occupied cells on an underlying grid and a variation of the Barnes & Hut method that partitions and arranges the simulation space in both an octal and a binary tree to speed up the calculation of long-range forces. The goal of the algorithms is to make efficient use of the underlying grid that maps the simulated environment. The methods were evaluated and compared over two scenarios: two agglomerates orbiting each other and a collapsing cube. The experimental evaluation comprises a performance analysis of the two scenarios, including a comparison of the results obtained and an analysis of numerical accuracy through the conservation of the center of mass and angular momentum. Both scenarios were evaluated while scaling the number of computational resources to simulate instances with different numbers of particles. Results show that the proposed octal-tree Barnes & Hut method improves the performance of the self-gravity calculation by a factor of up to 100 with respect to the occupied-cell method. This way, efficient simulations are performed for the largest problem instance, comprising 2,097,152 particles. The proposed algorithms are efficient and accurate methods for self-gravity simulations in astronomical agglomerates.
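
    The abstract above mentions a Barnes & Hut tree code without reproducing it; as a minimal sketch of the idea, assuming a simple octree node with hypothetical names (this is not ESyS-Particle code), the opening-angle test that lets a distant cell be treated as a single body looks roughly like this:

        # Minimal Barnes & Hut sketch (hypothetical names; not ESyS-Particle code).
        import numpy as np

        G = 6.674e-11   # gravitational constant (SI units)
        THETA = 0.5     # opening-angle threshold: smaller is more accurate, slower

        class Node:
            """Octree cell: a leaf holding one particle, or an internal node
            summarised by the total mass and centre of mass of its children."""
            def __init__(self, size, mass, com, children=()):
                self.size = size                         # edge length of the cubic cell
                self.mass = mass                         # total mass inside the cell
                self.com = np.asarray(com, dtype=float)  # centre of mass of the cell
                self.children = children                 # empty tuple for a leaf

        def accel_on(pos, node, eps=1e-12):
            """Acceleration at `pos` due to `node`; a distant cell is approximated
            by its centre of mass whenever size / distance < THETA."""
            d = node.com - np.asarray(pos, dtype=float)
            r = np.linalg.norm(d) + eps
            if not node.children or node.size / r < THETA:
                return G * node.mass * d / r**3          # whole cell acts as one body
            return sum((accel_on(pos, c, eps) for c in node.children), np.zeros(3))

    Setting THETA to zero forces the traversal to visit every leaf, recovering direct O(N^2) summation; tree codes gain their speed by pruning distant subtrees at coarser octree levels.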

    CompF2: Theoretical Calculations and Simulation Topical Group Report

    This report summarizes the work of the Computational Frontier topical group on theoretical calculations and simulation for Snowmass 2021. We discuss the challenges, potential solutions, and needs facing six diverse but related topical areas that span the subject of theoretical calculations and simulation in high energy physics (HEP): cosmic calculations, particle accelerator modeling, detector simulation, event generators, perturbative calculations, and lattice QCD (quantum chromodynamics). The challenges arise from the next generations of HEP experiments, which will include more complex instruments, provide larger data volumes, and perform more precise measurements. Calculations and simulations will need to keep up with these increased requirements. The other aspect of the challenge is the evolution of the computing landscape away from general-purpose computing on CPUs and toward special-purpose accelerators and coprocessors such as GPUs and FPGAs. These newer devices can provide substantial improvements for certain categories of algorithms, at the expense of more specialized programming and memory and data access patterns. Comment: Report of the Computational Frontier Topical Group on Theoretical Calculations and Simulation for Snowmass 2021

    Cactus Framework: Black Holes to Gamma Ray Bursts

    Gamma Ray Bursts (GRBs) are intense, narrowly beamed flashes of gamma-rays of cosmological origin. They are among the most scientifically interesting astrophysical systems, and the riddle concerning their central engines and emission mechanisms is one of the most complex and challenging problems of astrophysics today. In this article we outline our petascale approach to the GRB problem and discuss the computational toolkits and numerical codes that are currently in use and that will be scaled up to run on emerging petaflop-scale computing platforms in the near future. Petascale computing will require additional ingredients over conventional parallelism. We consider some of the challenges that future petascale architectures will pose, and discuss our plans for the future development of the Cactus framework and its applications to meet these challenges and profit from these new architectures.

    A Large Scale Simulation of Satellites Tracking Vessels and Other Targets

    This research outlines the design of a large-scale simulation of satellites tracking large numbers of dynamic targets. The uses of such a simulation are presented, and currently available solutions are reviewed. The research sets out a list of objectives to be met by creating an application programming interface (API) that is efficient, scalable, flexible, and easy to use for the implementer. Methods of creating sections of the simulation, such as the attitude motion of a satellite based on the physical characteristics of nanosatellites, are explored and developed. The creation of targets that are confined to certain land features is also developed and tested. The objectives set out are tested by creating a simulation using the developed API, and the results are presented.
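
    The thesis API itself is not reproduced here; purely as an illustration of the kind of attitude-motion model mentioned above (torque-free rigid-body rotation derived from a nanosatellite's physical characteristics), a minimal sketch with hypothetical names and parameters might look like this:

        # Hypothetical sketch: torque-free attitude propagation for a box-shaped
        # nanosatellite (illustrative names and parameters, not the thesis API).
        import numpy as np

        def box_inertia(mass, x, y, z):
            """Principal moments of inertia of a uniform rectangular box (kg m^2)."""
            return np.array([mass * (y**2 + z**2) / 12.0,
                             mass * (x**2 + z**2) / 12.0,
                             mass * (x**2 + y**2) / 12.0])

        def propagate(omega, inertia, dt, steps):
            """Integrate Euler's rotational equations I dw/dt = -(w x I w) with a
            simple forward-Euler step; returns the angular-velocity history."""
            history = [omega.copy()]
            for _ in range(steps):
                omega = omega - dt * np.cross(omega, inertia * omega) / inertia
                history.append(omega.copy())
            return np.array(history)

        # Example: a 3U-CubeSat-like body (4 kg, 10 x 10 x 30 cm) tumbling slowly.
        I = box_inertia(4.0, 0.10, 0.10, 0.30)
        w_hist = propagate(np.array([0.05, 0.01, 0.0]), I, dt=0.1, steps=600)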

    From Piz Daint to the Stars: Simulation of Stellar Mergers using High-Level Abstractions

    We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite-volume, grid-based hydrodynamics simulation code with adaptive mesh refinement which is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems, Octo-Tiger relies on high-level programming abstractions. We use HPX with its futurization capabilities to ensure scalability both between and within nodes, and present first results replacing MPI with libfabric, achieving up to a 2.8x speedup. We extend Octo-Tiger to heterogeneous GPU-accelerated supercomputers, demonstrating node-level performance and portability. We show scalability up to full-system runs on Piz Daint. For the scenario's maximum resolution, the compute-critical parts (hydrodynamics and gravity) achieve 68.1% parallel efficiency at 2048 nodes. Comment: Accepted at SC19
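
    Octo-Tiger's futurization is built on HPX, a C++ runtime, so the sketch below is only a conceptual Python analogue of the underlying idea (launching per-block work as futures and composing results instead of blocking between phases); the task names are hypothetical and this is not Octo-Tiger code:

        # Conceptual Python analogue of futurization (HPX itself is a C++ runtime;
        # task names are hypothetical and this is not Octo-Tiger code).
        from concurrent.futures import ThreadPoolExecutor

        def hydro_step(block):
            return [x + 1.0 for x in block]     # stand-in for the hydro solver

        def gravity_step(block):
            return [x * 0.5 for x in block]     # stand-in for the gravity solver

        blocks = [[float(i)] * 8 for i in range(4)]
        with ThreadPoolExecutor() as pool:
            # Launch hydro and gravity for every block without blocking in between;
            # results are only gathered when the combined value is actually needed.
            hydro = [pool.submit(hydro_step, b) for b in blocks]
            grav = [pool.submit(gravity_step, b) for b in blocks]
            combined = [[h + g for h, g in zip(hf.result(), gf.result())]
                        for hf, gf in zip(hydro, grav)]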

    Energy challenges for ICT

    The energy consumption from the expanding use of information and communications technology (ICT) is unsustainable with present drivers, and it will impact heavily on future climate change. However, ICT devices have the potential to contribute significantly to the reduction of CO2 emissions and enhance resource efficiency in other sectors, e.g., transportation (through intelligent transportation and advanced driver assistance systems and self-driving vehicles), heating (through smart building control), and manufacturing (through digital automation based on smart autonomous sensors). To address the energy sustainability of ICT and capture the full potential of ICT in resource efficiency, a multidisciplinary ICT-energy community needs to be brought together covering devices, microarchitectures, ultra large-scale integration (ULSI), high-performance computing (HPC), energy harvesting, energy storage, system design, embedded systems, efficient electronics, static analysis, and computation. In this chapter, we introduce challenges and opportunities in this emerging field and a common framework to strive towards energy-sustainable ICT.

    A multi-GPU shallow-water simulation with transport of contaminants

    This work presents cost-effective multi-graphics processing unit (GPU) parallel implementations of a finite-volume numerical scheme for solving pollutant transport problems in two-dimensional domains. The fluid is modeled by the 2D shallow-water equations, whereas the transport of the pollutant is modeled by a transport equation. The 2D domain is discretized using a first-order Roe finite-volume scheme. Specifically, this paper presents multi-GPU implementations of both a solution that exploits recomputation on the GPU and an optimized solution that is based on a ghost cell decoupling approach. Our multi-GPU implementations have been optimized using nonblocking communications, overlapping communications with computations, and applying ghost cell expansion to minimize communications. The fastest one reached a speedup of 78× using four GPUs on an InfiniBand network with respect to a parallel execution on a multicore CPU with six cores and two-way hyperthreading per core. Such performance, measured using a realistic problem, enabled the calculation of solutions not only in real time but also orders of magnitude faster than the simulated time.
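
    The paper's CUDA and MPI code is not reproduced here; as a rough sketch of the ghost-cell (halo) exchange and ghost-cell-expansion ideas it describes, assuming a shared-memory row decomposition and a stand-in stencil instead of the Roe scheme (all names are hypothetical), one might write:

        # Illustrative sketch of halo exchange with ghost-cell expansion on a row
        # decomposition (hypothetical names; the paper's GPU/MPI code is not shown).
        import numpy as np

        def exchange_halos(blocks, g):
            """Copy g boundary rows between neighbouring sub-domains (a shared-memory
            stand-in for the non-blocking GPU-to-GPU transfers described above)."""
            for lo, hi in zip(blocks[:-1], blocks[1:]):
                hi[:g] = lo[-2 * g:-g]   # hi's top halo <- lo's bottom interior rows
                lo[-g:] = hi[g:2 * g]    # lo's bottom halo <- hi's top interior rows

        def stencil_step(u):
            """Stand-in interior update (the paper uses a first-order Roe scheme)."""
            v = u.copy()
            v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                    u[1:-1, :-2] + u[1:-1, 2:])
            return v

        g = 2                            # halo width: g stencil steps per exchange
        grid = np.random.rand(64, 32)
        blocks = [np.vstack([np.zeros((g, 32)), part, np.zeros((g, 32))])
                  for part in np.array_split(grid, 4, axis=0)]
        for step in range(10):
            if step % g == 0:            # wider halos mean fewer, larger exchanges
                exchange_halos(blocks, g)
            blocks = [stencil_step(b) for b in blocks]

    Widening the halo trades a small amount of redundant computation near sub-domain borders for fewer exchanges, which is the same trade-off the ghost cell expansion optimization exploits to reduce communication between GPUs.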