
    The Mont-Blanc Project: First Phase Successfully Finished

    Running from October 2011 to June 2015, the European project Mont-Blanc aimed to develop an approach to Exascale computing based on power-efficient embedded technology. The main goals of the project were to i) build an HPC prototype using currently available energy-efficient embedded technology, ii) design a next-generation system to overcome the limitations of that prototype, and iii) port a set of representative Exascale applications to the system. This article summarises the contributions from the Leibniz Supercomputing Centre (LRZ) and the Juelich Supercomputing Centre (JSC), Germany, to the Mont-Blanc project. (5 pages, 3 figures)

    Integrating an N-Body Problem with SDC and PFASST

    Vortex methods for the Navier–Stokes equations are based on a Lagrangian particle discretization, which reduces the governing equations to a first-order initial value system of ordinary differential equations for the positions and vorticities of N particles. In this paper, the accuracy of solving this system by time-serial spectral deferred corrections (SDC) as well as by the time-parallel Parallel Full Approximation Scheme in Space and Time (PFASST) is investigated. PFASST is based on intertwining SDC iterations with differing resolution in a manner similar to the Parareal algorithm and uses a Full Approximation Scheme (FAS) correction to improve the accuracy of coarser SDC iterations. It is demonstrated that SDC and PFASST can generate highly accurate solutions, and the performance in terms of function evaluations required for a given accuracy is analyzed and compared to a standard Runge–Kutta method.
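
    To make the SDC building block concrete, here is a toy explicit-SDC time step for a generic ODE system y' = f(t, y) — a minimal numpy sketch under assumed node and sweep choices, not the paper's implementation:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def sdc_step(f, t0, y0, dt, nodes, n_sweeps=4):
    """One explicit (forward-Euler based) SDC time step for y' = f(t, y).

    `nodes` are collocation nodes on [0, 1]; three Lobatto nodes are
    assumed in the usage below. Toy sketch, not the paper's code.
    """
    nodes = np.asarray(nodes, dtype=float)
    M = len(nodes)
    t = t0 + dt * nodes
    Y = np.tile(np.asarray(y0, dtype=float), (M, 1))   # provisional solution
    F = np.array([f(t[m], Y[m]) for m in range(M)])

    # S[m, j] = integral of the j-th Lagrange basis poly over [nodes[m], nodes[m+1]]
    S = np.zeros((M - 1, M))
    for j in range(M):
        ci = P.polyint(P.polyfit(nodes, np.eye(M)[j], M - 1))
        S[:, j] = P.polyval(nodes[1:], ci) - P.polyval(nodes[:-1], ci)

    for _ in range(n_sweeps):          # each sweep raises the formal order by one
        Fold = F.copy()
        for m in range(M - 1):
            dtau = dt * (nodes[m + 1] - nodes[m])
            # forward-Euler correction plus quadrature of the previous iterate
            Y[m + 1] = Y[m] + dtau * (F[m] - Fold[m]) + dt * S[m] @ Fold
            F[m + 1] = f(t[m + 1], Y[m + 1])
    return Y[-1]

# usage: a harmonic oscillator standing in for the N-body ODE system
f = lambda t, y: np.array([y[1], -y[0]])
y = np.array([1.0, 0.0])
for _ in range(50):
    y = sdc_step(f, 0.0, y, 0.02, nodes=[0.0, 0.5, 1.0])
print(y)   # should be close to [cos(1), -sin(1)]
```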

    A space-time parallel solver for the three-dimensional heat equation

    The paper presents a combination of the time-parallel “parallel full approximation scheme in space and time” (PFASST) with a parallel multigrid method (PMG) in space, resulting in a mesh-based solver for the three-dimensional heat equation with a uniquely high degree of efficient concurrency. Parallel scaling tests are reported on the Cray XE6 machine “Monte Rosa” on up to 16,384 cores and on the IBM Blue Gene/Q system “JUQUEEN” on up to 65,536 cores. The efficacy of the combined spatial and temporal parallelization is shown by demonstrating that using PFASST in addition to PMG significantly extends the strong-scaling limit. Implications of using spatial coarsening strategies in PFASST’s multi-level hierarchy in large-scale parallel simulations are discussed.
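
    A common way to organize such a space-time solver is to split the global communicator into a spatial communicator per time slice (for the PMG solve) and a temporal communicator per subdomain (for the PFASST iteration). The mpi4py sketch below is an illustrative assumption about that layout, not the code used in the paper:

```python
# Hypothetical mpi4py sketch of a space-time communicator layout;
# rank counts and variable names are assumptions.
from mpi4py import MPI

world = MPI.COMM_WORLD
n_time = 4                               # assumed number of parallel time slices
assert world.size % n_time == 0
n_space = world.size // n_time           # ranks per time slice

time_slice = world.rank // n_space       # which time slice this rank belongs to
space_rank = world.rank % n_space        # position inside the spatial grid

comm_space = world.Split(time_slice, space_rank)   # PMG solves/reduces here
comm_time = world.Split(space_rank, time_slice)    # PFASST iterates here

# e.g. PMG computes a residual norm across comm_space, while PFASST passes
# the updated end-of-slice solution forward along comm_time:
#   if time_slice < n_time - 1: comm_time.Send(u_end, dest=time_slice + 1)
```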

    The gravitational billion body problem

    The increased availability of accelerator technology in modern supercomputers forces users to redesign their algorithms. These accelerators are specifically designed to offer huge amounts of parallel compute power. In this thesis I show how to harness the power of these parallel processors for astrophysical simulations. I start with an introduction that traces the development of astrophysical algorithms, and of the hardware they run on, from the 1960s to today. In the following scientific chapters I discuss the use of GPU accelerator technology for direct N-body methods and for more advanced hierarchical algorithms. These advanced algorithms are more complex to implement on large parallel architectures, but by redesigning them it is possible to take advantage of the GPU. The developed algorithms are applied to simulations of galaxy mergers to explain discrepancies in observational results. In the simulations we test different merger configurations and try to match the results with observational data. The final chapter shows how to scale the developed software to thousands of GPUs, as available in the Titan supercomputer. The algorithms developed and presented in this thesis allow astronomers to take advantage of new GPU technology and thereby run simulations that contain a thousand times more particles than was possible before.
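
    The starting point of such work is the direct N-body force sum, whose all-pairs structure is what maps so naturally onto GPUs. A toy numpy version (G = 1, softened; purely illustrative, not the thesis code):

```python
import numpy as np

def direct_nbody_acc(pos, mass, eps=1e-3):
    """All-pairs gravitational accelerations, O(N^2), with G = 1.

    Every pairwise interaction is independent, which is why this kernel
    parallelizes so well on GPUs. Toy numpy sketch, not the thesis code.
    """
    d = pos[None, :, :] - pos[:, None, :]          # (N, N, 3) separations
    r2 = (d ** 2).sum(axis=-1) + eps ** 2          # softened squared distances
    inv_r3 = r2 ** -1.5
    np.fill_diagonal(inv_r3, 0.0)                  # no self-interaction
    return (d * (mass[None, :] * inv_r3)[:, :, None]).sum(axis=1)

pos = np.random.randn(1024, 3)
acc = direct_nbody_acc(pos, np.ones(1024))
```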

    Fast Gravitational Approach for Rigid Point Set Registration with Ordinary Differential Equations

    This article introduces a new physics-based method for rigid point set alignment called Fast Gravitational Approach (FGA). In FGA, the source and target point sets are interpreted as rigid particle swarms with masses interacting in a globally multiply-linked manner while moving in a simulated gravitational force field. The optimal alignment is obtained by explicitly modeling the forces acting on the particles, as well as their velocities and displacements, with second-order ordinary differential equations of motion. Additional alignment cues (point-based or geometric features, and other boundary conditions) can be integrated into FGA through the particle masses. We propose a smooth-particle mass function for point mass initialization, which improves robustness to noise and structural discontinuities. To avoid the prohibitive quadratic complexity of all-to-all point interactions, we adapt a Barnes-Hut tree for accelerated force computation and achieve quasilinear computational complexity. We show that the new method class has characteristics not found in previous alignment methods, such as efficient handling of partial overlaps and inhomogeneous point sampling densities, and coping with large point clouds with reduced runtime compared to the state of the art. Experiments show that our method performs on par with or outperforms all compared competing non-deep-learning-based and general-purpose techniques (which do not assume the availability of training data or a scene prior) in resolving transformations for LiDAR data, and achieves state-of-the-art accuracy and speed when coping with different types of data disturbances. (18 pages, 18 figures, two tables)
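
    As a rough illustration of the dynamics described above (not the authors' formulation), one explicit integration step of a gravity-driven alignment might look like the sketch below; the paper replaces this naive all-pairs force sum with a Barnes-Hut tree and constrains the motion to stay rigid, and all names and constants here are assumptions:

```python
import numpy as np

def gravitational_alignment_step(src, vel, tgt, m_tgt,
                                 dt=0.05, eps=0.1, damping=0.9):
    """One explicit step of a gravity-driven alignment scheme: source points
    accelerate toward the target swarm, with softening and velocity damping
    (second-order dynamics). Toy all-pairs version; FGA itself accelerates
    this force sum with a Barnes-Hut tree and keeps the motion rigid.
    """
    d = tgt[None, :, :] - src[:, None, :]                         # (N, M, 3)
    r2 = (d ** 2).sum(axis=-1) + eps ** 2                         # softened
    force = (d * (m_tgt[None, :] * r2 ** -1.5)[:, :, None]).sum(axis=1)
    vel = damping * vel + dt * force                              # update velocity
    return src + dt * vel, vel                                    # update position
```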

    A Space and Bandwidth Efficient Multicore Algorithm for the Particle-in-Cell Method

    The Particle-in-Cell (PIC) method allows solving partial differential equations through simulation, with important applications in plasma physics. To simulate thousands of billions of particles on clusters of multicore machines, prior work has proposed hybrid algorithms that combine domain decomposition and particle decomposition with carefully optimized algorithms for handling the particles processed on each multicore socket. Regarding the multicore processing, existing algorithms either suffer from suboptimal execution time, due to sorting operations or the use of atomic instructions, or from suboptimal space usage. In this paper, we propose a novel parallel algorithm for two-dimensional PIC simulations on multicore hardware that features asymptotically optimal memory consumption and does not perform unnecessary accesses to main memory. In practice, our algorithm reaches 65% of the maximum bandwidth, and shows excellent scalability on the classical Landau damping and two-stream instability test cases.
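
    The bandwidth-critical kernel in PIC is the charge-deposition scatter, where nearby particles write to the same grid cells. A minimal serial cloud-in-cell version for a periodic 2D grid is sketched below (illustrative only; the paper's contribution is performing this step on multicore hardware without sorting, atomics, or extra buffers):

```python
import numpy as np

def deposit_charge_2d(xp, yp, q, nx, ny, dx, dy):
    """Cloud-in-cell (bilinear) charge deposition on a periodic 2D grid.
    Toy serial version; names and normalization are assumptions."""
    fx, fy = xp / dx, yp / dy
    i0, j0 = np.floor(fx).astype(int), np.floor(fy).astype(int)
    wx, wy = fx - i0, fy - j0                      # fractional offsets in the cell
    i0 %= nx; j0 %= ny                             # periodic wrap-around
    i1, j1 = (i0 + 1) % nx, (j0 + 1) % ny
    rho = np.zeros((nx, ny))
    # np.add.at accumulates correctly even when particles hit the same cell
    np.add.at(rho, (i0, j0), q * (1 - wx) * (1 - wy))
    np.add.at(rho, (i1, j0), q * wx * (1 - wy))
    np.add.at(rho, (i0, j1), q * (1 - wx) * wy)
    np.add.at(rho, (i1, j1), q * wx * wy)
    return rho / (dx * dy)

rho = deposit_charge_2d(np.random.rand(10000), np.random.rand(10000),
                        q=1.0, nx=64, ny=64, dx=1 / 64, dy=1 / 64)
```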

    Afterlive: A performant code for Vlasov-Hybrid simulations

    A parallelized implementation of the Vlasov-Hybrid method [Nunn, 1993] is presented. This method is a hybrid between a gridded Eulerian description and Lagrangian meta-particles. Unlike the Particle-in-Cell method [Dawson, 1983], which simply adds up the contributions of meta-particles, this method reconstructs the distribution function f in every time step for each species. The interpolation combines meta-particles with different weights in such a way that particles with large weight do not drown out particles that represent small contributions to the phase space density. These core properties allow the use of a much larger range of macro factors and can thus represent a much larger dynamic range in phase space density. The reconstructed phase space density f is used to calculate moments of the distribution function such as the charge density ρ. The charge density ρ is in turn the input to a spectral solver that calculates the self-consistent electrostatic field, which is used to update the particles for the next time step. Afterlive (A Fourier-based Tool in the Electrostatic limit for the Rapid Low-noise Integration of the Vlasov Equation) is fully parallelized using MPI and writes output using parallel HDF5. The input to the simulation is read from a JSON description that sets the initial particle distributions as well as the domain size and discretization constraints. The implementation presented here is intentionally limited to one spatial dimension and resolves one or three dimensions in velocity space. Additional spatial dimensions can be added in a straightforward way, but make runs computationally even more costly. (Accepted for publication in Computer Physics Communications)
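
    The spectral field solve is straightforward in one dimension: transform ρ, divide by k² to obtain the potential, and multiply by −ik for the field. A minimal periodic version follows (normalization and units are assumptions, not taken from the paper):

```python
import numpy as np

def electrostatic_field_1d(rho, L, eps0=1.0):
    """Solve d^2(phi)/dx^2 = -rho/eps0 on a periodic domain of length L
    and return E = -d(phi)/dx. Normalization and units are assumptions."""
    n = rho.size
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)     # angular wavenumbers
    rho_hat = np.fft.fft(rho)
    E_hat = np.zeros_like(rho_hat)
    nz = k != 0                                    # skip the mean (k = 0) mode
    # phi_hat = rho_hat / (eps0 k^2)  =>  E_hat = -i k phi_hat
    E_hat[nz] = -1j * rho_hat[nz] / (eps0 * k[nz])
    return np.fft.ifft(E_hat).real

x = np.linspace(0.0, 1.0, 128, endpoint=False)
E = electrostatic_field_1d(np.sin(2 * np.pi * x), L=1.0)
```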
