
    Sparse approximate inverse preconditioners on high performance GPU platforms

    Simulation with models based on partial differential equations often requires the solution of (sequences of) large and sparse algebraic linear systems. In multidimensional domains, preconditioned Krylov iterative solvers are often appropriate for these duties. Therefore, the search for efficient preconditioners for Krylov subspace methods is a crucial theme. Recent developments, especially in computing hardware, have renewed the interest in approximate inverse preconditioners in factorized form, because their application during the solution process can be more efficient. We present here some experiences focused on the approximate inverse preconditioners proposed by Benzi and Tůma in 1996 and the sparsification and inversion proposed by van Duin in 1999. Computational costs, reorderings and implementation issues are considered both on conventional and innovative computing architectures like Graphics Processing Units (GPUs).

    Efficient Long-Term Simulation of the Heat Equation with Application in Geothermal Energy Storage

    Long-term evolutions of parabolic partial differential equations, such as the heat equation, are of interest in many applications. There are several numerical solvers marking the state of the art in diverse scientific fields that may be used with benefit for the numerical simulation of such long-term scenarios. We show how to adapt some of the currently most efficient numerical approaches for solving the fundamental problem of long-term linear heat evolution with internal and external boundary conditions as well as source terms. Such long-term simulations are required for the optimal dimensioning of geothermal energy storages and their profitability assessment, for which we provide a comprehensive analytical and numerical model. Implicit methods are usually considered the best choice for resolving long-term simulations of linear parabolic problems; however, in practice the efficiency of such schemes in terms of the combination of computational load and obtained accuracy may be a delicate issue, as it depends very much on the properties of the underlying model. For example, one of the challenges in long-term simulation may arise from the presence of time-dependent boundary conditions, as in our application. In order to provide a simulation that is both computationally efficient and sufficiently accurate, we give a thorough discussion of the various numerical solvers along with many technical details and our own adaptations. In our investigation, we focus on two largely competitive approaches for our application, namely the fast explicit diffusion method originating in image processing and an adaptation of the Krylov subspace model order reduction method. We validate our numerical findings via several experiments using synthetic and real-world data. We show that we can obtain fast and accurate long-term simulations of typical geothermal energy storage facilities. We conjecture that our techniques can be highly useful for tackling long-term heat evolution in many applications.

    Resilience for Asynchronous Iterative Methods for Sparse Linear Systems

    Large-scale simulations are used in a variety of application areas in science and engineering to help forward the progress of innovation. Many spend the vast majority of their computational time attempting to solve large systems of linear equations, typically arising from discretizations of partial differential equations that are used to mathematically model various phenomena. The algorithms used to solve these problems are typically iterative in nature, and making efficient use of computational time on High Performance Computing (HPC) clusters involves constantly improving these iterative algorithms. Future HPC platforms are expected to encounter three main problem areas: scalability of code, reliability of hardware, and energy efficiency of the platform. The HPC resources that are expected to run the large programs are planned to consist of billions of processing units that come from more traditional multicore processors as well as a variety of different hardware accelerators. This growth in parallelism leads to the presence of all three problems. Previously, work on algorithm development has focused primarily on creating fault tolerance mechanisms for traditional iterative solvers. Recent work has begun to revisit using asynchronous methods for solving large-scale applications, and this dissertation presents research into fault tolerance for fine-grained methods that are asynchronous in nature. Classical convergence results for asynchronous methods are revisited and modified to account for the possible occurrence of a fault, and a variety of techniques for recovery from the effects of a fault are proposed. Examples of how these techniques can be used are shown for various algorithms, including an analysis of a fine-grained algorithm for computing incomplete factorizations. Lastly, numerous modeling and simulation tools for the further construction of iterative algorithms for HPC applications are developed, including numerical models for simulating faults and a simulation framework that can be used to extrapolate the performance of algorithms towards future HPC systems.

    Schnelle Löser für Partielle Differentialgleichungen

    The workshop Schnelle Löser für partielle Differentialgleichungen (Fast Solvers for Partial Differential Equations), organised by Randolph E. Bank (La Jolla), Wolfgang Hackbusch (Leipzig), and Gabriel Wittum (Frankfurt am Main), was held May 22nd–May 28th, 2011. The meeting was well attended, with 54 participants from 7 countries and 3 continents, and brought together researchers with various backgrounds.

    Parallel computation techniques for virtual acoustics and physical modelling synthesis

    The numerical simulation of large-scale virtual acoustics and physical modelling synthesis is a computationally expensive process. Time stepping methods, such as finite difference time domain, can be used to simulate wave behaviour in models of three-dimensional room acoustics and virtual instruments. In the absence of any form of simplifying assumptions, and at high audio sample rates, this can lead to simulations that require many hours of computation on a standard Central Processing Unit (CPU). In recent years the video game industry has driven the development of Graphics Processing Units (GPUs) that are now capable of multi-teraflop performance using highly parallel architectures. Whilst these devices are primarily designed for graphics calculations, they can also be used for general purpose computing. This thesis explores the use of such hardware to accelerate simulations of three-dimensional acoustic wave propagation, and embedded systems that create physical models for the synthesis of sound. Test case simulations of virtual acoustics are used to compare the performance of workstation CPUs to that of Nvidia's Tesla GPU hardware. Compared with representative multicore CPU benchmarks, such simulations can be accelerated by a factor on the order of 5x for single precision and 3x for double precision floating-point arithmetic. Optimisation strategies are examined for maximising GPU performance when using single devices, as well as for multiple device codes that can compute simulations using billions of grid points. This allows the simulation of room models of several thousand cubic metres at audio rates such as 44.1 kHz, all within a useable time scale. The performance of alternative finite difference schemes is explored, as well as strategies for the efficient implementation of boundary conditions. Creating physical models of acoustic instruments requires embedded systems that often rely on sparse linear algebra operations. The performance efficiency of various sparse matrix storage formats is detailed in terms of the fundamental operations that are required to compute complex models, with an optimised storage system achieving substantial performance gains over more generalised formats. An integrated instrument model of the timpani drum is used to demonstrate the performance gains that are possible using the optimisation strategies developed through this thesis.