4,993 research outputs found

    Enable High-resolution, Real-time Ensemble Simulation and Data Assimilation of Flood Inundation using Distributed GPU Parallelization

    Numerical modeling of the intensity and evolution of flood events is affected by multiple sources of uncertainty, such as precipitation and land surface conditions. To quantify and curb these uncertainties, an ensemble-based simulation and data assimilation model for pluvial flood inundation is constructed. The shallow water equation is decoupled in the x and y directions, and the inertial form of the Saint-Venant equation is chosen to realize fast computation. The probability distribution of the input and output factors is described using Monte Carlo samples. Subsequently, a particle filter is incorporated to enable the assimilation of hydrological observations and improve prediction accuracy. To achieve high-resolution, real-time ensemble simulation, heterogeneous computing technologies based on CUDA (compute unified device architecture) and a distributed-storage multi-GPU (graphics processing unit) system are used. Multiple optimization techniques are employed to ensure the parallel efficiency and scalability of the simulation program. Taking an urban area of Fuzhou, China as an example, a model with a 3-m spatial resolution and 4.0 million units is constructed, and 8 Tesla P100 GPUs are used for the parallel calculation of 96 model instances. Under these settings, the ensemble simulation of a 1-hour hydraulic process takes 2.0 minutes, an estimated 2680-fold speedup compared with a single-thread run on a CPU. The calculation results indicate that the particle filter method effectively constrains simulation uncertainty while providing confidence intervals for key hydrological elements such as streamflow, submerged area, and submerged water depth. The presented approaches show promising capabilities in handling the uncertainties in flood modeling as well as enhancing prediction efficiency.
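The assimilation step mentioned in this abstract can be illustrated with a minimal particle filter sketch. It is not the authors' implementation: the Gaussian observation error, the systematic resampling scheme, and every function name here are assumptions chosen for illustration, and each "particle" stands for one ensemble member's simulated water depth at a gauge.

```python
import math
import random


def gaussian_likelihood(simulated, observed, sigma):
    """Unnormalized Gaussian likelihood of an observation given a simulated value."""
    return math.exp(-0.5 * ((observed - simulated) / sigma) ** 2)


def update_weights(weights, simulated_depths, observed_depth, sigma):
    """Multiply each particle weight by its likelihood, then normalize to sum to 1."""
    raw = [w * gaussian_likelihood(s, observed_depth, sigma)
           for w, s in zip(weights, simulated_depths)]
    total = sum(raw)
    if total == 0.0:
        # Every particle is far from the observation: fall back to uniform weights.
        return [1.0 / len(weights)] * len(weights)
    return [r / total for r in raw]


def systematic_resample(weights, rng):
    """Draw particle indices with systematic resampling (assumed scheme)."""
    n = len(weights)
    start = rng.random() / n  # single random offset in [0, 1/n)
    indices = []
    cumulative = weights[0]
    i = 0
    for k in range(n):
        position = start + k / n
        # Advance to the particle whose cumulative weight covers this position.
        while position > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


def weighted_mean(weights, values):
    """Posterior mean estimate from weighted particles."""
    return sum(w * v for w, v in zip(weights, values))
```

In use, each ensemble member would be advanced by the hydraulic model, `update_weights` would be called when a gauge observation arrives, and `systematic_resample` would select which members to duplicate or drop before the next forecast window.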

    Paraiso : An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations

    We propose Paraiso, a domain-specific language embedded in the functional programming language Haskell, for automated tuning of explicit solvers of partial differential equations (PDEs) on GPUs as well as multicore CPUs. In Paraiso, one can describe PDE-solving algorithms succinctly using tensor-equation notation. Hydrodynamic properties, interpolation methods and other building blocks are described in abstract, modular, reusable and combinable forms, which lets us generate versatile solvers from a small set of Paraiso source code. We demonstrate Paraiso by implementing a compressive hydrodynamics solver. A single source file of fewer than 500 lines can be used to generate solvers of arbitrary dimensions, for both multicore CPUs and GPUs. We demonstrate both manual annotation-based tuning and evolutionary-computing-based automated tuning of the program. Comment: 52 pages, 14 figures, accepted for publication in Computational Science and Discovery

    High-performance tsunami modelling with modern GPU technology

    PhD Thesis. Earthquake-induced tsunamis commonly propagate in the deep ocean as long waves and develop into sharp-fronted surges moving rapidly coastward, which may be effectively simulated by hydrodynamic models solving the nonlinear shallow water equations (SWEs). Tsunamis can cause substantial economic and human losses, which could be mitigated through early warning systems given efficient and accurate modelling. Most existing tsunami models require long simulation times for real-world applications. This thesis presents a graphics processing unit (GPU) accelerated finite volume hydrodynamic model using the compute unified device architecture (CUDA) for computationally efficient tsunami simulations. Compared with a standard PC, the model is able to reduce run-time by a factor of more than 40. The validated model is used to reproduce the 2011 Japan tsunami. Two source models were tested, one based on tsunami waveform inversion and another using deep-ocean tsunameters. Vertical sea surface displacement is computed by the Okada model, assuming instantaneous sea-floor deformation. Both source models can reproduce the wave propagation at offshore and nearshore gauges, but the tsunameter-based model better simulates the first wave amplitude. The effects of grid resolution (from 450 m to 3600 m), slope limiters, and numerical accuracy are also investigated for the simulation of the 2011 Japan tsunami. Grid resolutions of 1-2 km perform well with a proper limiter; the Sweby limiter is optimal for coarser resolutions, recovers wave peaks better than minmod, and is more numerically stable than Superbee. One hour of tsunami propagation can be predicted about 50 times faster on a regular low-cost PC-hosted GPU than on a single CPU. For 450 m resolution on a larger-memory server-hosted GPU, the speedup rises to ~70 times.
    Finally, two adaptive mesh refinement (AMR) techniques, simplified dynamic adaptive grids on CPU and a static adaptive grid on GPU, are introduced to provide multi-scale simulations. Both can reduce run-time by ~3 times while maintaining acceptable accuracy. The proposed computationally efficient tsunami model is expected to provide a new practical tool for tsunami modelling for different purposes, including real-time warning, evacuation planning, risk management and city planning.
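The three slope limiters compared in this abstract are commonly written as functions of the ratio r of consecutive solution differences. The sketch below uses the standard textbook forms; the Sweby parameter `beta = 1.5` and the helper `limited_slope` are illustrative assumptions, since the abstract does not state the values or code used in the thesis.

```python
def minmod(r):
    """Minmod limiter: the most diffusive of the three."""
    return max(0.0, min(1.0, r))


def superbee(r):
    """Superbee limiter: the most compressive of the three."""
    return max(0.0, min(2.0 * r, 1.0), min(r, 2.0))


def sweby(r, beta=1.5):
    """Sweby limiter; beta = 1 recovers minmod and beta = 2 recovers Superbee.

    The default beta = 1.5 is an assumed illustrative value.
    """
    return max(0.0, min(beta * r, 1.0), min(r, beta))


def limited_slope(left, center, right, limiter):
    """Limited difference for a cell, from three neighbouring cell averages.

    Hypothetical helper: r is the backward difference over the forward one.
    """
    forward = right - center
    backward = center - left
    if forward == 0.0:
        return 0.0
    return limiter(backward / forward) * forward
```

Because the Sweby family interpolates between minmod and Superbee, it is a plausible reading of the reported result that an intermediate `beta` recovers peaks better than minmod while staying more stable than Superbee.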

    An efficient GPU implementation for a faster simulation of unsteady bed-load transport

    Computational tools may help engineers assess sediment transport during decision-making processes. The main requirements are that the numerical results be accurate and that the simulation models be fast. The present work is based on the 2D shallow water equations in combination with the 2D Exner equation. The accuracy of the resulting numerical model was already discussed in previous work. Regarding the speed of the computation, the Exner equation slows down the already costly 2D shallow water model, as the number of variables to solve is increased and the numerical stability condition is more restrictive. In order to reduce the computational effort required for simulating realistic scenarios, the authors have exploited Graphics Processing Units in combination with non-trivial optimization procedures. The reduction in computing cost obtained with the graphics hardware is compared against single-core (sequential) and multi-core (parallel) CPU implementations in two unsteady cases.

    Simulating water-entry/exit problems using Eulerian-Lagrangian and fully-Eulerian fictitious domain methods within the open-source IBAMR library

    In this paper we employ two implementations of the fictitious domain (FD) method to simulate water-entry and water-exit problems and demonstrate their ability to simulate practical marine engineering problems. In FD methods, the fluid momentum equation is extended within the solid domain using an additional body force that constrains the structure velocity to be that of a rigid body. Using this formulation, a single set of equations is solved over the entire computational domain. The constraint force is calculated in two distinct ways: one using an Eulerian-Lagrangian framework of the immersed boundary (IB) method and another using a fully-Eulerian approach of the Brinkman penalization (BP) method. Both FSI strategies use the same multiphase flow algorithm that solves the discrete incompressible Navier-Stokes system in conservative form. A consistent transport scheme is employed to advect mass and momentum in the domain, which ensures numerical stability of the high-density-ratio multiphase flows involved in practical marine engineering applications. Example cases of a free-falling wedge (straight and inclined) and cylinder are simulated, and the numerical results are compared against benchmark cases in the literature. Comment: The current paper builds on arXiv:1901.07892 and re-explains some parts of it for the reader's convenience

    Zynq SoC based acceleration of the lattice Boltzmann method

    A cerebral aneurysm is a life-threatening condition. It is a weakness in a blood vessel that may enlarge and bleed into the surrounding area. In order to understand the surrounding environmental conditions during interventions or surgical procedures, a simulation of blood flow in cerebral arteries is needed. One effective simulation approach is the lattice Boltzmann (LB) method. Due to the computational complexity of the algorithm, the simulation is usually performed on high-performance computers. In this paper, efficient hardware architectures of the LB method on a Zynq system-on-chip (SoC) are designed and implemented. The proposed architectures were first simulated in the Vivado HLS environment and later implemented on a ZedBoard using the software-defined SoC (SDSoC) development environment. In addition, a set of evaluations of different hardware architectures of the LB implementation is discussed. The experimental results show that the proposed implementation accelerates processing by a factor of 52 compared to a dual-core ARM processor-based software implementation.
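For readers unfamiliar with the kernel that such hardware accelerates, the following is a minimal software sketch of a single-node lattice Boltzmann collision step. The D2Q9 lattice and the BGK single-relaxation-time collision are assumptions made for illustration; the abstract does not state which lattice or collision model the paper implements, and all names below are hypothetical.

```python
# Standard D2Q9 lattice velocities and weights.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(VELOCITIES, WEIGHTS):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def macroscopic(f):
    """Recover density and velocity as moments of the distribution."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy


def bgk_collide(f, tau):
    """Relax the distribution toward equilibrium with relaxation time tau."""
    rho, ux, uy = macroscopic(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]
```

The collision is local to each lattice node and identical across nodes, which is one reason the method maps well to parallel hardware; a full solver would follow it with a streaming step that shifts each population to its neighbouring node.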