
Multi-stage high order semi-Lagrangian schemes for incompressible flows in Cartesian geometries

Author: Butcher, Celledoni, Chandrasekhar, Chorin, Courant, Dupont, Durran, Fromm, Guermond, Hirsch, Huang, Kim, Knorr, Lentine, Leonard, LeVeque, Liu, MacCormack, Nakamura, Oliveira, Qiu, Rayleigh, Robert, Shoucri, Sonnendrücker, Staniforth, Strang, Wu, Xiao, Zerroukat
Publication venue: 'Wiley'
Publication date: 12/04/2016

Efficient transport algorithms are essential to the numerical solution of incompressible fluid flow problems. Semi-Lagrangian methods are widely used in grid-based methods to achieve this aim. The accuracy of the interpolation strategy then determines the properties of the scheme. We introduce a simple multi-stage procedure that can easily be used to increase the order of accuracy of a code based on multi-linear interpolations. This approach is an extension of a corrective algorithm introduced by Dupont & Liu (2003, 2007). The multi-stage procedure can be easily implemented in existing parallel codes using a domain decomposition strategy, as the communication pattern is identical to that of the multi-linear scheme. We show how a combination of forward and backward error correction can provide a third-order accurate scheme, thus significantly reducing diffusive effects while retaining a non-dispersive leading error term.
Comment: 14 pages, 10 figures
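The corrective algorithm of Dupont & Liu referred to in this abstract is commonly known as back and forth error compensation and correction (BFECC). The following is a minimal 1D sketch of that basic idea only, not of the third-order multi-stage scheme the paper proposes. The periodic grid, the constant velocity expressed as a Courant number, and the function names advect_linear and bfecc_step are assumptions made for illustration.

```python
import math

def advect_linear(phi, c):
    """One semi-Lagrangian step with linear interpolation on a periodic grid.

    c is the Courant number u*dt/dx (assumed constant over the grid).
    """
    n = len(phi)
    out = []
    for i in range(n):
        x = i - c                 # departure point, in grid units
        j = math.floor(x)         # grid point to the left of the departure point
        w = x - j                 # interpolation weight in [0, 1)
        out.append((1.0 - w) * phi[j % n] + w * phi[(j + 1) % n])
    return out

def bfecc_step(phi, c):
    """Advect forward, advect back, and use the mismatch to correct the input."""
    forward = advect_linear(phi, c)
    backward = advect_linear(forward, -c)
    # Half of the round-trip mismatch estimates the error of a single step.
    corrected = [p + 0.5 * (p - b) for p, b in zip(phi, backward)]
    return advect_linear(corrected, c)
```

Both steps read only the two grid points nearest the departure point, which is why a correction of this kind keeps the communication pattern of the plain multi-linear scheme.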

arXiv.org e-Print Archive

CUDA Implementation of a Navier-Stokes Solver on Multi-GPU Desktop Platforms for Incompressible Flows

Author: Bleiweiss A., Buck I., Fan Z., Hennessy J. L., Li W., Liu Y., Molemaker J., Ryoo S., Schatz M. C., Stratton J. A., The MPI, Ufimtsev I.
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2009

Graphics processing units (GPUs), traditionally designed for graphics rendering, have emerged as massively parallel co-processors to the central processing unit (CPU). Small-footprint desktop supercomputers with hundreds of cores that can deliver teraflops of peak performance at the price of conventional workstations have been realized. A computational fluid dynamics (CFD) simulation capability with rapid computational turnaround time has the potential to transform engineering analysis and design optimization procedures. We describe the implementation of a Navier-Stokes solver for incompressible fluid flow using desktop platforms equipped with multiple GPUs. Specifically, NVIDIA's Compute Unified Device Architecture (CUDA) programming model is used to implement the discretized form of the governing equations. The projection algorithm used to solve the incompressible fluid flow equations is divided into distinct CUDA kernels, and a unique implementation that exploits the memory hierarchy of the CUDA programming model is suggested. Using a quad-GPU platform, we observe a speedup of two orders of magnitude relative to a serial CPU implementation. Our results demonstrate that multi-GPU desktops can serve as a cost-effective, small-footprint parallel computing platform to accelerate CFD simulations substantially.
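To illustrate how a projection step splits into separate stages, here is a small sketch in plain Python in which each function stands in for one kernel. It is not the authors' CUDA code: the periodic staggered grid, the damped Jacobi pressure solver, the absorption of the time step into the pressure variable, and all function names are assumptions chosen to keep the example short.

```python
import math

def divergence(u, v, h):
    """Stage 1: discrete divergence of face-centred velocities (periodic grid)."""
    n = len(u)
    return [[(u[(i + 1) % n][j] - u[i][j] + v[i][(j + 1) % n] - v[i][j]) / h
             for j in range(n)] for i in range(n)]

def solve_pressure(rhs, h, iters=200, omega=0.8):
    """Stage 2: damped Jacobi sweeps for the Poisson equation lap(p) = rhs."""
    n = len(rhs)
    p = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        new = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                nb = (p[(i + 1) % n][j] + p[(i - 1) % n][j]
                      + p[i][(j + 1) % n] + p[i][(j - 1) % n])
                jacobi = (nb - h * h * rhs[i][j]) / 4.0
                new[i][j] = (1.0 - omega) * p[i][j] + omega * jacobi
        p = new
    return p

def project(u, v, h, iters=200):
    """Stage 3: subtract the pressure gradient to obtain a divergence-free field."""
    n = len(u)
    p = solve_pressure(divergence(u, v, h), h, iters)
    u_new = [[u[i][j] - (p[i][j] - p[(i - 1) % n][j]) / h for j in range(n)]
             for i in range(n)]
    v_new = [[v[i][j] - (p[i][j] - p[i][(j - 1) % n]) / h for j in range(n)]
             for i in range(n)]
    return u_new, v_new
```

Every grid point in each stage depends only on values from the previous stage, so each stage could in principle map to one data-parallel kernel launch.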

Crossref

Boise State University - ScholarWorks

An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters

Author: Bolz J., Brandvik T., Buck I., Elsen E., Fan Z., Goodnight N., Griebel M., Gropp W., Göddeke D., Harris M.J., Hempel R., Intel, Kindratenko V., Krüger J., Liu Y., Owens J.D., Schive H., Showerman M., Simek V., Tölke J., Wan D.C., Zhao Y.
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2010

Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multi-GPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPUs) are now being augmented with multiple GPUs in each compute node to tackle large problems. The heterogeneous architecture of a multi-GPU cluster with a deep memory hierarchy creates unique challenges in developing scalable and efficient simulation codes. In this study, we pursue mixed MPI-CUDA implementations and investigate three strategies to probe the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communications with computations on the GPU. We sustain approximately 2.4 TeraFLOPS on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations.
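The overlap of communication with computation described in this abstract can be sketched, under heavy simplification, with a 1D domain decomposition in plain Python. A background thread stands in for non-blocking MPI messages and asynchronous GPU copies; the smoothing stencil, the chunk layout, and the names smooth_global and smooth_decomposed are assumptions for illustration and do not reproduce any of the three strategies studied by the authors.

```python
from concurrent.futures import ThreadPoolExecutor

def smooth_global(a):
    """Reference: one averaging sweep over the whole periodic array."""
    n = len(a)
    return [0.5 * (a[(i - 1) % n] + a[(i + 1) % n]) for i in range(n)]

def smooth_decomposed(a, parts):
    """Same sweep on `parts` subdomains, overlapping halo fetch with interior work.

    Assumes len(a) is divisible by parts and each chunk has at least 2 points.
    """
    size = len(a) // parts
    chunks = [a[r * size:(r + 1) * size] for r in range(parts)]

    def fetch_halos(r):
        # Stand-in for receiving boundary values from neighbouring ranks.
        return chunks[(r - 1) % parts][-1], chunks[(r + 1) % parts][0]

    out = []
    with ThreadPoolExecutor() as pool:
        for r in range(parts):
            c = chunks[r]
            pending = pool.submit(fetch_halos, r)   # start "communication"
            new = [0.0] * size
            for i in range(1, size - 1):            # interior needs no halo data
                new[i] = 0.5 * (c[i - 1] + c[i + 1])
            left, right = pending.result()          # wait only when halos are needed
            new[0] = 0.5 * (left + c[1])
            new[-1] = 0.5 * (c[-2] + right)
            out.extend(new)
    return out
```

The design point is that only the two edge points of each subdomain wait on neighbour data, so the bulk of the update can proceed while the exchange is in flight.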

Crossref

Boise State University - ScholarWorks

GPU-friendly marching cubes.

Author: Xie, Yongming
Publication date: 01/01/2008

Xie, Yongming. Thesis (M.Phil.), Chinese University of Hong Kong, 2008. Includes bibliographical references (leaves 77-85). Abstracts in English and Chinese.

Contents:
Abstract
Acknowledgement
Chapter 1: Introduction
  1.1 Isosurfaces
  1.2 Graphics Processing Unit
  1.3 Objective
  1.4 Contribution
  1.5 Thesis Organization
Chapter 2: Marching Cubes
  2.1 Introduction
  2.2 Marching Cubes Algorithm
  2.3 Triangulated Cube Configuration Table
  2.4 Summary
Chapter 3: Graphics Processing Unit
  3.1 Introduction
  3.2 History of Graphics Processing Unit
    3.2.1 First Generation GPU
    3.2.2 Second Generation GPU
    3.2.3 Third Generation GPU
    3.2.4 Fourth Generation GPU
  3.3 The Graphics Pipelining
    3.3.1 Standard Graphics Pipeline
    3.3.2 Programmable Graphics Pipeline
    3.3.3 Vertex Processors
    3.3.4 Fragment Processors
    3.3.5 Frame Buffer Operations
  3.4 GPU CPU Analogy
    3.4.1 Memory Architecture
    3.4.2 Processing Model
    3.4.3 Limitation of GPU
    3.4.4 Input and Output
    3.4.5 Data Readback
    3.4.6 Framebuffer
  3.5 Summary
Chapter 4: Volume Rendering
  4.1 Introduction
  4.2 History of Volume Rendering
  4.3 Hardware Accelerated Volume Rendering
    4.3.1 Hardware Acceleration Volume Rendering Methods
    4.3.2 Proxy Geometry
    4.3.3 Object-Aligned Slicing
    4.3.4 View-Aligned Slicing
  4.4 Summary
Chapter 5: GPU-Friendly Marching Cubes
  5.1 Introduction
  5.2 Previous Work
  5.3 Traditional Method
    5.3.1 Scalar Volume Data
    5.3.2 Isosurface Extraction
    5.3.3 Flow Chart
    5.3.4 Transparent Isosurfaces
  5.4 Our Method
    5.4.1 Cell Selection
    5.4.2 Vertex Labeling
    5.4.3 Cell Indexing
    5.4.4 Interpolation
  5.5 Rendering Translucent Isosurfaces
  5.6 Implementation and Results
  5.7 Summary
Chapter 6: Conclusion
Bibliography

CUHK Digital Repository


General Purpose Programming on Modern Graphics Hardware

Author: Fleming Robert
Publication venue: 'University of North Texas Libraries'
Publication date: 01/05/2008

I start with a brief introduction to the graphics processing unit (GPU) and to general-purpose computation on modern graphics hardware (GPGPU). Next, I explore the motivations for GPGPU programming and the capabilities of modern GPUs, including their advantages and disadvantages. I also give the background required for further exploring GPU programming, including the terminology used and the resources available. Finally, I include a comprehensive survey of previous and current GPGPU work, and end with a look at the future of GPU programming.

UNT Digital Library

Real-time 3D fluid simulation on GPU with complex obstacles

Author: Liu X, Liu Y, Wu E
Publication venue: 12th Pacific Conference on Computer Graphics and Applications, Proceedings
Publication date: 01/01/2004

In this paper, we solve the 3D fluid dynamics problem in a complex environment by taking advantage of the parallelism and programmability of the GPU. Unlike other methods, ours innovates in two respects. First, more general bound…

Institute Of Software, Chinese Academy Of Sciences

Real-time 3D fluid simulation on GPU with complex obstacles

Author: 刘学慧, 吴恩华, 柳有权
Publication date: 01/01/2006

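We solve three-dimensional flow problems in complex scenes on the GPU (graphics processing unit), fully exploiting the GPU's parallelism to accelerate the computation. Unlike previous methods, this method handles boundary conditions more generally. First, by generating solid clipped cross-sections in image space to build an obstacle information map for the whole scene, the algorithm makes the fluid computation independent of the geometric complexity of the scene; each voxel is classified and, together with the boundary conditions, correction factors derived from the obstacles are used to modify the corresponding values. In addition, a more compact data format is adopted to make full use of the hardware's parallelism. By packing all scalar operations into the four color channels of a texel and combining this with flat (tiled) 3D textures, the number of rendering passes required to compute the 3D flow field is reduced.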
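One way to picture the compact data layout is as an index mapping from a voxel to a texel in a tiled 2D texture, with four consecutive depth slices sharing one tile through the four color channels. The sketch below is an assumed layout for illustration only; the abstract does not specify how the authors arrange tiles or channels, and the function names and the tiles_per_row parameter are hypothetical.

```python
def voxel_to_texel(x, y, z, width, height, tiles_per_row):
    """Map voxel (x, y, z) to (u, v, channel) in a tiled 2D RGBA texture.

    Assumed layout: slices z = 4k .. 4k+3 share tile k, one slice per channel.
    """
    layer, channel = divmod(z, 4)
    tile_col = layer % tiles_per_row
    tile_row = layer // tiles_per_row
    return tile_col * width + x, tile_row * height + y, channel

def texel_to_voxel(u, v, channel, width, height, tiles_per_row):
    """Inverse mapping: recover the voxel from a texel and a channel index."""
    tile_col, x = divmod(u, width)
    tile_row, y = divmod(v, height)
    z = 4 * (tile_row * tiles_per_row + tile_col) + channel
    return x, y, z
```

With such a layout, a single pass over the 2D texture touches four depth slices at once, which is the sense in which channel packing can reduce the number of rendering passes.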

Institute Of Software, Chinese Academy Of Sciences

Simulación de fluidos en aplicaciones de tiempo real

Author: Fages de la Canal Ramiro
Publication date: 24/05/2018

This thesis addresses real-time fluid simulation on the GPU, focusing on the modeling of smoke. It begins by presenting the related theoretical concepts, starting from the Navier-Stokes equations, and ends with the on-screen visualization of the results. The simulation and visualization are implemented in two dimensions on the GPU, using the DirectCompute and Direct3D APIs. Thesis digitized with the collaboration of the Library of the Facultad de Informática.