Search CORE

188 research outputs found

A parallel Heap-Cell Method for Eikonal equations

Author: Chacon Adam
Vladimirsky Alexander
Publication venue
Publication date: 01/10/2014
Field of study

Numerous applications of Eikonal equations prompted the development of many efficient numerical algorithms. The Heap-Cell Method (HCM) is a recent serial two-scale technique that has been shown to have advantages over other serial state-of-the-art solvers for a wide range of problems. This paper presents a parallelization of HCM for a shared memory architecture. The numerical experiments in

R^3

show that the parallel HCM exhibits good algorithmic behavior and scales well, resulting in a very fast and practical solver. We further explore the influence on performance and scaling of data precision, early termination criteria, and the hardware architecture. A shorter version of this manuscript (omitting these more detailed tests) has been submitted to SIAM Journal on Scientific Computing in 2012.Comment: (a minor update to address the reviewers' comments) 31 pages; 15 figures; this is an expanded version of a paper accepted by SIAM Journal on Scientific Computin

arXiv.org e-Print Archive

CiteSeerX

Massively Parallel Algorithm for Solving the Eikonal Equation on Multiple Accelerator Platforms

Author: Shrestha Anup
Publication venue: 'IUScholarWorks'
Publication date: 01/12/2016
Field of study

The research presented in this thesis investigates parallel implementations of the Fast Sweeping Method (FSM) for Graphics Processing Unit (GPU)-based computational plat forms and proposes a new parallel algorithm for distributed computing platforms with accelerators. Hardware accelerators such as GPUs and co-processors have emerged as general- purpose processors in today’s high performance computing (HPC) platforms, thereby increasing platforms’ performance capabilities. This trend has allowed greater parallelism and substantial acceleration of scientific simulation software. In order to leverage the power of new HPC platforms, scientific applications must be written in specific lower-level programming languages, which used to be platform specific. Newer programming models such as OpenACC simplifies implementation and assures portability of applications to run across GPUs from different vendors and multi-core processors. The distance field is a representation of a surface geometry or shape required by many algorithms within the areas of computer graphics, visualization, computational fluid dynamics and more. It can be calculated by solving the eikonal equation using the FSM. The parallel FSMs explored in this thesis have not been implemented on GPU platforms and do not scale to a large problem size. This thesis addresses this problem by designing a parallel algorithm that utilizes a domain decomposition strategy for multi-accelerated distributed platforms. The proposed algorithm applies first coarse grain parallelism using MPI to distribute subdomains across multiple nodes and then fine grain parallelism to optimize performance by utilizing accelerators. The results of the parallel implementations of FSM for GPU-based platforms showed speedup greater than 20× compared to the serial version for some problems and the newly developed parallel algorithm eliminates the limitation of current algorithms to solve large memory problems with comparable runtime efficiency

Boise State University - ScholarWorks

Doctor of Philosophy

Author: Fu Zhisong
Publication venue: University of Utah
Publication date: 01/12/2013
Field of study

dissertationPartial differential equations (PDEs) are widely used in science and engineering to model phenomena such as sound, heat, and electrostatics. In many practical science and engineering applications, the solutions of PDEs require the tessellation of computational domains into unstructured meshes and entail computationally expensive and time-consuming processes. Therefore, efficient and fast PDE solving techniques on unstructured meshes are important in these applications. Relative to CPUs, the faster growth curves in the speed and greater power efficiency of the SIMD streaming processors, such as GPUs, have gained them an increasingly important role in the high-performance computing area. Combining suitable parallel algorithms and these streaming processors, we can develop very efficient numerical solvers of PDEs. The contributions of this dissertation are twofold: proposal of two general strategies to design efficient PDE solvers on GPUs and the specific applications of these strategies to solve different types of PDEs. Specifically, this dissertation consists of four parts. First, we describe the general strategies, the domain decomposition strategy and the hybrid gathering strategy. Next, we introduce a parallel algorithm for solving the eikonal equation on fully unstructured meshes efficiently. Third, we present the algorithms and data structures necessary to move the entire FEM pipeline to the GPU. Fourth, we propose a parallel algorithm for solving the levelset equation on fully unstructured 2D or 3D meshes or manifolds. This algorithm combines a narrowband scheme with domain decomposition for efficient levelset equation solving

The University of Utah: J. Willard Marriott Digital Library