120 research outputs found

    ์™„ํ™”์ž…์ž์œ ์ฒด๋™์—ญํ•™์„ ์œ„ํ•œ GPU ๊ธฐ๋ฐ˜ ์ž…์ž ๋ถ„ํ• /๋ณ‘ํ•ฉ ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๊ฐœ๋ฐœ

    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์—๋„ˆ์ง€์‹œ์Šคํ…œ๊ณตํ•™๋ถ€, 2021.8. ๊น€์ˆ˜์ง„.์ตœ๊ทผ ์›์ž๋ ฅ ์•ˆ์ „ ๊ด€๋ จ ํ˜„์•ˆ๋“ค์€ ์—ด์ˆ˜๋ ฅ ํ˜„์ƒ ๋ฟ๋งŒ์ด ์•„๋‹Œ ํ•ต์—ฐ๋ฃŒ ์šฉ์œต, ๊ตฌ์กฐ, ์žฌ๋ฃŒ, ํ™”ํ•™๋ฐ˜์‘, ๋‹ค์ƒ ์œ ๋™ ๋“ฑ์„ ํฌํ•จํ•˜๋Š” ๋งค์šฐ ๋ณต์žกํ•œ ํ˜„์ƒ๋“ค๋กœ ์ด๋ฃจ์–ด์ง„๋‹ค. ์ „ํ†ต์ ์ธ ์›์ž๋กœ ์•ˆ์ „ ํ•ด์„์€ ์ฃผ๋กœ ์˜ค์ผ๋Ÿฌ๋ฆฌ์•ˆ ๊ฒฉ์ž ๊ธฐ๋ฐ˜์˜ ์ˆ˜์น˜ํ•ด์„ ๋ฐฉ๋ฒ•์— ๊ธฐ๋ฐ˜ํ•˜์ง€๋งŒ, ์ตœ๊ทผ์—๋Š” ์œ ์ฒด ์‹œ์Šคํ…œ์„ ์œ ํ•œ ๊ฐœ์˜ ์œ ์ฒด ์ž…์ž์˜ ์ง‘ํ•ฉ์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๋ผ๊ทธ๋ž‘์ง€์•ˆ ์ž…์ž ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•๋ก  ์—ญ์‹œ ํ™œ๋ฐœํ•˜๊ฒŒ ์—ฐ๊ตฌ๋˜๊ณ  ์žˆ๋‹ค. ๋Œ€ํ‘œ์ ์ธ ๋ผ๊ทธ๋ž‘์ง€์•ˆ ๊ธฐ๋ฐ˜ ํ•ด์„ ๊ธฐ๋ฒ•์ธ ์™„ํ™”์ž…์ž์œ ์ฒด๋™์—ญํ•™(Smoothed Particle Hydrodynamics : SPH)์€ ์ž…์ž๋ฅผ ์ง์ ‘ ์ถ”์ ํ•˜๋Š” ํŠน์„ฑ์œผ๋กœ ์ธํ•ด ์•ž์„œ ์–ธ๊ธ‰ํ•œ ๋ณต์žกํ•œ ๋ฌผ๋ฆฌ ํ˜„์ƒ๋“ค์ด ํฌํ•จํ•˜๋Š” ์ž์œ ํ‘œ๋ฉด์ด๋‚˜ ๋‹ค์ƒ ์œ ๋™ ๋“ฑ์˜ ์œ ๋™์ ์ธ ๊ณ„์‚ฐ ์˜์—ญ์„ ํ•ด์„ํ•˜๋Š” ๋ฐ์— ์šฉ์ดํ•˜๋‹ค. ์ž…์ž ๊ธฐ๋ฐ˜์˜ ์œ ์ฒด ํ•ด์„์—์„œ ๋†’์€ ํ•ด์ƒ๋„๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ ๋†’์€ ์ •ํ™•๋„์˜ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์žฅํ•˜์ง€๋งŒ, ์ด๋Š” ํ•ด์„ ์˜์—ญ ๋‚ด ์ž…์ž ์ˆ˜์˜ ์ฆ๊ฐ€์— ๋”ฐ๋ฅธ ๋†’์€ ๊ณ„์‚ฐ ๋ถ€ํ•˜๋ฅผ ์•ผ๊ธฐํ•œ๋‹ค. ํ˜„์กดํ•˜๋Š” ๋Œ€๋ถ€๋ถ„์˜ ์ž…์ž ๊ธฐ๋ฐ˜ ํ•ด์„ ์ฝ”๋“œ๋Š” ๊ณ„์‚ฐ ์˜์—ญ ์ „์ฒด์—์„œ ๋™์ผํ•œ ํฌ๊ธฐ์˜ ์ž…์ž๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋‹จ์ผ ํ•ด์ƒ๋„(Single-resolution) ๋ฐฉ์‹์„ ์ฑ„ํƒํ•˜๊ณ  ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฐฉ์‹์€ ๋‚œ๋ฅ˜ ํ•ด์„, ๋น„๋“ฑ/์‘์ถ• ํ•ด์„, ์ถฉ๊ฒฉํŒŒ ํ•ด์„ ๋“ฑ๊ณผ ๊ฐ™์ด ์œ ๋™ ์˜์—ญ์— ๋”ฐ๋ผ ๋‹ค๋ฅธ ์ˆ˜์ค€์˜ ์ž…์ž ํ•ด์ƒ๋„๋ฅผ ์š”๊ตฌํ•˜๋Š” ํ•ด์„์˜ ๊ฒฝ์šฐ ๋ถˆํ•„์š”ํ•œ ๊ณ„์‚ฐ ๋ถ€ํ•˜๊ฐ€ ํ˜•์„ฑ๋˜๊ฑฐ๋‚˜, ์˜คํžˆ๋ ค ๊ณ„์‚ฐ์˜ ์ •ํ™•๋„๋ฅผ ์ €ํ•˜์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ ์ด๋ฅผ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•ด ๊ณ„์‚ฐ ์˜์—ญ ๋‚ด์—์„œ ๊ตญ๋ถ€์ ์œผ๋กœ ์ž…์ž์˜ ํฌ๊ธฐ๋ฅผ ์กฐ์ ˆํ•  ์ˆ˜ ์žˆ๋Š” ๋‹ค์ค‘ ํ•ด์ƒ๋„(Multi-resolution) ํ•ด์„์˜ ๋„์ž…์ด ํ•„์š”ํ•˜๋‹ค. ์ด์— ๋”ฐ๋ผ ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” SPH ๊ธฐ๋ฒ•์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ์ž…์ž ๋ถ„ํ• /๋ณ‘ํ•ฉ ๋ฐฉ๋ฒ•๋ก (Adaptive Particle Refinement : APR)์„ ๊ฐœ๋ฐœํ•˜๊ณ , ๋ชจ๋ธ์˜ ๊ฐ€์†ํ™”๋ฅผ ์œ„ํ•ด ์ด๋ฅผ GPU ๋ณ‘๋ ฌ ๊ณ„์‚ฐ์— ์ ํ•ฉํ•œ ํ˜•ํƒœ๋กœ ๊ตฌํ˜„ํ•˜์˜€๋‹ค. APR ๋ฐฉ๋ฒ•๋ก ์˜ ๊ธฐ๋ณธ ๊ฐœ๋…์€ ๊ณ„์‚ฐ ์ค‘ ํŠน์ • ์กฐ๊ฑด์—์„œ ์ž…์ž๋ฅผ ๋ถ„ํ• ํ•˜๊ฑฐ๋‚˜ ๋ณ‘ํ•ฉํ•จ์œผ๋กœ์จ, ๊ณ„์‚ฐ ์˜์—ญ ๋‚ด์˜ ๊ตญ๋ถ€์ ์ธ ์˜์—ญ์—์„œ ์„œ๋กœ ๋‹ค๋ฅธ ํ•ด์ƒ๋„๋กœ ํ•ด์„์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์ž…์ž๋Š” ํŠน์ • ์กฐ๊ฑด(์ž…์ž์˜ ์œ„์น˜, ๋ถ€ํ”ผ, ์†๋„ ๊ตฌ๋ฐฐ ๋“ฑ)์—์„œ ์—ฌ๋Ÿฌ ๊ฐœ๋กœ ๋ถ„ํ• ๋˜๊ฑฐ๋‚˜, ๋˜๋Š” ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์ž…์ž๊ฐ€ ๋” ์ ์€ ์ˆ˜์˜ ์ž…์ž๋กœ ๋ณ‘ํ•ฉ๋˜๋Š” ๊ณผ์ •์„ ํ†ตํ•ด ๊ณ„์‚ฐ ๋‚ด์—์„œ ๋‹ค์–‘ํ•œ ์ž…์ž ํ•ด์ƒ๋„๋ฅผ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค. ํ•˜์ง€๋งŒ ๊ธฐ์กด ์—ฐ๊ตฌ์—์„œ ์‚ฌ์šฉ๋œ ๋ฐฉ์‹๋“ค์€ ์ž…์ž๋ฅผ ๋ณ‘ํ•ฉํ•˜๋Š” ๊ณผ์ •์—์„œ ๋ณ‘ํ•ฉ ์ž…์ž์˜ ์†๋„๊ฐ€ ๊ธฐ์กด ์ž…์ž๋“ค์˜ ์šด๋™๋Ÿ‰ ๋ณด์กด์‹๋งŒ์œผ๋กœ ๊ฒฐ์ •ํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฐฉ์‹์€ ์งˆ๋Ÿ‰๊ณผ ์šด๋™๋Ÿ‰์„ ์ž˜ ๋ณด์กดํ•˜์ง€๋งŒ ์ž…์ž์˜ ์šด๋™ ์—๋„ˆ์ง€๋ฅผ ๋ณด์กดํ•˜์ง€ ๋ชปํ•˜๊ธฐ ๋•Œ๋ฌธ์—, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์šด๋™ ์—๋„ˆ์ง€ ๋ณด์กด์„ ์œ„ํ•œ ์ƒˆ๋กœ์šด ๋ณ‘ํ•ฉ ๋ชจ๋ธ์ด ์ œ์‹œ๋˜์—ˆ๋‹ค. ๋˜ํ•œ, APR ๊ณผ์ •์—์„œ ์ž…์ž์˜ ์™„ํ™” ๊ฑฐ๋ฆฌ(Smoothing length)๋ฅผ ๋ณ€ํ™”์‹œํ‚ค๋Š” ๊ฒฝ์šฐ, ์„œ๋กœ ๋‹ค๋ฅธ ํฌ๊ธฐ์˜ ์ž…์ž๊ฐ€ ์ƒํ˜ธ์ž‘์šฉํ•˜๋Š” ํ•ด์ƒ๋„์˜ ๊ฒฝ๊ณ„์—์„œ ๊ณ„์‚ฐ์˜ ์ •ํ™•๋„๋ฅผ ๋–จ์–ดํŠธ๋ฆด ์ˆ˜ ์žˆ๋‹ค. ์ด๋ฅผ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•œ ์—ฐ์†์  ์™„ํ™” ๊ฑฐ๋ฆฌ ๋ณ€ํ™” ๋ชจ๋ธ ์—ญ์‹œ ์ œ์•ˆ๋˜์—ˆ๋‹ค. GPU ๋ณ‘๋ ฌ ๊ณ„์‚ฐ์˜ ํŠน์„ฑ์ƒ, ํ•˜๋‚˜์˜ ๋ฉ”๋ชจ๋ฆฌ์— ์—ฌ๋Ÿฌ ์Šค๋ ˆ๋“œ๊ฐ€ ๋™์‹œ์— ์ ‘๊ทผํ•˜์—ฌ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•  ๊ฒฝ์šฐ ์Šค๋ ˆ๋“œ ๊ฐ„ ์—ฐ์‚ฐ์˜ ์ˆœ์„œ๊ฐ€ ๊ผฌ์—ฌ ๊ธฐ๋Œ€ํ•˜๋˜ ๊ฒƒ๊ณผ ๋‹ค๋ฅธ ๊ฒฐ๊ณผ๋ฅผ ๋„์ถœํ•˜๋Š” ๊ฒฝ์Ÿ ์กฐ๊ฑด์ด ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค. 
APR ๋ฐฉ๋ฒ•๋ก ์„ ๋ณ‘๋ ฌํ™” ํ•  ๊ฒฝ์šฐ ์ƒˆ๋กญ๊ฒŒ ์ƒ์„ฑ๋˜๋Š” ์ž…์ž๋“ค์„ ์ €์žฅํ•˜๋Š” ๊ณผ์ •์—์„œ ์ด๋Ÿฌํ•œ ๊ฒฝ์Ÿ ์กฐ๊ฑด์ด ๋ฐœ์ƒํ•˜์—ฌ ์ƒ์„ฑ ์ž…์ž์˜ ๋ฉ”๋ชจ๋ฆฌ ์ฃผ์†Œ๊ฐ€ ์ถฉ๋Œํ•˜๋Š” ํ˜„์ƒ์ด ๋ฐœ์ƒํ•œ๋‹ค. ์ด๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด CUDA C ์–ธ์–ด๊ฐ€ ์ œ๊ณตํ•˜๋Š” ์›์ž ์—ฐ์‚ฐ์„ ์ด์šฉํ•˜์—ฌ ์Šค๋ ˆ๋“œ ๊ฐ„ ๊ณ„์‚ฐ์˜ ๊ฐ„์„ญ์„ ๋ฐฉ์ง€ํ•  ์ˆ˜ ์žˆ๋Š” ์ž ๊ธˆ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ตฌํ˜„ํ•˜์˜€๊ณ , ๊ณผ๋„ํ•œ ์ง๋ ฌํ™”๋กœ ์ธํ•œ ๊ณ„์‚ฐ ์†๋„ ์ €ํ•˜๋ฅผ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ตœ์ ํ™”ํ•˜์˜€๋‹ค. ์ ์šฉ๋œ APR ๋ฐฉ๋ฒ•๋ก ์„ ๊ฒ€์ฆํ•˜๊ณ  ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด ๋‹ค์–‘ํ•œ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์— ๋Œ€ํ•œ ๊ฒ€์ฆ ํ•ด์„์ด ์ˆ˜ํ–‰๋˜์—ˆ๋‹ค. ์ •์ˆ˜์•• ํ˜•์„ฑ, ๊ด€๋‚ด ์œ ๋™, ๋Œ ๋ถ•๊ดด, ๊ทธ๋ฆฌ๊ณ  ์นผ๋งŒ ์™€๋ฅ˜์— ๋Œ€ํ•œ ํ•ด์„์„ ํ†ตํ•ด ๊ฐœ๋ฐœ๋œ APR ๋ชจ๋ธ์ด ์•ˆ์ •์ ์œผ๋กœ ๋‹ค์ค‘ ํ•ด์ƒ๋„๋ฅผ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ์Œ์„ ํ™•์ธํ•˜์˜€๊ณ , ๋†’์€ ์ •ํ™•๋„์™€ ๊ณ„์‚ฐ ํšจ์œจ์„ ๋ณด์ด๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค. ๋˜ํ•œ ์ œํŠธ ํŒŒ์‡„ ํ•ด์„๊ณผ ๊ณต๊ธฐ ๋ฐฉ์šธ ์ƒ์Šน ํ•ด์„์„ ํ†ตํ•ด ๋‹ค์œ ์ฒด, ๋‹ค์ƒ ์œ ๋™์—์˜ ์ ์šฉ์„ ์ˆ˜ํ–‰ํ•˜์˜€๊ณ , ์‹คํ—˜ ๋ฐ์ดํ„ฐ์™€์˜ ์ •๋žต์ ์œผ๋กœ ๋น„๊ตํ•˜์˜€๋‹ค. ๋ถ„์„ ๊ฒฐ๊ณผ, ์‹œ๋ฎฌ๋ ˆ์ด์…˜์ด ์‹ค์ œ ํ˜„์ƒ์„ ์ž˜ ๋ชจ์‚ฌํ•จ์ด ํ™•์ธ๋˜์—ˆ์œผ๋ฉฐ, ์ž…์ž ์ˆ˜ ์กฐ์ ˆ์„ ํ†ตํ•ด ๊ณ„์‚ฐ ํšจ์œจ์ด ํฌ๊ฒŒ ํ–ฅ์ƒ๋˜์—ˆ๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” SPH ๋ฐฉ๋ฒ•๋ก ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ์ž…์ž ๋ถ„ํ• /๋ณ‘ํ•ฉ ๋ชจ๋ธ์„ ๊ฐœ๋ฐœํ•˜๊ณ  GPU๋ฅผ ์ด์šฉํ•˜์—ฌ ์ตœ์ ํ™”ํ•จ์œผ๋กœ์จ, SPH ๋ฐฉ๋ฒ•๋ก  ๋‚ด์— ๋‹ค์ค‘ ํ•ด์ƒ๋„ ํ•ด์„ ์ฒด๊ณ„๋ฅผ ๊ตฌ์ถ•ํ•˜์˜€๋‹ค. ์ด๋Š” ๊ธฐ์กด ๋‹จ์ผ ํ•ด์ƒ๋„์˜ ์ž…์ž ๊ธฐ๋ฐ˜ ํ•ด์„ ์ฒด๊ณ„๊ฐ€ ํ•„์—ฐ์ ์œผ๋กœ ๊ฐ€์ง€๊ณ  ์žˆ์—ˆ๋˜ ํ•ด์ƒ๋„ ์ฆ๊ฐ€์— ๋”ฐ๋ฅธ ๊ณผ๋„ํ•œ ๊ณ„์‚ฐ ๋ถ€ํ•˜ ๋ฌธ์ œ์— ๋Œ€ํ•œ ํ•ด๊ฒฐ์ฑ…์„ ์ œ์‹œํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ์—์„œ ์˜์˜๋ฅผ ๊ฐ€์ง€๋ฉฐ, ์›์ž๋กœ ์ค‘๋Œ€ ์‚ฌ๊ณ  ํ•ด์„๊ณผ ๊ฐ™์ด ํ˜„์ƒ ๋‚ด์—์„œ ์—ฌ๋Ÿฌ ์ž…์ž ํ•ด์ƒ๋„๋ฅผ ์š”๊ตฌํ•˜๋Š” ๋ณต์žกํ•œ ์œ ๋™์— ๋Œ€ํ•œ ํ•ด์„์— ๊ธฐ์—ฌํ•  ์ˆ˜ ์žˆ์„ ๊ฒƒ์œผ๋กœ ์˜ˆ์ƒ๋œ๋‹ค.Recent nuclear safety issues are not only limited to thermal-hydraulics, but consist of complex phenomena including fuel melt, materials, chemical reactions, and multi-phase flow. The traditional reactor safety analysis is mainly based on Computational Fluid Dynamics(CFD) with Eulerian grid-based methods. But recently, Lagrangian particle-based methods are also being actively studied, due to their well-known advantages in handling free surface, interfacial flow, and large deformation. Smoothed Particle Hydrodynamics (SPH) is one of the representative Lagrangian-based methods in which the fluid system is represented as the finite number of particles. In particle-based CFD, high resolution generally guarantees high-accuracy results, but it causes a high computational load as the number of particles in the domain increases. Most existing particle-based analysis codes adopt a single-resolution method using particles of the same size in the entire computational domain. However, in the case that requires different levels of particle resolution depending on the flow region, such as turbulence, boiling/condensation, and shock wave analysis, this method may create unnecessary computational load or reduce the computational accuracy. Therefore, in order to improve this, it is necessary to introduce a multi-resolution analysis that can control the size of particles locally within the computational domain. Accordingly, in this study, an adaptive particle refinement (APR) method was developed and implemented in SPH, in a form suitable for GPU parallel computation to accelerate the model. 
The basic concept of the APR methodology is to use different resolutions in localized regions of the computational domain by splitting or merging particles under specific conditions during the simulation. Multiple particle resolutions are realized by splitting or merging SPH particles according to criteria such as position, volume, or velocity gradient. However, in the methods used in previous studies, the velocity of a merged particle is determined from momentum conservation alone. Since this conserves mass and momentum but not the kinetic energy of the particles, a new merging model that conserves kinetic energy is proposed in this study. In addition, when the smoothing length of particles is changed during the APR process, the accuracy of the calculation may degrade at the resolution interface where particles of different sizes interact; a continuous smoothing-length variation model is also proposed to improve this. Due to the nature of GPU parallel computation, when multiple threads simultaneously access and operate on the same memory, the ordering of operations between threads is not guaranteed, producing a race condition that yields results different from those expected. When the APR methodology is parallelized, such a race condition occurs while storing newly generated particles, causing collisions between the memory addresses assigned to them. To solve this problem, a locking algorithm that prevents inter-thread interference was implemented using the atomic operations provided by CUDA C, and the algorithm was optimized to avoid the speed degradation caused by excessive serialization. Model validation and performance evaluation were then carried out. Analyses of hydrostatic pressure formation, pipe flow, dam break, and the Karman vortex confirmed that the developed APR model implements multi-resolution stably while showing high accuracy and computational efficiency. In addition, the model was applied to multi-fluid and multi-phase flows through jet break-up and air-bubble-rising simulations and compared quantitatively with experimental data. The results confirmed that the model reproduces the real phenomena well, and the computational efficiency was greatly improved by controlling the number of particles. In this study, a multi-resolution analysis framework was constructed within the SPH methodology by developing a particle splitting/merging model and optimizing it on the GPU. This is significant in that it provides a solution to the excessive computational load that existing single-resolution particle-based analysis frameworks inevitably incur as resolution increases.
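As a sketch of the merging issue raised above (generic notation; the thesis's specific energy correction is not reproduced here): merging two particles of masses m_1, m_2 and velocities v_1, v_2 into one particle of mass M = m_1 + m_2 with the momentum-conserving velocity

    \[ \mathbf{v}_M = \frac{m_1 \mathbf{v}_1 + m_2 \mathbf{v}_2}{m_1 + m_2} \]

loses kinetic energy

    \[ \Delta E_k = \tfrac{1}{2} m_1 |\mathbf{v}_1|^2 + \tfrac{1}{2} m_2 |\mathbf{v}_2|^2 - \tfrac{1}{2} M |\mathbf{v}_M|^2 = \frac{m_1 m_2}{2(m_1 + m_2)} \, |\mathbf{v}_1 - \mathbf{v}_2|^2 \ge 0, \]

which vanishes only when the two velocities are identical. An energy-conserving merge model therefore has to account for this deficit, which is the gap the proposed merging model targets.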
It is expected to contribute to the analysis of complex flows that require multiple particle resolutions within a single phenomenon, such as severe accidents in a nuclear reactor.
Table of contents: Chapter 1 Introduction (Background and Motivation; Previous Studies; Objectives). Chapter 2 Smoothed Particle Hydrodynamics (SPH Particle Approximation; Smoothing Kernel Function; SPH Approximation of Derivatives; SPH Governing Equations: Mass Conservation, Momentum Conservation, Equation of State, Surface Tension; SPH Algorithm). Chapter 3 Adaptive Particle Refinement (Basic Concept of APR; APR Methodologies; Kinetic Energy Conservation; Error Analysis; Variable Smoothing Length; GPU Parallelization: GPU-based SPH Algorithm, APR Data Management, Race Condition and Atomic, GPU-based APR Algorithm). Chapter 4 Results & Discussions (Benchmark Simulations: Hydrostatic Pressure, Pipe Flow, Dam Break, Karman Vortex; Applications: Jet Break-up, Single Bubble Rising). Chapter 5 Conclusion (Summary; Recommendations). Nomenclature. References. Korean Abstract.
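The race-condition handling described in this abstract can be illustrated with a minimal CUDA sketch (illustrative only, not the thesis code; the kernel, array names, and split criterion are hypothetical): each thread that decides to split its particle reserves a private range of output slots with an atomic counter, so no two threads write new particles to the same memory address.

    #include <cuda_runtime.h>

    // Hypothetical particle layout in structure-of-arrays form.
    struct Particles {
        float4 *pos;   // xyz position, w = mass
        float4 *vel;   // xyz velocity, w = smoothing length
    };

    // Each thread checks a placeholder refinement criterion for its particle and,
    // if the particle must be split into `children` daughters, reserves slots in
    // the global arrays with atomicAdd. The atomic returns the old counter value,
    // so every thread obtains a non-overlapping index range.
    __global__ void split_particles(Particles p, int n, int *d_num_particles,
                                    int max_particles, int children)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        float4 parent = p.pos[i];
        bool refine = (parent.x > 0.5f);        // placeholder split criterion
        if (!refine) return;

        // Reserve `children` consecutive slots; only this thread owns them.
        int base = atomicAdd(d_num_particles, children);
        if (base + children > max_particles) return;   // out of capacity; skip

        for (int c = 0; c < children; ++c) {
            float4 child = parent;
            child.w = parent.w / children;      // distribute the parent mass equally
            // (A real splitting pattern would also offset the child positions
            //  and deactivate or shrink the parent particle.)
            p.pos[base + c] = child;
            p.vel[base + c] = p.vel[i];
        }
    }

Because atomicAdd returns the counter's previous value, each thread gets its own index range without an explicit lock; heavier per-particle critical sections are where the serialization concern mentioned in the abstract would come in.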

    The Astrophysical Multipurpose Software Environment

    We present the open source Astrophysical Multipurpose Software Environment (AMUSE, www.amusecode.org), a component library for performing astrophysical simulations involving different physical domains and scales. It couples existing codes within a Python framework based on a communication layer using MPI. The interfaces are standardized for each domain, and their implementation based on MPI guarantees that the whole framework is well-suited for distributed computation. It includes facilities for unit handling and data storage. Currently it includes codes for gravitational dynamics, stellar evolution, hydrodynamics and radiative transfer. Within each domain the interfaces to the codes are as similar as possible. We describe the design and implementation of AMUSE, as well as the main components and community codes currently supported, and we discuss the code interactions facilitated by the framework. Additionally, we demonstrate how AMUSE can be used to resolve complex astrophysical problems by presenting example applications. Comment: 23 pages, 25 figures, accepted for A&A.

    A GPU-accelerated package for simulation of flow in nanoporous source rocks with many-body dissipative particle dynamics

    Mesoscopic simulations of hydrocarbon flow in source shales are challenging, in part due to the heterogeneous shale pores with sizes ranging from a few nanometers to a few micrometers. Additionally, the sub-continuum fluid-fluid and fluid-solid interactions in nano- to micro-scale shale pores, which are physically and chemically complex, must be captured. To address these challenges, we present a GPU-accelerated package for simulation of flow in nano- to micro-pore networks with a many-body dissipative particle dynamics (mDPD) mesoscale model. Based on a fully distributed parallel paradigm, the code offloads all intensive workloads onto GPUs. Other advancements, such as smart particle packing and a no-slip boundary condition in complex pore geometries, are also implemented for the construction and simulation of realistic shale pores from 3D nanometer-resolution stack images. Our code is validated for accuracy and compared against the CPU counterpart for speedup. In our benchmark tests, the code delivers nearly perfect strong scaling and weak scaling (with up to 512 million particles) on up to 512 K20X GPUs on Oak Ridge National Laboratory's (ORNL) Titan supercomputer. Moreover, a single-GPU benchmark on ORNL's SummitDev and IBM's AC922 suggests that the host-to-device NVLink can boost performance over PCIe by a remarkable 40%. Lastly, we demonstrate, through a flow simulation in realistic shale pores, that the CPU counterpart requires 840 Power9 cores to rival the performance delivered by our package with four V100 GPUs on ORNL's Summit architecture. This simulation package enables quick-turnaround and high-throughput mesoscopic numerical simulations for investigating complex flow phenomena in nano- to micro-porous rocks with realistic pore geometries.
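For context, the conservative force in many-body DPD typically takes the Warren-type form below; the exact weight functions and parameter choices used by this particular package may differ:

    \[ \mathbf{F}^{C}_{ij} = A \, w_c(r_{ij}) \, \mathbf{e}_{ij} + B \, (\bar\rho_i + \bar\rho_j) \, w_d(r_{ij}) \, \mathbf{e}_{ij}, \qquad \bar\rho_i = \sum_{j \ne i} w_\rho(r_{ij}), \]

with A < 0 providing long-range attraction, B > 0 a density-dependent repulsion acting within a shorter cutoff r_d <= r_c, w_c and w_d weight functions that vanish at their cutoffs, and w_rho a normalized weight used to estimate the local particle density; the dissipative and random pair forces are the same as in standard DPD. This density-dependent repulsion is what lets mDPD sustain liquid-vapour interfaces and capillary behaviour in nanoscale pores.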

    Flood modelling with hydraTE: 2+1-dimensional smoothed-particle hydrodynamics

    We present HydraTE, our own implementation of the smoothed-particle hydrodynamics technique for shallow water that uses the adaptive size of the smoothing kernel as a proxy for the local water depth. We derive the equations of motion for this approach from the Lagrangian before demonstrating that we can model the depth of water in a trough, implement vertical walls, recover the correct acceleration and terminal velocity for water flowing down a slope, and obtain a stable hydraulic jump with the correct jump condition. We demonstrate that HydraTE performs well on two of the UK Environment Agency flood modelling benchmark tests. Benchmark EA3 involves flow down an incline into a double-dip depression and studies the amount of water that reaches the second dip. Our results are in agreement with those of the other codes that have attempted this test. Benchmark EA6 is a dam break into a horizontal channel containing a building. HydraTE again produces results that are in good agreement with the other methods and the experimental validation data, except where the vertical velocity structure of the flow is expected to be multi-valued, such as at the hydraulic jump, where the precise location is not recovered even though the pre- and post-jump water heights are. We conclude that HydraTE is suitable for a wide range of flood modelling problems as it performs at least as well as the best available commercial alternatives for the problems we have tested.
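For reference, the jump condition mentioned here is presumably the standard momentum balance for a stationary hydraulic jump in shallow water (the Belanger equation), relating the upstream and downstream depths through the upstream Froude number:

    \[ \frac{d_2}{d_1} = \frac{1}{2}\left(\sqrt{1 + 8\,\mathrm{Fr}_1^{2}} - 1\right), \qquad \mathrm{Fr}_1 = \frac{u_1}{\sqrt{g\,d_1}}, \]

so recovering the correct pre- and post-jump water heights amounts to satisfying this relation across the jump, even if the exact jump location is not reproduced.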

    Mesh-free hydrodynamics in PKDGRAV3 for galaxy formation simulations

    We extend the state-of-the-art N-body code PKDGRAV3 with the inclusion of mesh-free gas hydrodynamics for cosmological simulations. Two new hydrodynamic solvers have been implemented: the mesh-less finite-volume and mesh-less finite-mass methods. The solvers manifestly conserve mass, momentum and energy, and have been validated with a wide range of standard test simulations, including cosmological simulations. We also describe improvements to PKDGRAV3 that have been implemented for performing hydrodynamic simulations. These changes have been made with efficiency and modularity in mind, and provide a solid base for the implementation of the required modules for galaxy formation and evolution physics and future porting to GPUs. The code is released in a public repository, together with the documentation and all the test simulations presented in this work. Comment: 18 pages, 14 figures; accepted for publication in MNRAS.
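Schematically, the reason such meshless finite-volume and finite-mass schemes conserve mass, momentum, and energy manifestly is that each particle carries a vector of conserved quantities and exchanges pairwise fluxes through antisymmetric effective faces (this is the generic Hopkins-style form of the discretization, not a detail taken from the paper):

    \[ \frac{\mathrm{d}}{\mathrm{d}t}\big(V_i \mathbf{U}_i\big) = -\sum_j \tilde{\mathbf{F}}_{ij} \cdot \mathbf{A}_{ij}, \qquad \mathbf{A}_{ij} = -\mathbf{A}_{ji}, \]

where U_i is the vector of conserved densities, \tilde{F}_{ij} the flux obtained from a pairwise Riemann problem, and A_{ij} the effective face area derived from the kernel partition of unity; whatever leaves one particle enters its neighbour, so the totals are conserved to machine precision.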

    Ray-traced radiative transfer on massively threaded architectures

    In this thesis, I apply techniques from the field of computer graphics to ray tracing in astrophysical simulations, and introduce the grace software library. This is combined with an extant radiative transfer solver to produce a new package, taranis. It allows for fully-parallel particle updates via per-particle accumulation of rates, followed by a forward Euler integration step, and is manifestly photon-conserving. To my knowledge, taranis is the first ray-traced radiative transfer code to run on graphics processing units and target cosmological-scale smooth particle hydrodynamics (SPH) datasets. A significant optimization effort is undertaken in developing grace. Contrary to typical results in computer graphics, it is found that the bounding volume hierarchies (BVHs) used to accelerate the ray tracing procedure need not be of high quality; as a result, extremely fast BVH construction times are possible (< 0.02 microseconds per particle in an SPH dataset). I show that this exceeds the performance researchers might expect from CPU codes by at least an order of magnitude, and compares favourably to a state-of-the-art ray tracing solution. Similar results are found for the ray tracing itself, where again techniques from computer graphics are examined for effectiveness with SPH datasets, and new optimizations proposed. For high per-source ray counts (≳ 10^4), grace can reduce ray tracing run times by up to two orders of magnitude compared to extant CPU solutions developed within the astrophysics community, and by a factor of a few compared to a state-of-the-art solution. taranis is shown to produce expected results in a suite of de facto standard cosmological radiative transfer test cases. For some cases, it currently outperforms a serial, CPU-based alternative by a factor of a few. Unfortunately, for the most realistic test its performance is extremely poor, making the current taranis code unsuitable for cosmological radiative transfer. The primary reason for this failing is found to be a small minority of particles which always dominate the timestep criteria. Several plausible routes to mitigate this problem, while retaining parallelism, are put forward.
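For context on the construction speed quoted above: GPU BVH builders in this setting commonly start from a Morton-code sort of the particle positions (an LBVH-style construction), which is what makes sub-microsecond per-particle build times plausible. The sketch below shows only that first step in CUDA; it is illustrative rather than taken from grace, and all names are hypothetical.

    #include <cuda_runtime.h>
    #include <cstdint>

    // Spread the lower 10 bits of v so there are two zero bits between each bit.
    __device__ uint32_t expand_bits(uint32_t v)
    {
        v = (v * 0x00010001u) & 0xFF0000FFu;
        v = (v * 0x00000101u) & 0x0F00F00Fu;
        v = (v * 0x00000011u) & 0xC30C30C3u;
        v = (v * 0x00000005u) & 0x49249249u;
        return v;
    }

    // 30-bit 3D Morton code for a point with coordinates already scaled to [0,1).
    __device__ uint32_t morton3d(float x, float y, float z)
    {
        uint32_t xi = (uint32_t)fminf(fmaxf(x * 1024.0f, 0.0f), 1023.0f);
        uint32_t yi = (uint32_t)fminf(fmaxf(y * 1024.0f, 0.0f), 1023.0f);
        uint32_t zi = (uint32_t)fminf(fmaxf(z * 1024.0f, 0.0f), 1023.0f);
        return (expand_bits(xi) << 2) | (expand_bits(yi) << 1) | expand_bits(zi);
    }

    // One thread per particle: compute its Morton key from its position within the
    // simulation box. The keys are then sorted (e.g. with thrust::sort_by_key) and
    // the hierarchy is emitted over the sorted order.
    __global__ void compute_morton_keys(const float4 *pos, uint32_t *keys, int n,
                                        float3 box_min, float3 inv_box_size)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float4 p = pos[i];
        keys[i] = morton3d((p.x - box_min.x) * inv_box_size.x,
                           (p.y - box_min.y) * inv_box_size.y,
                           (p.z - box_min.z) * inv_box_size.z);
    }

The sorted keys would then feed a parallel hierarchy-emission step and a bottom-up bounding-box fit, neither of which is shown here.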

    AN ADAPTIVE SAMPLING APPROACH TO INCOMPRESSIBLE PARTICLE-BASED FLUID

    I propose a particle-based technique for simulating incompressible fluid that includes adaptive refinement of particle sampling. Each particle represents a mass of fluid in its local region. Particles are split into several particles for finer sampling in regions of complex flow. In regions of smooth flow, neighboring particles can be merged. Depth below the surface and Reynolds number are exploited as our criteria for determining whether splitting or merging should take place. For the fluid dynamics calculations, I use the hybrid FLIP method, which is computationally simple and efficient. Since the fluid is incompressible, each particle has a volume proportional to its mass. A kernel function, whose effective range is based on this volume, is used for transferring and updating the particle's physical properties such as mass and velocity. In addition, the particle sampling technique is extended to a fully adaptive approach, supporting adaptive splitting and merging of fluid particles and adaptive spatial sampling for the reconstruction of the velocity and pressure fields. Particle splitting allows a detailed sampling of fluid momentum in regions of complex flow. Particle merging, in regions of smooth flow, reduces memory and computational overhead. An octree structure is used to compute inter-particle interactions and to compute the pressure field. The octree supporting field-based calculations is adapted to provide a fine spatial reconstruction where particles are small and a coarse reconstruction where particles are large. This scheme places computational resources where they are most needed, to handle both flow and surface complexity. Thus, incompressibility can be enforced even in very small, but highly turbulent, areas. Simultaneously, the level of detail is very high in these areas, allowing the direct support of tiny splashes and small-scale surface tension effects. This produces a finely detailed and realistic representation of surface motion.
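A minimal, hypothetical CUDA sketch of the split/merge decision described above (the thresholds, array names, and exact criteria are placeholders, not this work's implementation): each particle is flagged for splitting when it is near the surface and its local Reynolds number is high, and for merging when it is deep and the flow is smooth.

    #include <cuda_runtime.h>

    enum Action { KEEP = 0, SPLIT = 1, MERGE = 2 };

    // Per-particle refinement decision based on the two criteria named in the
    // abstract: depth below the free surface and a local Reynolds number.
    __global__ void flag_refinement(const float *depth,     // depth below surface
                                    const float *reynolds,  // local Reynolds estimate
                                    int *action, int n,
                                    float shallow_depth, float deep_depth,
                                    float re_high, float re_low)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        float d  = depth[i];
        float re = reynolds[i];

        if (d < shallow_depth && re > re_high) {
            action[i] = SPLIT;      // complex flow near the surface: refine
        } else if (d > deep_depth && re < re_low) {
            action[i] = MERGE;      // smooth flow well below the surface: coarsen
        } else {
            action[i] = KEEP;
        }
    }

A subsequent pass would then carry out the actual splitting and merging, for example by compacting the flags with a prefix sum before allocating or retiring particles.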