Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications
NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relation of theory to practice, and measured performance. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, problems such as systolic VLSI simulation (where communication distances are short), lattice simulation, neural network simulation, and image processing were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, as well as recommendations concerning the future of the MPP and machines based on similar architectures, the expansion of the Working Group, and the study of the role of future parallel processors for the Space Station, EOS, and the Great Observatories era.
LagrangeBench: A Lagrangian Fluid Mechanics Benchmarking Suite
Machine learning has been successfully applied to grid-based PDE modeling in
various scientific applications. However, learned PDE solvers based on
Lagrangian particle discretizations, which are the preferred approach to
problems with free surfaces or complex physics, remain largely unexplored. We
present LagrangeBench, the first benchmarking suite for Lagrangian particle
problems, focusing on temporal coarse-graining. In particular, our contribution
is: (a) seven new fluid mechanics datasets (four in 2D and three in 3D)
generated with the Smoothed Particle Hydrodynamics (SPH) method including the
Taylor-Green vortex, lid-driven cavity, reverse Poiseuille flow, and dam break,
each of which includes different physics like solid wall interactions or free
surface, (b) efficient JAX-based API with various recent training strategies
and three neighbor search routines, and (c) JAX implementation of established
Graph Neural Networks (GNNs) like GNS and SEGNN with baseline results. Finally,
to measure the performance of learned surrogates we go beyond established
position errors and introduce physical metrics like kinetic energy MSE and
Sinkhorn distance for the particle distribution. Our codebase is available at
https://github.com/tumaer/lagrangebench.
Comment: Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), Datasets and Benchmarks Track
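The physical metrics mentioned above can be sketched in a few lines of NumPy; the function names, uniform particle weights, and entropic-regularization parameters below are illustrative assumptions, not the LagrangeBench API:

```python
import numpy as np

def kinetic_energy_mse(v_pred, v_true, mass=1.0):
    """Squared error between total kinetic energies of predicted and
    reference particle velocity fields (illustrative definition)."""
    ke_pred = 0.5 * mass * np.sum(v_pred ** 2)
    ke_true = 0.5 * mass * np.sum(v_true ** 2)
    return (ke_pred - ke_true) ** 2

def sinkhorn_distance(x, y, eps=0.1, n_iter=200):
    """Entropy-regularized optimal-transport cost between two particle
    clouds with uniform weights, via plain Sinkhorn iterations."""
    n, m = len(x), len(y)
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)  # pairwise sq. distances
    K = np.exp(-C / eps)                                        # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)             # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                                     # alternating scaling
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                             # transport plan
    return np.sum(P * C)
```

Unlike a per-particle position error, this distance compares the particle *distributions*, so it remains meaningful even when individual particle identities diverge.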
A GPU-Accelerated, Hybrid FVM-RANS Methodology for Modeling Rotorcraft Brownout
A numerically efficient, hybrid Eulerian-Lagrangian methodology has been developed to help better understand the complicated two-phase flowfield encountered in rotorcraft brownout environments. The problem of brownout occurs when rotorcraft operate close to surfaces covered with loose particles such as sand, dust, or snow. These particles can be entrained, in large quantities, into the rotor wake, leading to a potentially hazardous degradation of the pilot's visibility. A computationally efficient model of this phenomenon, validated against available experimental measurements, can be a valuable tool for revealing the underlying physics of rotorcraft brownout. The present work involved the design, development, and validation of a hybrid solver for modeling brownout-like environments. The proposed methodology combines the numerical efficiency of a free-vortex method with the relatively high fidelity of a 3D, time-accurate, Reynolds-averaged Navier-Stokes (RANS) solver. For dual-phase simulations, this hybrid method can be unidirectionally coupled with a sediment tracking algorithm to study cloud development.
In the past, large clusters of CPUs have been the standard approach for large simulations involving the numerical solution of PDEs. In recent years, however, an emerging trend is the use of Graphics Processing Units (GPUs), once used only for graphics rendering, for scientific computing. These platforms deliver superior computing power and memory bandwidth compared to traditional CPUs, and their capability continues to grow rapidly with each generation. CFD simulations have been ported successfully onto GPU platforms in the past. However, the nature of the GPU architecture restricts the set of algorithms that exhibit significant speedups on these platforms: GPUs are optimized for operations in which a very large number of threads, relative to the problem size, work in parallel, executing identical instructions on disparate datasets. For this reason, most implementations in the scientific literature use explicit algorithms for time-stepping, reconstruction, etc. To overcome the difficulties associated with implicit methods, the current work proposes a multi-granular approach to reduce the performance penalties typically encountered with such schemes.
To explore the use of GPUs for RANS simulations, a 3D, time-accurate, implicit, structured, compressible, viscous, turbulent, finite-volume RANS solver was designed and developed in CUDA-C. During the development phase, various performance-optimization strategies were used to make the implementation better suited to the GPU architecture. Validation and verification of the GPU-based solver were performed for both canonical and realistic benchmark problems on a variety of GPU platforms. In these test cases, a performance assessment of the GPU-RANS solver indicated that it was between one and two orders of magnitude faster than equivalent single-CPU-core computations (as high as 50X for fine-grain computations on the latest platforms). For simulations involving implicit methods, a multi-granular technique was used that sought to exploit the intermediate coarse-grain parallelism inherent in families of line-parallel methods, such as Alternating Direction Implicit (ADI) schemes, coupled with conservative-variable parallelism. This approach had the dual effect of reducing memory bandwidth usage and increasing GPU occupancy, leading to significant performance gains. The multi-granular approach for implicit methods used in this work has demonstrated speedups that are close to 50% of those expected with purely explicit methods.
The validated GPU-RANS solver was then coupled with GPU-based free-vortex and sediment tracking methods to model single- and dual-phase, model-scale brownout environments. A qualitative and quantitative validation of the methodology was performed by comparing predictions with available measurements, including flowfield measurements and observations of particle transport mechanisms made with laboratory-scale rotor/jet configurations in ground effect. In particular, dual-phase simulations were able to resolve key transport phenomena in the dispersed phase, such as creep, vortex trapping, and sediment wave formation. Furthermore, these simulations were demonstrated to be computationally more efficient than equivalent computations on a cluster of traditional CPUs: a model-scale brownout simulation using the hybrid approach on a single GTX Titan now takes 1.25 hours per revolution, compared to 6 hours per revolution on 32 Intel Xeon cores.
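The coarse-grain parallelism that ADI schemes expose can be illustrated with a batched Thomas (tridiagonal) solver: each grid line is an independent system that can be solved concurrently. This is a simplified NumPy sketch of the idea, not the CUDA-C implementation described in the abstract:

```python
import numpy as np

def thomas_batch(a, b, c, d):
    """Solve many independent tridiagonal systems, one per grid line.

    a, b, c: sub-, main-, and super-diagonals, shape (n_lines, n)
    (a[:, 0] and c[:, -1] are unused); d: right-hand sides, same shape.
    Each line is an independent solve, which is the coarse-grain
    parallelism that line-implicit ADI sweeps expose on a GPU.
    """
    n_lines, n = b.shape
    cp = np.zeros_like(b)
    dp = np.zeros_like(d)
    cp[:, 0] = c[:, 0] / b[:, 0]
    dp[:, 0] = d[:, 0] / b[:, 0]
    for i in range(1, n):  # forward elimination sweep
        denom = b[:, i] - a[:, i] * cp[:, i - 1]
        cp[:, i] = c[:, i] / denom
        dp[:, i] = (d[:, i] - a[:, i] * dp[:, i - 1]) / denom
    x = np.zeros_like(d)
    x[:, -1] = dp[:, -1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[:, i] = dp[:, i] - cp[:, i] * x[:, i + 1]
    return x
```

On a GPU, each line would typically map to a thread (or thread block), and the vectorized axis here stands in for that concurrency; within a line the recurrence stays sequential, which is why implicit methods recover only part of the speedup of purely explicit ones.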
An Implementation of the Discontinuous Galerkin Method on Graphics Processing Units
Computing highly-accurate approximate solutions to partial differential equations (PDEs) requires both a robust numerical method and a powerful machine. We present a parallel implementation of the discontinuous Galerkin (DG) method on graphics processing units (GPUs). In addition to being flexible and highly accurate, DG methods accommodate parallel architectures well, as their discontinuous nature produces entirely element-local approximations.
While GPUs were originally intended to compute and display computer graphics, they have recently become a popular general purpose computing device. These cheap and extremely powerful devices have a massively parallel structure. With the recent addition of double precision floating point number support, GPUs have matured as serious platforms for parallel scientific computing.
In this thesis, we present an implementation of the DG method applied to systems of hyperbolic conservation laws in two dimensions on a GPU using NVIDIA's Compute Unified Device Architecture (CUDA). Numerous computed examples, from linear advection to the Euler equations, demonstrate the modularity and usefulness of our implementation. Benchmarking our method against a single-core, serial implementation of the DG method reveals a speedup of over fifty times on a US$500 NVIDIA GTX 580.
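The element-local character that makes DG map well onto GPUs can be seen in its simplest limit: with piecewise-constant (p = 0) elements, DG for 1D linear advection with an upwind numerical flux reduces to the following finite-volume update. This is a minimal NumPy sketch of that structural point, not the thesis code:

```python
import numpy as np

def dg0_advection_step(u, speed, dx, dt):
    """One explicit upwind step for u_t + speed * u_x = 0 (speed > 0)
    on a periodic 1D mesh of element averages u.

    With p = 0 elements the DG scheme degenerates to this update:
    each element exchanges only a numerical flux with its immediate
    neighbors, so every element can be updated by an independent
    GPU thread with no global coupling.
    """
    flux_in = speed * np.roll(u, 1)   # upwind flux through the left face
    flux_out = speed * u              # flux through the right face
    return u - dt / dx * (flux_out - flux_in)
```

Higher-order DG keeps the same locality, replacing each scalar element average with a small per-element vector of modal or nodal coefficients and a per-element dense mass-matrix solve.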
A numerical method for fluid-structure interactions of slender rods in turbulent flow
This thesis presents a numerical method for the simulation of fluid-structure interaction (FSI) problems on high-performance computers. The proposed method is specifically tailored to interactions between Newtonian fluids and a large number of slender viscoelastic structures, the latter being modeled as Cosserat rods. From a numerical point of view, this kind of FSI requires special techniques to reach numerical stability. When using a partitioned fluid-structure coupling approach, this is usually achieved by an iterative procedure, which drastically increases the computational effort. In the present work, an alternative coupling approach is developed based on an immersed boundary method (IBM). It is unconditionally stable and exempt from any global iteration between the fluid part and the structure part.
The proposed FSI solver is employed to simulate the flow over a dense layer of vegetation elements, usually designated as canopy flow. The abstracted canopy model used in the simulation consists of 800 strip-shaped blades, making it the largest canopy-resolving simulation of this type performed so far. To gain a deeper understanding of the physics of aquatic canopy flows, the simulation data obtained are analyzed, e.g., concerning the existence and shape of coherent structures.
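The core IBM coupling operations, interpolating fluid velocity from the Eulerian grid to Lagrangian structure markers and spreading marker forces back to the grid through a regularized delta function, can be sketched in 1D as follows (the 2-point hat kernel and function names are illustrative assumptions, not the solver developed in this thesis):

```python
import numpy as np

def ib_delta(r):
    """A simple 2-point hat kernel as the regularized delta function
    (one common IBM choice; wider smoothed kernels are also used)."""
    r = np.abs(r)
    return np.where(r < 1.0, 1.0 - r, 0.0)

def interpolate(u_grid, x_lag, dx):
    """Interpolate grid velocity u_grid (nodes at i * dx) to the
    Lagrangian marker positions x_lag."""
    idx = np.arange(len(u_grid))
    return np.array([np.sum(u_grid * ib_delta((x - idx * dx) / dx))
                     for x in x_lag])

def spread(f_lag, x_lag, n, dx):
    """Spread marker forces f_lag back onto an n-node grid; this is
    the transpose of interpolation, which conserves total force."""
    idx = np.arange(n)
    f_grid = np.zeros(n)
    for x, f in zip(x_lag, f_lag):
        f_grid += f * ib_delta((x - idx * dx) / dx) / dx
    return f_grid
```

Because spreading is the transpose of interpolation, momentum transferred between the phases is conserved at the discrete level, which is one ingredient in building a coupling that needs no global sub-iteration.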
A Coupled Lattice Boltzmann-Extended Finite Element Model for Fluid-Structure Interaction Simulation with Crack Propagation
Fatigue cracking of structures in fluid-structure interaction (FSI) applications is a pervasive issue that impacts a broad spectrum of engineering activities, ranging from large-scale ocean engineering and aerospace structures to biomedical prosthetics. Fatigue is a particular concern in the offshore drilling industry, where the problem is exacerbated by environmental degradation, and where structural failure can have substantial financial and environmental ramifications. As a result, interest has grown in the development of structural health monitoring (SHM) schemes for FSI applications that promote early damage detection. FSI simulation provides a practical and efficient means for evaluating and training SHM approaches for FSI applications, and for improving fatigue life predictions through robust parametric studies that address uncertainties in both crack propagation and FSI response. To this end, this paper presents a numerical modeling approach for simulating FSI response with crack propagation. The modeling approach couples a massively parallel lattice Boltzmann fluid solver, executed on a graphics processing unit (GPU) device, with an extended finite element (XFE) solid solver. Two-way interaction is provided by an immersed boundary coupling scheme, in which a Lagrangian solid mesh moves on top of a fixed Eulerian fluid grid. The theoretical basis and numerical implementation of the modeling approach are presented, along with a simple demonstration problem involving subcritical crack growth in a flexible beam subject to vortex-induced vibration.
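The lattice Boltzmann collide-and-stream cycle that makes the fluid solver so amenable to massive GPU parallelism can be sketched for a periodic D2Q9 lattice with BGK collision; this is a minimal NumPy sketch of the standard method, not the GPU code described in the paper:

```python
import numpy as np

# D2Q9 lattice: discrete velocities and their weights
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)

def equilibrium(rho, ux, uy):
    """Second-order Maxwell-Boltzmann equilibrium distributions."""
    cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
    usq = ux ** 2 + uy ** 2
    return rho * w[:, None, None] * (1 + 3 * cu + 4.5 * cu ** 2 - 1.5 * usq)

def lbm_step(f, tau):
    """One BGK collide-and-stream step on a periodic D2Q9 lattice.

    The collision at each node depends only on local data, so the
    update maps naturally onto one GPU thread per lattice node;
    streaming is a pure neighbor shift.
    """
    rho = f.sum(axis=0)                               # density moment
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho  # momentum moments
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    f = f + (equilibrium(rho, ux, uy) - f) / tau      # collide (BGK relaxation)
    for i in range(9):                                # stream along each velocity
        f[i] = np.roll(f[i], shift=(c[i, 0], c[i, 1]), axis=(0, 1))
    return f
```

In the coupled scheme, the immersed boundary forcing enters this cycle as an extra local source term at nodes near the Lagrangian solid mesh, preserving the node-local structure that the GPU exploits.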