Search CORE

144 research outputs found

Speeding up the high-accuracy surface modelling method with GPU

Author: A Quarteroni
AJ Kalyanapu
BA Bryan
C Chen
C Qin
C Yan
Changqing Yan
Chuanfa Chen
D Tristram
DW Henderson
G Blöschl
G Wu
G Zhao
Gang Zhao
Han Li
JE Stone
Jimin Liu
L Li
M Abouali
M Steinbach
MS Alkhasawneh
N Stojanovic
Na Su
NVIDIA
P Afrasiab
R Herzog
S Erdogan
S Tomov
SJ Jeffrey
T Marke
T Preis
T Rauber
T Yue
Tianxiang Yue
TX Yue
VA Toponogov
W Shi
Y Saad
Y Xia
Y Xu
ZP Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Large-scale spatial data processing on GPUs and GPU-accelerated clusters

Author: Hennessy J. L.
Jianting Zhang
Kirk D. B.
Kornacker M.
Le Gruenwald
McCool M.
Samet H.
Simin You
Theobald D. M.
Zhang J.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Ein Gas-Kinetic Scheme Ansatz zur Modellierung und Simulation von Feuer auf massiv paralleler Hardware [A Gas Kinetic Scheme Approach to Modeling and Simulating Fire on Massively Parallel Hardware]

Author: Lenz Stephan
Publication date: 01/01/2020

This work presents a simulation approach for fire based on a Gas Kinetic Scheme (GKS), implemented on massively parallel hardware in the form of Graphics Processing Units (GPUs) within the framework of General Purpose computing on Graphics Processing Units (GPGPU). Gas kinetic schemes belong to the class of kinetic methods because their governing equation is the mesoscopic Boltzmann equation rather than the macroscopic Navier-Stokes equations. Formally, kinetic methods have the advantage of a linear advection term, which simplifies discretization. GKS inherently contains the full energy equation, which is required for compressible flows. GKS provides a flux formulation derived from kinetic theory and is usually implemented as a finite volume method on cell-centered grids. In this work, we consider an implementation on nested Cartesian grids. To that end, a coupling algorithm for uniform grids with varying resolution was developed and is presented here. The limitation to locally uniform Cartesian grids allows an efficient implementation on GPUs, which belong to the class of many-core processors, i.e. massively parallel hardware. Multi-GPU support is also implemented, and efficiency is enhanced by communication hiding. The fluid solver is validated for several two- and three-dimensional test cases, including natural convection, turbulent natural convection and turbulent decay. It is subsequently applied to a study of boundary layer stability of natural convection in a cavity with differentially heated walls and large temperature differences. The fluid solver is further augmented by a simple combustion model for non-premixed flames, which is validated by comparison to experimental data for two different fire plumes. The results are further compared to the industry standard for fire simulation, the Fire Dynamics Simulator (FDS). While the accuracy of GKS appears slightly reduced compared to FDS, a substantial speedup in time to solution is found. Finally, GKS is applied to the simulation of a compartment fire. This work shows that GKS has large potential for efficient high-performance fire simulations.

Digitale Bibliothek Braunschweig

Interstitial-Scale Modeling of Packed-Bed Reactors

Author: Combest Daniel Parks
Publication venue: Washington University Open Scholarship
Publication date: 29/08/2012

Packed beds are common to adsorption scrubbers, packed-bed reactors, and trickle-bed reactors widely used across the petroleum, petrochemical, and chemical industries. The microstructure of these packed beds is generally very complex and has tremendous influence on heat, mass, and momentum transport phenomena on the micro and macro length scales within the bed. On a reactor scale, bed geometry strongly influences overall pressure drop, residence time distribution, and conversion of species through domain-fluid interactions. On the interstitial scale, particle boundary layer formation, fluid-to-particle mass transfer, and local mixing are controlled by turbulence and dissipation existing around packed particles. In the present research, a CFD model is developed using OpenFOAM (www.openfoam.org) to directly resolve momentum and scalar transport in both laminar and turbulent flow fields, where the interstitial velocity field is resolved using the Navier-Stokes equations (i.e. no pseudo-continuum based assumptions). A discussion detailing the process of generating the complex domain using a Monte Carlo packing algorithm is provided, along with relevant details required to generate an arbitrary polyhedral mesh describing the packed bed. Lastly, an algorithm coupling OpenFOAM with a linear system solver using the graphics processing unit (GPU) computing paradigm was developed and will be discussed in detail.

Washington University St. Louis: Open Scholarship

Automatische Codegenerierung für Massiv Parallele Applikationen in der Numerischen Strömungsmechanik [Automatic Code Generation for Massively Parallel Applications in Computational Fluid Dynamics]

Author: Kuckuk Sebastian
Publication date: 01/01/2019

Solving partial differential equations (PDEs) is a fundamental challenge in many application domains in industry and academia alike. With increasingly large problems, efficient and highly scalable implementations become more and more crucial. Today, facing this challenge is more difficult than ever due to the increasingly heterogeneous hardware landscape. One promising approach is developing domain-specific languages (DSLs) for a set of applications. Using code generation techniques then allows targeting a range of hardware platforms while concurrently applying domain-specific optimizations in an automated fashion. The present work aims to further the state of the art in this field. As domain, we choose PDE solvers and, in particular, those from the group of geometric multigrid methods. To avoid too broad a focus, we restrict ourselves to methods working on structured and patch-structured grids. We face the challenge of handling a domain as complex as ours, while providing different abstractions for diverse user groups, by splitting our external DSL ExaSlang into multiple layers, each specifying different aspects of the final application. Layer 1 is designed to resemble LaTeX and allows inputting continuous equations and functions. Their discretization is expressed on layer 2. It is complemented by algorithmic components which can be implemented in a Matlab-like syntax on layer 3. All information provided to this point is summarized on layer 4, enriched with particulars about data structures and the employed parallelization. Additionally, we support automated progression between the different layers. All ExaSlang input is processed by our jointly developed Scala code generation framework to ultimately emit C++ code. We particularly focus on how to generate applications parallelized with, e.g., MPI and OpenMP that are able to run on workstations and large-scale clusters alike. We showcase the applicability of our approach by implementing simple test problems, like Poisson's equation, as well as relevant applications from the field of computational fluid dynamics (CFD). In particular, we implement scalable solvers for the Stokes, Navier-Stokes and shallow water equations (SWE) discretized using finite differences (FD) and finite volumes (FV). For the case of Navier-Stokes, we also extend our implementation towards non-uniform grids, thereby enabling static mesh refinement, and advanced effects such as the simulated fluid being non-Newtonian and non-isothermal.
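To make the Poisson test problem mentioned in the abstract concrete, here is a minimal pure-Python sketch of a finite-difference discretization of -u'' = f on (0, 1) with zero boundary values, relaxed by plain Jacobi sweeps (the kind of smoother a geometric multigrid method builds on). It is not ExaSlang and not output of the described code generator; the function names, the right-hand side f = pi^2 sin(pi x), and the grid size are assumptions chosen for illustration.

```python
import math


def solve_poisson_1d(n, sweeps):
    """Approximate -u'' = f on (0, 1), u(0) = u(1) = 0, with Jacobi sweeps.

    Hypothetical helper: n interior grid points, second-order central
    differences, f chosen so that the exact solution is sin(pi * x).
    """
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * n
    for _ in range(sweeps):
        new_u = [0.0] * n
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0        # boundary value u(0) = 0
            right = u[i + 1] if i < n - 1 else 0.0   # boundary value u(1) = 0
            new_u[i] = 0.5 * (left + right + h * h * f[i])
        u = new_u
    return x, u


def max_error(n=15, sweeps=2000):
    """Largest pointwise deviation from the exact solution sin(pi * x)."""
    x, u = solve_poisson_1d(n, sweeps)
    return max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(x, u))
```

Plain Jacobi removes smooth error components slowly, which is why many sweeps are needed here; multigrid methods address exactly this by moving the smooth error to coarser grids.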

OpenFPM: A scalable environment for particle and particle-mesh codes on parallel computers

Author: Incardona Pietro
Publication date: 30/08/2022

Scalable and efficient numerical simulations continue to gain importance, as computation is a firmly established tool of discovery, together with theory and experiment. Meanwhile, the performance of computing hardware grows through increasingly heterogeneous hardware, enabling simulations of ever more complex models. However, efficiently implementing scalable codes on heterogeneous, distributed hardware systems becomes the bottleneck. This bottleneck can be alleviated by intermediate software layers that provide higher-level abstractions closer to the problem domain, hence allowing the computational scientist to focus on the simulation. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of particles-only and hybrid particle-mesh simulations of both discrete and continuous models, as well as non-simulation codes. This infrastructure is complemented with frequently used numerical routines, as well as interfaces to third-party libraries. This thesis will present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed-Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), Vortex Methods, stencil codes, high-dimensional Monte Carlo sampling (CMA-ES), and Reaction-Diffusion solvers, comparing it to the current state of the art and existing software frameworks.

Technische Universität Dresden: Qucosa

OpenLB User Guide: Associated with Release 1.6 of the Code

Author: Avis Samuel J.
Bukreev Fedor
Crocoll Michael
Dapelo Davide
Großmann Simon
Hafen Nicolas
Ito Shota
Jeßberger Julius
Krause Mathias J.
Kummer Eliane
Kummerländer Adrian
Kusumaatmaja Halim
Marquardt Jan E.
Mödl Johanna
Pertzel Tim
Prinz František
Raichle Florian
Sadric Martin
Schecher Maximilian
Simonis Stephan
Teutscher Dennis
Publication date: 17/05/2023

OpenLB is an object-oriented implementation of the lattice Boltzmann method (LBM). It is the first implementation of a generic platform for LBM programming, which is shared with the open source community (GPLv2). Since the first release in 2007, the code has been continuously improved and extended, as documented by thirteen releases and the corresponding release notes, which are available on the OpenLB website (https://www.openlb.net). The OpenLB code is written in C++ and is used by application programmers as well as developers, who can implement custom models. OpenLB supports complex data structures that allow simulations in complex geometries and parallel execution using MPI, OpenMP and CUDA on high-performance computers. The source code uses the concepts of interfaces and templates, so that efficient, direct and intuitive implementations of the LBM become possible. The efficiency and scalability have been checked and confirmed by code reviews. This user manual and source code documentation generated by Doxygen are available on the OpenLB project website.
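For readers unfamiliar with the LBM, the collide-and-stream cycle at its core can be sketched in a few lines. The following pure-Python example uses a one-dimensional, three-velocity lattice (D1Q3) with a BGK collision on a periodic domain. It is an illustrative sketch only and does not use or mirror the OpenLB C++ API; all names (`equilibrium`, `step`, `run`) and the parameter values are assumptions.

```python
# D1Q3 lattice: rest population plus one population moving right and one left.
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)  # lattice weights
C = (0, 1, -1)                         # discrete velocities


def equilibrium(rho, u):
    """Second-order equilibrium populations for density rho and velocity u."""
    return [w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
            for w, c in zip(W, C)]


def step(f, tau):
    """One BGK collision followed by periodic streaming."""
    n = len(f)
    post = []
    for cell in f:
        rho = sum(cell)                    # local density
        u = (cell[1] - cell[2]) / rho      # local velocity
        feq = equilibrium(rho, u)
        # Relax each population towards equilibrium with relaxation time tau.
        post.append([fi - (fi - fe) / tau for fi, fe in zip(cell, feq)])
    # Population i arriving at x left cell x - C[i] (periodic wrap-around).
    return [[post[(x - C[i]) % n][i] for i in range(3)] for x in range(n)]


def run(n=32, steps=50, tau=0.8):
    """Evolve a small density bump and return (initial mass, final mass)."""
    f = [equilibrium(1.0 + (0.1 if x == n // 2 else 0.0), 0.0)
         for x in range(n)]
    mass0 = sum(sum(cell) for cell in f)
    for _ in range(steps):
        f = step(f, tau)
    return mass0, sum(sum(cell) for cell in f)
```

Because both the collision and the periodic streaming conserve mass, the two values returned by `run()` should agree up to floating-point rounding, which is a convenient sanity check for any LBM implementation.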

arXiv.org e-Print Archive

Parallel Mesh Processing

Author: Derzapf Evgenij
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2012

Current research in computer graphics tries to meet users' growing demands and produces ever more realistic-looking images. Accordingly, the scenes and the methods used to render them are becoming increasingly complex. This development inevitably raises the required computing power, since the models that make up a scene can consist of billions of polygons and must be rendered in real time. Realistic image rendering rests on three pillars: models, materials, and lighting. Today there are several methods for efficient and realistic approximation of global illumination. Likewise, algorithms exist for creating realistic materials. Methods for rendering models in real time also exist, but they mostly work only for scenes of moderate complexity and fail on very complex scenes. Models form the foundation of a scene; optimizing them directly affects the efficiency of the methods for material rendering and lighting, so that only an optimized model representation makes real-time rendering possible. Many of the models used in computer graphics are represented by triangle meshes. The data volume they contain is enormous, as it must capture the richness of detail of the objects and satisfy growing demands for realism. Rendering complex models consisting of millions of triangles is a major challenge even for modern graphics cards. It is therefore necessary, especially for real-time simulations, to develop efficient algorithms. On the one hand, such algorithms should support visibility culling, level of detail (LOD), out-of-core memory management, and compression. On the other hand, this optimization must itself run very efficiently so that it does not further slow down rendering. This requires parallel methods that can process the enormous flood of data efficiently. The core contribution of this thesis is novel algorithms and data structures that were developed specifically for efficient parallel data processing and that can render, as well as model, very complex models and scenes in real time. These algorithms work in two phases: first, in an offline phase, the data structure is built and optimized for parallel processing. The optimized data structure is then used in the second phase for real-time rendering. A further contribution of this thesis is an algorithm that can procedurally generate a very realistic-looking planet and render it in real time.

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg