312 research outputs found

Extending a serial 3D two-phase CFD code to parallel execution over MPI by using the PETSc library for domain decomposition

Author: Ervik Åsmund
Munkejord Svend Tollak
Müller Bernhard
Publication date: 15/05/2014

To leverage the last two decades' transition in High-Performance Computing (HPC) towards clusters of compute nodes bound together with fast interconnects, a modern scalable CFD code must be able to efficiently distribute work amongst several nodes using the Message Passing Interface (MPI). MPI can enable very large simulations running on very large clusters, but it is necessary that the bulk of the CFD code be written with MPI in mind, which is an obstacle to parallelizing an existing serial code. In this work we present the results of extending an existing two-phase 3D Navier-Stokes solver, which was completely serial, to a parallel execution model using MPI. The 3D Navier-Stokes equations for two immiscible incompressible fluids are solved by the continuum surface force method, while the location of the interface is determined by the level-set method. We employ the Portable Extensible Toolkit for Scientific Computing (PETSc) for domain decomposition (DD) in a framework where only a fraction of the code needs to be altered. We study the strong and weak scaling of the resulting code. Cases are studied that are relevant to the fundamental understanding of oil/water separation in electrocoalescers. Comment: 8 pages, 6 figures, final version for the CFD 2014 conference

arXiv.org e-Print Archive

CiteSeerX

Detailed Simulation of the Cochlea: Recent Progress Using Large Shared Memory Parallel Computers

Author: Bunn J.
Givelberg E.
Rajan M.
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2001

We have developed and are refining a detailed three-dimensional computational model of the human cochlea. The model uses the immersed boundary method to calculate the fluid-structure interactions produced in response to incoming sound waves. An accurate cochlear geometry obtained from physical measurements is incorporated. The model includes a detailed and realistic description of the various elastic structures present. Initially, a macro-mechanical computational model was developed for execution on a CRAY T90 at the San Diego Supercomputing Center. This code was ported to the latest generation of shared memory high performance servers from Hewlett Packard. Using compiler generated threads and OpenMP directives, we have achieved a high degree of parallelism in the executable, which has made it possible to run several large scale numerical simulation experiments to study the interesting features of the cochlear system. In this paper, we outline the methods, algorithms and software tools that were used to implement and fine-tune the code, and discuss some of the simulation results.

Caltech Authors

Towards Efficient and Scalable Discontinuous Galerkin Methods for Unsteady Flows

Author
Publication venue: Università Politecnica delle Marche
Publication date: 05/02/2019

In recent years the increasing availability of High Performance Computing (HPC) resources has strongly promoted the spread of high-fidelity simulations, such as Large Eddy Simulation (LES), for industrial research and design. One of the most promising approaches to this kind of simulation is based on the discontinuous Galerkin (dG) discretization method. The contribution of the thesis to this research area is three-fold. First, the work introduces an efficient hybrid MPI/OpenMP parallelisation paradigm to exploit large HPC facilities. Second, it reports efficient, scalable and memory-saving solution strategies for stiff dG discretisations. Third, it compares those solution strategies, for the first time using the same numerical framework, to hybridizable discontinuous Galerkin (HDG) methods, including a novel implementation of a p-multigrid preconditioning approach, on unsteady flow problems involving the solution of the Navier-Stokes equations. The improvements in computational efficiency have been evaluated on cases of growing complexity involving large eddy simulations of turbulent flows. First, the Rayleigh-Benard convection problem and the turbulent channel flow at moderately high Reynolds numbers are presented. The proposed solution strategies were up to five times faster than standard matrix-based methods while allocating 7% of the memory. A second family of test cases involves the large eddy simulation of a rounded leading edge flat plate under different levels of free-stream turbulence. Despite the increased stiffness of the iteration matrix due to the use of curved and stretched elements, the solver was more than three times faster while allocating 15% of the memory compared with standard methods. Finally, the large eddy simulation of the Boeing Rudimentary Landing Gear at Re = 10^6 is reported. In all cases, remarkable agreement with experimental data and with previous numerical simulations is documented. Subject: Ingegneria Industriale. Author: Franciolini, Matte

IRIS Università Politecnica delle Marche

Parallelization of a DEM/CFD code for the numerical simulation of particle-laden turbulent flows

Author: Oliva Llena Asensio
Sheng Y.
Tan Y.Q.
Trias Miquel Francesc Xavier
Zhang Hao
Publication date: 01/01/2011

The interaction between a turbulent fluid flow and particle motion is investigated numerically. A complete direct numerical simulation (DNS) is carried out to solve the governing equations of the fluid phase. To investigate the behavior of inter-particle collisions and their effects on particle dispersion, the discrete element method (DEM) is employed to calculate the particle motion. The parallelization strategy of the DNS part is based on a domain decomposition method and uses a hybrid MPI+OpenMP approach. OpenMP alone is used for the parallelization of the DEM part: the particles to be tracked are distributed equally among processors. Finally, the method is tested for a turbulent flow through a square duct. Peer Reviewed. Postprint (author’s final draft)

UPCommons. Portal del coneixement obert de la UPC

CUDA Implementation of a Navier-Stokes Solver on Multi-GPU Desktop Platforms for Incompressible Flows

Author: Bleiweiss A.
Buck I.
Fan Z.
Hennessy J. L.
Li W.
Liu Y.
Molemaker J.
Ryoo S.
Schatz M. C.
Stratton J. A.
The MPI
Ufimtsev I.
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2009

Graphics processing units (GPUs), traditionally designed for graphics rendering, have emerged as massively parallel co-processors to the central processing unit (CPU). Small-footprint desktop supercomputers with hundreds of cores that can deliver teraflops peak performance at the price of conventional workstations have been realized. A computational fluid dynamics (CFD) simulation capability with rapid computational turnaround time has the potential to transform engineering analysis and design optimization procedures. We describe the implementation of a Navier-Stokes solver for incompressible fluid flow using desktop platforms equipped with multiple GPUs. Specifically, NVIDIA’s Compute Unified Device Architecture (CUDA) programming model is used to implement the discretized form of the governing equations. The projection algorithm to solve the incompressible fluid flow equations is divided into distinct CUDA kernels, and a unique implementation that exploits the memory hierarchy of the CUDA programming model is suggested. Using a quad-GPU platform, we observe two orders of magnitude speedup relative to a serial CPU implementation. Our results demonstrate that multi-GPU desktops can serve as a cost-effective small-footprint parallel computing platform to accelerate CFD simulations substantially.

Crossref

Boise State University - ScholarWorks

Multi-Level Parallelism for Incompressible Flow Computations on GPU Clusters

Author: Jacobsen Dana A.
Senocak Inanc
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2013

We investigate multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA parallel implementations, in which all computations are done on the GPU using CUDA. We explore efficiency and scalability of incompressible flow computations using up to 256 GPUs on a problem with approximately 17.2 billion cells. Our work addresses some of the unique issues faced when merging fine-grain parallelism on the GPU using CUDA with coarse-grain parallelism that uses either MPI or MPI-OpenMP for communications. We present three different strategies to overlap computations with communications, and systematically assess their impact on parallel performance on two different GPU clusters. Our results for strong and weak scaling analysis of incompressible flow computations demonstrate that GPU clusters offer significant benefits for large data sets, and a dual-level MPI-CUDA implementation with maximum overlapping of computation and communication provides substantial benefits in performance. We also find that our tri-level MPI-OpenMP-CUDA parallel implementation does not offer a significant advantage in performance over the dual-level implementation on GPU clusters with two GPUs per node. On clusters with higher GPU counts per node, or with different domain decomposition strategies, a tri-level implementation may exhibit higher efficiency than a dual-level implementation and needs to be investigated further.

Boise State University - ScholarWorks

A Comprehensive Three-Dimensional Model of the Cochlea

Author: Allaire
Allen
Allen
Beyer
Bogert
Chadwick
Clark
de Boer
Edward Givelberg
Fletcher
Geisler
Gold
Holmes
Inselberg
Johnstone
Julian Bunn
Kemp
Khanna
Kim
Kohllöffel
Kolston
Kolston
Lesser
Leveque
Loh
Manoussaki
Nuttall
Parthasarati
Peskin
Peskin
Peskin
Peterson
Ramamoorthy
Ranke
Rhode
Robles
Schroeder
Sellick
Siebert
Steele
Steele
Steele
Viergever
von Békésy
Zweig
Zwislocki
Zwislocki
Publication venue: 'Elsevier BV'
Publication date: 01/01/2003

The human cochlea is a remarkable device, able to discern extremely small amplitude sound pressure waves, and discriminate between very close frequencies. Simulation of the cochlea is computationally challenging due to its complex geometry, intricate construction and small physical size. We have developed, and are continuing to refine, a detailed three-dimensional computational model based on an accurate cochlear geometry obtained from physical measurements. In the model, the immersed boundary method is used to calculate the fluid-structure interactions produced in response to incoming sound waves. The model includes a detailed and realistic description of the various elastic structures present. In this paper, we describe the computational model and its performance on the latest generation of shared memory servers from Hewlett Packard. Using compiler generated threads and OpenMP directives, we have achieved a high degree of parallelism in the executable, which has made possible several large scale numerical simulation experiments that study the interesting features of the cochlear system. We show several results from these simulations, reproducing some of the basic known characteristics of cochlear mechanics. Comment: 22 pages, 5 figures

arXiv.org e-Print Archive

Crossref

Caltech Authors

Parallel Lagrangian particle transport : application to respiratory system airways

Author: Olivares Mañas Edgar
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2018

This thesis focuses on particle transport in the context of high-performance computing (HPC) in its widest range, from the numerical modeling to the physics involved, including parallelization and post-processing. The main goal is to obtain a general framework that enables understanding all the requirements and characteristics of particle transport using the Lagrangian frame of reference. Although the idea is to provide a suitable model for any engineering application that involves particle transport simulation, this thesis uses the respiratory system as its reference framework. This means that all the simulations are focused on this topic, including the benchmarks for testing, verifying and optimizing the results. Other applications, such as combustion, ocean residuals, or automotive, have also been simulated by other researchers using the same numerical model proposed here. However, they have not been included, so that the project can advance in a specific direction and so that the structure of this work is easier to follow. Human airways and respiratory system simulations are of special interest for medical purposes. Human airways can differ significantly between individuals, which complicates the study of drug delivery efficiency, deposition of polluted particles, etc., using classic in-vivo or in-vitro techniques. In other words, flow and deposition results may vary depending on the geometry of the patient, and simulations allow customized studies using specific geometries. With the help of new computational techniques, in the near future it may be possible to optimize nasal drug delivery, surgery or other medical studies for each individual patient through more personalized medicine. In summary, this thesis prioritizes numerical modeling, wide usability, performance, parallelization, and the study of the physics that affects particle transport. In addition, the simulation of the respiratory system should yield interesting biological and medical results. However, these results will be interpreted only from a purely numerical point of view.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa