198 research outputs found
Parallel Scalability of Adaptive Mesh Refinement in a Finite Difference Solution to the Shallow Water Equations
The Shallow Water Equations model the fluid dynamics of deep ocean flow, and are used to simulate tides, tsunamis, and storm surges. Numerical solutions using finite difference methods are computationally expensive enough to mandate the use of large computing clusters, and the cost grows not only with the amount of fluid, but also with the duration of the simulated event and the resolution of the approximation. The benefits of increased resolution are mostly connected to regions where complex fluid interactions occur, and are not required globally for the entire simulation. In this paper, we investigate the potential for conserving computational resources by applying Adaptive Mesh Refinement to dynamically determined areas of the fluid surface. We implement adaptive mesh refinement in a MacCormack finite difference solver, develop a performance model to predict its behavior on large-scale parallel platforms, and validate its predictions experimentally on two computing clusters. We find that the solver itself has highly favorable parallel scalability, and that the addition of refined areas introduces a performance penalty due to load imbalance that is at most proportional to the refinement degree raised to the third power.
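As a rough illustration of the MacCormack scheme named in this abstract, the following sketch applies a predictor step (forward flux difference) and a corrector step (backward flux difference) to the 1D shallow water equations on a periodic, uniform grid. It is not the paper's solver: the 1D setting, the periodic boundaries, and the names `flux` and `maccormack_step` are assumptions made for this example, and no mesh refinement or parallelism is included.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def flux(h, hu):
    """Flux of the 1D shallow water equations for depth h and momentum hu."""
    u = hu / h
    return hu, hu * u + 0.5 * G * h * h


def maccormack_step(h, hu, dt, dx):
    """Advance (h, hu) by one MacCormack step on a periodic 1D grid."""
    n = len(h)
    lam = dt / dx
    f = [flux(h[i], hu[i]) for i in range(n)]

    # Predictor: forward difference of the flux at the current state.
    hp = [h[i] - lam * (f[(i + 1) % n][0] - f[i][0]) for i in range(n)]
    hup = [hu[i] - lam * (f[(i + 1) % n][1] - f[i][1]) for i in range(n)]
    fp = [flux(hp[i], hup[i]) for i in range(n)]

    # Corrector: backward difference of the predicted flux, averaged with
    # the old state. Index i - 1 wraps to the last cell when i == 0.
    h_new = [0.5 * (h[i] + hp[i] - lam * (fp[i][0] - fp[i - 1][0]))
             for i in range(n)]
    hu_new = [0.5 * (hu[i] + hup[i] - lam * (fp[i][1] - fp[i - 1][1]))
              for i in range(n)]
    return h_new, hu_new
```

Because both steps are written in conservative (flux-difference) form, the total water volume on the periodic grid is preserved up to rounding, which gives a simple sanity check for any time step that satisfies the CFL condition.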
REBOUND: An open-source multi-purpose N-body code for collisional dynamics
REBOUND is a new multi-purpose N-body code which is freely available under an
open-source license. It was designed for collisional dynamics such as planetary
rings but can also solve the classical N-body problem. It is highly modular and
can be customized easily to work on a wide variety of different problems in
astrophysics and beyond.
REBOUND comes with three symplectic integrators: leap-frog, the symplectic
epicycle integrator (SEI) and a Wisdom-Holman mapping (WH). It supports open,
periodic and shearing-sheet boundary conditions. REBOUND can use a Barnes-Hut
tree to calculate both self-gravity and collisions. These modules are fully
parallelized with MPI as well as OpenMP. The former makes use of a static
domain decomposition and a distributed essential tree. Two new collision
detection modules based on a plane-sweep algorithm are also implemented. The
performance of the plane-sweep algorithm is superior to a tree code for
simulations in which one dimension is much longer than the other two and in
simulations which are quasi-two-dimensional with fewer than one million
particles.
In this work, we discuss the different algorithms implemented in REBOUND, the
philosophy behind the code's structure as well as implementation-specific
details of the different modules. We present results of accuracy and scaling
tests which show that the code can run efficiently on both desktop machines and
large computing clusters.
Comment: 10 pages, 9 figures, accepted by A&A, source code available at
https://github.com/hannorein/reboun
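To make the leap-frog integrator mentioned in this abstract concrete, here is a minimal sketch of a generic drift-kick-drift leap-frog step with direct-summation gravity in two dimensions. It is not taken from the REBOUND source and does not use the REBOUND API: the function names, the 2D restriction, the units with G = 1, and the O(N^2) force loop (instead of a Barnes-Hut tree) are all assumptions made for this example.

```python
import math


def accelerations(pos, masses, G=1.0):
    """Direct-sum gravitational accelerations, O(N^2), in two dimensions."""
    n = len(pos)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += G * masses[j] * dx / r3
            acc[i][1] += G * masses[j] * dy / r3
    return acc


def leapfrog_step(pos, vel, masses, dt):
    """One drift-kick-drift leap-frog step; pos and vel are updated in place."""
    for p, v in zip(pos, vel):            # half drift
        p[0] += 0.5 * dt * v[0]
        p[1] += 0.5 * dt * v[1]
    acc = accelerations(pos, masses)
    for v, a in zip(vel, acc):            # full kick
        v[0] += dt * a[0]
        v[1] += dt * a[1]
    for p, v in zip(pos, vel):            # half drift
        p[0] += 0.5 * dt * v[0]
        p[1] += 0.5 * dt * v[1]


def total_energy(pos, vel, masses, G=1.0):
    """Kinetic plus pairwise potential energy, used to monitor the integrator."""
    kinetic = sum(0.5 * m * (v[0] ** 2 + v[1] ** 2)
                  for m, v in zip(masses, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r = math.hypot(pos[j][0] - pos[i][0], pos[j][1] - pos[i][1])
            potential -= G * masses[i] * masses[j] / r
    return kinetic + potential
```

Since the scheme is symplectic, the energy error of a bound orbit stays bounded over long runs instead of drifting, so comparing `total_energy` before and after many steps is a convenient check.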
High-quality dense stereo vision for whole body imaging and obesity assessment
The prevalence of obesity has necessitated the development of safe and convenient tools for the timely assessment and monitoring of this condition across a broad population. Three-dimensional (3D) body imaging has become a new means of obesity assessment. Moreover, it generates body shape information that is meaningful for fitness, ergonomics, and personalized clothing. In previous work in our lab, we developed a prototype active stereo vision system that demonstrated the potential to fulfill this goal. However, the prototype required four computer projectors to cast artificial textures on the body to facilitate stereo matching on texture-deficient images (e.g., skin). This reduces the mobility of the system when it is used to collect data from a large population. In addition, the resolution of the generated 3D images is limited by both the cameras and the projectors available during the project. The study reported in this dissertation highlights our continued effort to improve the capability of 3D body imaging through simplified hardware for passive stereo and advanced computation techniques.
The system utilizes high-resolution single-lens reflex (SLR) cameras, which have recently become widely available, and is configured in a two-stance design to image the front and back surfaces of a person. A total of eight cameras are used to form four pairs of stereo units. Each unit covers a quarter of the body surface. The stereo units are individually calibrated with a specific pattern to determine the cameras' intrinsic and extrinsic parameters for stereo matching. The global orientation and position of each stereo unit within a common world coordinate system is calculated through a 3D registration step. The stereo calibration and 3D registration procedures do not need to be repeated for a deployed system if the cameras' relative positions have not changed. This property contributes to the portability of the system and greatly eases maintenance. The image acquisition time is around two seconds for a whole-body capture. The system works in an indoor environment with moderate ambient light.
Advanced stereo computation algorithms are developed by taking advantage of high-resolution images and by tackling the ambiguity problem in stereo matching. A multi-scale, coarse-to-fine matching framework is proposed to match large-scale textures at a low resolution and refine the matched results over higher resolutions. This matching strategy reduces the complexity of the computation and avoids ambiguous matching at the native resolution. The pixel-to-pixel stereo matching algorithm follows a classic, four-step strategy which consists of matching cost computation, cost aggregation, disparity computation and disparity refinement.
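The four-step strategy listed above (matching cost, cost aggregation, disparity computation, disparity refinement) can be sketched at a single resolution as follows. This is a toy illustration and not the dissertation's algorithm: the absolute-difference cost, the square aggregation window, the winner-take-all selection, the 3-tap median refinement, and the name `disparity_map` are all assumptions chosen for brevity, and the coarse-to-fine multi-scale framework is omitted.

```python
def disparity_map(left, right, max_disp, radius=1):
    """Toy disparity map for rectified grayscale images given as lists of rows."""
    rows, cols = len(left), len(left[0])
    inf = float("inf")
    disps = range(max_disp + 1)

    # Step 1: matching cost, the absolute intensity difference for each
    # candidate disparity d (infinite where the match falls off the image).
    cost = [[[abs(left[y][x] - right[y][x - d]) if x >= d else inf
              for d in disps]
             for x in range(cols)]
            for y in range(rows)]

    # Step 2: cost aggregation, a sum over a square window clipped at borders.
    def aggregated(y, x, d):
        return sum(cost[yy][xx][d]
                   for yy in range(max(0, y - radius), min(rows, y + radius + 1))
                   for xx in range(max(0, x - radius), min(cols, x + radius + 1)))

    # Step 3: disparity computation, winner-take-all over aggregated costs.
    raw = [[min(disps, key=lambda d: aggregated(y, x, d)) for x in range(cols)]
           for y in range(rows)]

    # Step 4: disparity refinement, a 3-tap horizontal median filter
    # (the first and last columns are left unchanged).
    refined = [row[:] for row in raw]
    for y in range(rows):
        for x in range(1, cols - 1):
            refined[y][x] = sorted(raw[y][x - 1:x + 2])[1]
    return refined
```

The sketch assumes the convention that a pixel at column x in the left image appears at column x - d in the right image. In a coarse-to-fine scheme, a routine like this would typically be run on downsampled images first, with the result used to narrow the disparity search range at the next finer level.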
The system performance has been evaluated on mannequins and human subjects in comparison with other measurement methods. It was found that the geometrical measurements from reconstructed 3D body models, including body circumferences and whole volume, are highly repeatable and consistent with manual and other instrumental measurements (CV 0.99). The agreement of percent body fat (%BF) estimation on human subjects between stereo and dual-energy X-ray absorptiometry (DEXA) was found to be improved over the previous active stereo system, and the limits of agreement with 95% confidence were reduced by half. The limits of agreement we achieved for %BF estimation are among the lowest reported in comparable studies with commercial air displacement plethysmography (ADP) and DEXA. In practice, %BF estimation through a two-component model is sensitive to body volume measurement, and the estimation of lung volume could be a source of variation. Protocols for this type of measurement should still be created with an awareness of this factor.
Biomedical Engineerin
On a general implementation of h- and p-adaptive curl-conforming finite elements
Edge (or N\'ed\'elec) finite elements are theoretically sound and widely used
by the computational electromagnetics community. However, their implementation,
especially for high-order methods, is not trivial, since it involves many
technicalities that are not properly described in the literature. To fill this
gap, we provide a comprehensive description of a general implementation of edge
elements of the first kind within the scientific software project FEMPAR. We cover
in detail how to implement arbitrary order (i.e., p-adaptive) elements on
hexahedral and tetrahedral meshes. First, we set the three classical
ingredients of the finite element definition by Ciarlet, both in the reference
and the physical space: cell topologies, polynomial spaces and moments. With
these ingredients, shape functions are automatically implemented by defining a
judiciously chosen polynomial pre-basis that spans the local finite element
space combined with a change of basis to automatically obtain a canonical basis
with respect to the moments at hand. Next, we discuss global finite element
spaces putting emphasis on the construction of global shape functions through
oriented meshes, appropriate geometrical mappings, and equivalence classes of
moments, in order to preserve the inter-element continuity of tangential
components of the magnetic field. Finally, we extend the proposed methodology
to generate global curl-conforming spaces on non-conforming hierarchically
refined (i.e., h-adaptive) meshes with arbitrary order finite elements.
Numerical results include experimental convergence rates to test the proposed
implementation.
Parallel Lagrangian particle transport: application to respiratory system airways
This thesis is focused on particle transport in the context of high-performance computing (HPC) in its widest range, from the numerical modeling to the physics involved, including its parallelization and post-processing. The main goal is to obtain a general framework that enables understanding all the requirements and characteristics of particle transport using the Lagrangian frame of reference.
Although the idea is to provide a suitable model for any engineering application that involves particle transport simulation, this thesis uses the respiratory system framework. This means that all the simulations are focused on this topic, including the benchmarks for testing, verifying and optimizing the results. Other applications, such as combustion, ocean residuals, or automotive, have also been simulated by other researchers using the same numerical model proposed here. However, they have not been included here in the interest of allowing the project to advance in a specific direction, and facilitate the structure and comprehension of this work.
Human airways and respiratory system simulations are of special interest for medical purposes. Indeed, human airways can differ significantly between individuals. This complicates the study of drug delivery efficiency, deposition of polluting particles, etc., using classic in-vivo or in-vitro techniques. In other words, flow and deposition results may vary depending on the geometry of the patient, and simulations allow customized studies using specific geometries. With the help of new computational techniques, in the near future it may be possible to optimize nasal drug delivery, surgery, or other medical studies for each individual patient through more personalized medicine.
In summary, this thesis prioritizes numerical modeling, wide usability, performance, parallelization, and the study of the physics that affects particle transport. In addition, the simulation of the respiratory system should yield interesting biological and medical results. However, these results will be interpreted only from a purely numerical point of view.
This thesis focuses on particle transport in the context of high-performance computing (HPC) in its widest range, from the numerical model to the physics involved, including the parallelization of the code and post-processing. The main objective is to obtain a general framework for understanding both the requirements and the characteristics of particle transport using the Lagrangian frame of reference. Although the idea is to define a model capable of simulating any engineering application that involves particle transport, this thesis uses the respiratory system as its reference topic. This means that all the simulations are framed within this field of study, including the benchmark tests, verifications, and optimizations of results. Other applications, such as combustion, ocean residuals, automotive, or aeronautics, have also been studied by other researchers using the same numerical model proposed here. However, those results have not been included in this thesis, in order to simplify it and advance in a single direction, thus facilitating the structure and comprehension of this work. The human respiratory system and its simulations are of special interest for medical purposes. In particular, the geometry of the airways can vary considerably from person to person. This complicates the study of aspects such as drug delivery or the deposition of polluting particles, for example, using classic laboratory techniques (in-vivo or in-vitro). In other words, both flow and deposition can change depending on the geometry of the patient, and this is where simulations allow studies adapted to specific geometries. Thanks to new computational techniques, in the near future we will probably be able to optimize nasal drug delivery, surgery, or other medical studies for each patient through more personalized medicine. In summary, this thesis prioritizes the numerical model, breadth of use, performance, parallelization, and the study of the physics that directly affects the particles. In addition, basing our simulations on the respiratory system gives this thesis biological and medical interest as regards the results.
An extreme-scale implicit solver for complex PDEs: highly heterogeneous flow in earth's mantle
Mantle convection is the fundamental physical process within earth's interior responsible for the thermal and geological evolution of the planet, including plate tectonics. The mantle is modeled as a viscous, incompressible, non-Newtonian fluid. The wide range of spatial scales, extreme variability and anisotropy in material properties, and severely nonlinear rheology have made global mantle convection modeling with realistic parameters prohibitive. Here we present a new implicit solver that exhibits optimal algorithmic performance and is capable of extreme scaling for hard PDE problems, such as mantle convection. To maximize accuracy and minimize runtime, the solver incorporates a number of advances, including aggressive multi-octree adaptivity, mixed continuous-discontinuous discretization, arbitrarily-high-order accuracy, hybrid spectral/geometric/algebraic multigrid, and novel Schur-complement preconditioning. These features present enormous challenges for extreme scalability. We demonstrate that, contrary to conventional wisdom, algorithmically optimal implicit solvers can be designed that scale out to 1.5 million cores for severely nonlinear, ill-conditioned, heterogeneous, and anisotropic PDEs.
Towards Efficient and Scalable Discontinuous Galerkin Methods for Unsteady Flows
In recent years, the growing availability of computational resources has contributed to the spread of computational fluid dynamics in research and industrial design. One of the most promising approaches is based on the discontinuous Galerkin (dG) finite element method.
Within this family of methods, the contribution of the thesis is threefold. First, the work introduces a hybrid MPI/OpenMP parallelization algorithm for the efficient use of supercomputing resources. Second, it proposes efficient, scalable solution strategies with limited memory allocation for solving complex problems. Finally, it compares the solution strategies introduced with new discretization techniques known as "hybridizable", on problems involving the solution of the unsteady Navier-Stokes equations.
Computational efficiency was evaluated on cases of increasing complexity involving the simulation of turbulence. First, Rayleigh-Benard natural convection and turbulent channel flow at moderately high Reynolds numbers were considered. The proposed solution strategies proved up to five times faster than standard methods while allocating only 7% of the memory. Second, the flow around a flat plate with a rounded leading edge subjected to different levels of inlet turbulence was analyzed. Despite the greater complexity due to the use of curved and anisotropic elements, the proposed algorithm proved more than three times faster while allocating 15% of the memory compared with a standard method. Finally, the simulation of the "Boeing Rudimentary Landing Gear" at Re = 10^6 is reported. In all cases the results obtained are in excellent agreement with experimental data and with previous numerical simulations published in the literature.
In recent years the increasing availability of High Performance Computing (HPC) resources has strongly promoted the spread of high-fidelity simulations, such as Large Eddy Simulation (LES), in industrial research and design. One of the most promising approaches to this kind of simulation is based on the discontinuous Galerkin (dG) discretization method.
The contribution of the thesis towards this research area is three-fold. First, the work introduces an efficient hybrid MPI/OpenMP parallelisation paradigm to exploit large HPC facilities effectively. Second, it reports efficient, scalable, and memory-saving solution strategies for stiff dG discretisations. Third, it compares those solution strategies, for the first time using the same numerical framework, to hybridizable discontinuous Galerkin (HDG) methods, including a novel implementation of a p-multigrid preconditioning approach, on unsteady flow problems involving the solution of the Navier-Stokes equations.
The improvements in computational efficiency have been evaluated on cases of growing complexity involving large eddy simulations of turbulent flows. First, the Rayleigh-Benard convection problem and the turbulent channel flow at moderately high Reynolds numbers are presented. The proposed solution strategies were up to five times faster than standard matrix-based methods while allocating 7% of the memory. A second family of test cases involves the large eddy simulation of a rounded leading edge flat plate under different levels of free-stream turbulence. Despite the increased stiffness of the iteration matrix due to the use of curved and stretched elements, the solver was more than three times faster while allocating 15% of the memory compared with standard methods. Finally, the large eddy simulation of the Boeing Rudimentary Landing Gear at Re = 10^6 is reported. In all cases, a remarkable agreement with experimental data as well as previous numerical simulations is documented.
Industrial Engineering. Franciolini, Matte