117 research outputs found

    Parallel algorithms for computational fluid dynamics on unstructured meshes

    Get PDF
    La simulació numèrica directa (DNS) de fluxos complexes és actualment una utopia per la majoria d'aplicacions industrials ja que els requeriments computacionals son massa elevats. Donat un flux, la diferència entre els recursos computacionals necessaris i els disponibles és cobreix mitjançant la modelització/simplificació d'alguns termes de les equacions originals que regeixen el seu comportament. El creixement continuat dels recursos computacionals disponibles, principalment en forma de super-ordinadors, contribueix a reduir la part del flux que és necessari aproximar. De totes maneres, obtenir la eficiència esperada dels nous super-ordinadors no és una tasca senzilla i, per aquest motiu, part de la recerca en el camp de la Mecànica de Fluids Computacional es centra en aquest objectiu. En aquest sentit, algunes contribucions s'han presentat en el marc d'aquesta tesis. El primer objectiu va ser el desenvolupament d'un codi de CFD de propòsit general i paral·lel, basat en la metodologia de volums finits en malles no estructurades, per resoldre problemes de multi-física. Aquest codi, anomenat TermoFluids (TF), té un disseny orientat a objectes i pensat per ser usat de forma altament eficient en els super-ordinadors actuals. Amb el temps, ha esdevingut pel grup una eina fonamental en projectes tant de recerca bàsica com d'interès industrial. En el context d'aquesta tesis, el treball s'ha focalitzat en el desenvolupament de dos de les llibreries més bàsiques de TermoFluids: i) La Basics Objects Library (BOL), que es una plataforma de software sobre la qual estan programades la resta de llibreries del codi, i que conté els mètodes algebraics i geomètrics fonamentals per la implementació paral·lela dels algoritmes de discretització, ii) la Linear Solvers Library (LSL), que conté un gran nombre de mètodes per resoldre els sistemes d'equacions lineals derivats de les discretitzacions. El primer capítol d'aquesta tesi conté les principals idees subjacents al disseny i la implementació de la BOL i la LSL, juntament amb alguns exemples i algunes aplicacions industrials. En els capítols posteriors hi ha una explicació detallada de solvers específics per algunes aplicacions concretes. En el segon capítol, es presenta un solver paral·lel i directe per la resolució de l'equació de Poisson per casos en els quals una de les direccions del domini té condicions d'homogeneïtat. En la simulació de fluxos incompressibles, l'equació de Poisson es resol almenys una vegada en cada pas de temps, convertint-se en una de les parts més costoses i difícils de paral·lelitzar del codi. El mètode que proposem és una combinació d'una descomposició directa de Schur (DDS) i una diagonalització de Fourier. La darrera descompon el sistema original en un conjunt de sub-sistemes 2D independents que es resolen mitjançant l'algorisme DDS. Atès que no s'imposen restriccions a les direccions no periòdiques del domini, aquest mètode és aplicable a la resolució de problemes discretitzats mitjançat l'extrusió de malles 2D no estructurades. L'escalabilitat d'aquest mètode ha estat provada amb èxit amb un màxim de 8192 CPU per malles de fins a ~10⁹ volums de control. En el darrer capitol capítol, es presenta un mètode de resolució per l'equació de Transport de Boltzmann (BTE). La estratègia emprada es basa en el mètode d'Ordenades Discretes i pot ser aplicat en discretitzacions no estructurades. El flux per a cada ordenada angular es resol amb un mètode de substitució equivalent a la resolució d'un sistema lineal triangular. La naturalesa seqüencial d'aquest procés fa de la paral·lelització de l'algoritme el principal repte. Diversos algorismes de substitució han estat analitzats, esdevenint una de les heurístiques proposades la millor opció en totes les situacions analitzades, amb excel·lents resultats. Els testos d'eficiència paral·lela s'han realitzat usant fins a 2560 CPU.Direct Numerical Simulation (DNS) of complex flows is currently an utopia for most of industrial applications because computational requirements are too high. For a given flow, the gap between the required and the available computing resources is covered by modeling/simplifying of some terms of the original equations. On the other hand, the continuous growth of the computing power of modern supercomputers contributes to reduce this gap, reducing hence the unresolved physics that need to be attempted with approximated models. This growth, widely relies on parallel computing technologies. However, getting the expected performance from new complex computing systems is becoming more and more difficult, and therefore part of the CFD research is focused on this goal. Regarding to it, some contributions are presented in this thesis. The first objective was to contribute to the development of a general purpose multi-physics CFD code. referred to as TermoFluids (TF). TF is programmed following the object oriented paradigm and designed to run in modern parallel computing systems. It is also intensively involved in many different projects ranging from basic research to industry applications. Besides, one of the strengths of TF is its good parallel performance demonstrated in several supercomputers. In the context of this thesis, the work was focused on the development of two of the most basic libraries that compose TF: I) the Basic Objects Library (BOL), which is a parallel unstructured CFD application programming interface, on the top of which the rest of libraries that compose TF are written, ii) the Linear Solvers Library (LSL) containing many different algorithms to solve the linear systems arising from the discretization of the equations. The first chapter of this thesis contains the main ideas underlying the design and the implementation of the BOL and LSL libraries, together with some examples and some industrial applications. A detailed description of some application-specific linear solvers included in the LSL is carried out in the following chapters. In the second chapter, a parallel direct Poisson solver restricted to problems with one uniform periodic direction is presented. The Poisson equation is solved, at least, once per time-step when modeling incompressible flows, becoming one of the most time consuming and difficult to parallelize parts of the code. The solver here proposed is a combination of a direct Schur-complement based decomposition (DSD) and a Fourier diagonalization. The latter decomposes the original system into a set of mutually independent 2D sub-systems which are solved by means of the DSD algorithm. Since no restrictions are imposed in the non-periodic directions, the overall algorithm is well-suited for solving problems discretized on extruded 2D unstructured meshes. The scalability of the solver has been successfully tested using up to 8192 CPU cores for meshes with up to 10 9 grid points. In the last chapter, a solver for the Boltzmann Transport Equation (BTE) is presented. It can be used to solve radiation phenomena interacting with flows. The solver is based on the Discrete Ordinates Method and can be applied to unstructured discretizations. The flux for each angular ordinate is swept across the computational grid, within a source iteration loop that accounts for the coupling between the different ordinates. The sequential nature of the sweep process makes the parallelization of the overall algorithm the most challenging aspect. Several parallel sweep algorithms, which represent different options of interleaving communications and calculations, are analyzed. One of the heuristics proposed consistently stands out as the best option in all the situations analyzed. With this algorithm, good scalability results have been achieved regarding both weak and strong speedup tests with up to 2560 CPUs

    Das unstetige Galerkinverfahren für Strömungen mit freier Oberfläche und im Grundwasserbereich in geophysikalischen Anwendungen

    Get PDF
    Free surface flows and subsurface flows appear in a broad range of geophysical applications and in many environmental settings situations arise which even require the coupling of free surface and subsurface flows. Many of these application scenarios are characterized by large domain sizes and long simulation times. Hence, they need considerable amounts of computational work to achieve accurate solutions and the use of efficient algorithms and high performance computing resources to obtain results within a reasonable time frame is mandatory. Discontinuous Galerkin methods are a class of numerical methods for solving differential equations that share characteristics with methods from the finite volume and finite element frameworks. They feature high approximation orders, offer a large degree of flexibility, and are well-suited for parallel computing. This thesis consists of eight articles and an extended summary that describe the application of discontinuous Galerkin methods to mathematical models including free surface and subsurface flow scenarios with a strong focus on computational aspects. It covers discretization and implementation aspects, the parallelization of the method, and discrete stability analysis of the coupled model.Für viele geophysikalische Anwendungen spielen Strömungen mit freier Oberfläche und im Grundwasserbereich oder sogar die Kopplung dieser beiden eine zentrale Rolle. Oftmals charakteristisch für diese Anwendungsszenarien sind große Rechengebiete und lange Simulationszeiten. Folglich ist das Berechnen akkurater Lösungen mit beträchtlichem Rechenaufwand verbunden und der Einsatz effizienter Lösungsverfahren sowie von Techniken des Hochleistungsrechnens obligatorisch, um Ergebnisse innerhalb eines annehmbaren Zeitrahmens zu erhalten. Unstetige Galerkinverfahren stellen eine Gruppe numerischer Verfahren zum Lösen von Differentialgleichungen dar, und kombinieren Eigenschaften von Methoden der Finiten Volumen- und Finiten Elementeverfahren. Sie ermöglichen hohe Approximationsordnungen, bieten einen hohen Grad an Flexibilität und sind für paralleles Rechnen gut geeignet. Diese Dissertation besteht aus acht Artikeln und einer erweiterten Zusammenfassung, in diesen die Anwendung unstetiger Galerkinverfahren auf mathematische Modelle inklusive solcher für Strömungen mit freier Oberfläche und im Grundwasserbereich beschrieben wird. Die behandelten Themen umfassen Diskretisierungs- und Implementierungsaspekte, die Parallelisierung der Methode sowie eine diskrete Stabilitätsanalyse des gekoppelten Modells

    HPCCP/CAS Workshop Proceedings 1998

    Get PDF
    This publication is a collection of extended abstracts of presentations given at the HPCCP/CAS (High Performance Computing and Communications Program/Computational Aerosciences Project) Workshop held on August 24-26, 1998, at NASA Ames Research Center, Moffett Field, California. The objective of the Workshop was to bring together the aerospace high performance computing community, consisting of airframe and propulsion companies, independent software vendors, university researchers, and government scientists and engineers. The Workshop was sponsored by the HPCCP Office at NASA Ames Research Center. The Workshop consisted of over 40 presentations, including an overview of NASA's High Performance Computing and Communications Program and the Computational Aerosciences Project; ten sessions of papers representative of the high performance computing research conducted within the Program by the aerospace industry, academia, NASA, and other government laboratories; two panel sessions; and a special presentation by Mr. James Bailey

    A New Spherical Harmonics Scheme for Multi-Dimensional Radiation Transport I: Static Matter Configurations

    Get PDF
    Recent work by McClarren & Hauck [29] suggests that the filtered spherical harmonics method represents an efficient, robust, and accurate method for radiation transport, at least in the two-dimensional (2D) case. We extend their work to the three-dimensional (3D) case and find that all of the advantages of the filtering approach identified in 2D are present also in the 3D case. We reformulate the filter operation in a way that is independent of the timestep and of the spatial discretization. We also explore different second- and fourth-order filters and find that the second-order ones yield significantly better results. Overall, our findings suggest that the filtered spherical harmonics approach represents a very promising method for 3D radiation transport calculations.Comment: 29 pages, 13 figures. Version matching the one in Journal of Computational Physic

    Towards Efficient and Scalable Discontinuous Galerkin Methods for Unsteady Flows

    Get PDF
    openNegli ultimi anni, la crescente disponibilit`a di risorse computazionali ha contribuito alla diffusione della fluidodinamica computazionale per la ricerca e per la progettazione industriale. Uno degli approcci pi promettenti si basa sul metodo agli elementi finiti discontinui di Galerkin (dG). Nell’ambito di queste metodologie, il contributo della tesi e' triplice. Innanzi- tutto, il lavoro introduce un algoritmo di parallelizzazione ibrida MPI/OpenMP per l’utilizzo efficiente di risorse di super calcolo. In secondo luogo, propone strategie di soluzione efficienti, scalabili e con limitata allocazione di memoria per la soluzione di problemi complessi. Infine, confronta le strategie di soluzione introdotte con nuove tecniche di discretizzazione dette “ibridizzabili”, su problemi riguardanti la soluzione delle equazioni di Navier–Stokes non stazionarie. L’efficienza computazionale e' stata valutata su casi di crescente complessita' riguardanti la simulazione della turbolenza. In primo luogo, e' stata considerata la convezione naturale di Rayleigh-Benard e il flusso turbolento in un canale a numeri di Reynolds moderatamente alti. Le strategie di soluzione proposte sono risultate fino a cinque volte piu` veloci rispetto ai metodi standard allocando solamente il 7% della memoria. In secondo luogo, e' stato analizzato il flusso attorno ad una piastra piana con bordo arrotondato sottoposta a diversi livelli di turbolenza in ingresso. Nonostante la maggiore complessità' dovuta all’uso di elementi curvi ed anisotropi, l’algoritmo proposto e' risultato oltre tre volte piu` veloce allocando il 15% della memoria rispetto ad un metodo standard. Concludendo, viene riportata la simulazione del “Boeing Rudimentary Landing Gear” a Re = 10^6. In tutti i casi i risultati ottenuti sono in ottimo accordo con i dati sperimentali e con precedenti simulazioni numeriche pubblicate in letteratura.In recent years the increasing availability of High Performance Computing (HPC) resources strongly promoted the widespread of high fidelity simulations, such as the Large Eddy Simulation (LES), for industrial research and design. One of the most promising approaches to those kind of simulations is based on the discontinuous Galerkin (dG) discretization method. The contribution of the thesis towards this research area is three-fold. First, the work introduces an efficient hybrid MPI/OpenMP parallelisation paradigm to fruitfully exploit large HPC facilities. Second, it reports efficient, scalable and memory saving solution strategies for stiff dG discretisations. Third, it compares those solution strategies, for the first time using the same numerical framework, to hybridizable discontinuous Galerkin (HDG) methods, including a novel implementation of a p-multigrid preconditioning approach, on unsteady flow problems involving the solution of the NavierStokes equations. The improvements in computational efficiency have been evaluated on cases of growing complexity involving large eddy simulations of turbulent flows. First, the Rayleigh-Benard convection problem and the turbulent channel flow at moderately high Reynolds numbers is presented. The solution strategies proposed resulted up to five times faster than standard matrix-based methods while al- locating the 7% of the memory. A second family of test cases involve the LES simulation of a rounded leading edge flat plate under different levels of free-stream turbulence. Although the increased stiffness of the iteration matrix due to the use of curved and stretched elements, the solver resulted more than three times faster while allocating the 15% of the memory if compared to standard methods. Finally, the large eddy simulation of the Boeing Rudimentary Landing Gear at Re = 10^6 is reported. In all the cases, a remarkable agreement with experimental data as well as previous numerical simulations is documented.INGEGNERIA INDUSTRIALEopenFranciolini, Matte

    ICON-O: The Ocean Component of the ICON Earth System Model - Global simulation characteristics and local telescoping capability

    Get PDF
    Abstract We describe the ocean general circulation model ICON-O of the Max Planck Institute for Meteorology, which forms the ocean-sea ice component of the Earth system model ICON-ESM. ICON-O relies on innovative structure-preserving finite volume numerics. We demonstrate the fundamental ability of ICON-O to simulate key features of global ocean dynamics at both uniform and non-uniform resolution. Two experiments are analyzed and compared with observations, one with a nearly uniform and eddy-rich resolution of ?10?km and another with a telescoping configuration whose resolution varies smoothly from globally ?80?km to ?10?km in a focal region in the North Atlantic. Our results show first, that ICON-O on the nearly uniform grid simulates an ocean circulation that compares well with observations and second, that ICON-O in its telescope configuration is capable of reproducing the dynamics in the focal region over decadal time scales at a fraction of the computational cost of the uniform-grid simulation. The telescopic technique offers an alternative to the established regionalization approaches. It can be used either to resolve local circulation more accurately or to represent local scales that cannot be simulated globally while remaining within a global modeling framework

    AIMES: advanced computation and I/O methods for earth-system simulations

    Get PDF
    Dealing with extreme scale Earth-system models is challenging from the computer science perspective, as the required computing power and storage capacity are steadily increasing. Scientists perform runs with growing resolution or aggregate results from many similar smaller-scale runs with slightly different initial conditions (the so-called ensemble runs). In the fifth Coupled Model Intercomparison Project (CMIP5), the produced datasets require more than three Petabytes of storage and the compute and storage requirements are increasing significantly for CMIP6. Climate scientists across the globe are developing next-generation models based on improved numerical formulation leading to grids that are discretized in alternative forms such as an icosahedral (geodesic) grid. The developers of these models face similar problems in scaling, maintaining and optimizing code. Performance portability and the maintainability of code are key concerns of scientists as, compared to industry projects, model code is continuously revised and extended to incorporate further levels of detail. This leads to a rapidly growing code base that is rarely refactored. However, code modernization is important to maintain productivity of the scientist working with the code and for utilizing performance provided by modern and future architectures. The need for performance optimization is motivated by the evolution of the parallel architecture landscape from homogeneous flat machines to heterogeneous combinations of processors with deep memory hierarchy. Notably, the rise of many-core, throughput-oriented accelerators, such as GPUs, requires non-trivial code changes at minimum and, even worse, may necessitate a substantial rewrite of the existing codebase. At the same time, the code complexity increases the difficulty for computer scientists and vendors to understand and optimize the code for a given system. Storing the products of climate predictions requires a large storage and archival system which is expensive. Often, scientists restrict the number of scientific variables and write interval to keep the costs balanced. Compression algorithms can reduce the costs significantly but can also increase the scientific yield of simulation runs. In the AIMES project, we addressed the key issues of programmability, computational efficiency and I/O limitations that are common in next-generation icosahedral earth-system models. The project focused on the separation of concerns between domain scientist, computational scientists, and computer scientists

    Tools and Selected Applications

    Get PDF

    ISCR Annual Report: Fical Year 2004

    Full text link
    corecore