Discrete adjoints on many cores: Algorithmic differentiation of accelerated fluid simulations
PhD
Simulations are used in science and industry to predict the performance of technical
systems. Adjoint derivatives of these simulations can reveal the sensitivity of the system
performance to changes in design or operating conditions, and are increasingly used in
shape optimisation and uncertainty quantification. Algorithmic differentiation (AD) by
source-transformation is an efficient method to compute such derivatives.
AD requires an analysis of the computation and its data flow to produce efficient
adjoint code. One important step is the activity analysis that detects operations that
need to be differentiated. An improved activity analysis is investigated in this thesis
that simplifies build procedures for certain adjoint programs, and is demonstrated to
improve the speed of an adjoint fluid dynamics solver. The method works by allowing a
context-dependent analysis of routines.
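The role of activity analysis can be pictured with a toy forward (tangent) mode AD example in Python (an illustrative sketch with dual numbers, not the source-transformation approach of the thesis): only "active" values carry a derivative slot, while passive constants do not need one.

```python
# Forward (tangent) mode AD with dual numbers -- a toy illustration of what
# activity analysis decides: only "active" values need a derivative slot,
# while passive constants like c below do not.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val = val    # primal value
        self.dot = dot    # tangent: derivative w.r.t. the seeded input

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)  # product rule
    __rmul__ = __mul__

def f(x, c):
    return x * x + c * x   # c is passive, x is active

x = Dual(3.0, 1.0)         # seed dx/dx = 1
y = f(x, 2.0)
print(y.val, y.dot)        # f(3) = 15, f'(3) = 2*3 + 2 = 8
```

A source-transformation tool achieves the same effect statically: it emits derivative statements only for operations the activity analysis marks as active.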
The ongoing trend towards multi- and many-core architectures such as the Intel
Xeon Phi is creating challenges for AD. Two novel approaches are presented that replicate
the parallelisation of a program in its corresponding adjoint program. The first approach
detects loops that naturally result in a parallelisable adjoint loop, while the second
approach uses loop transformation and the aforementioned context-dependent analysis
to enforce parallelisable data access in the adjoint loop. A case study shows that both
approaches yield adjoints that are as scalable as their underlying primal programs.
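The underlying difficulty can be shown with a small numpy sketch (illustrative only, not code from the thesis): a parallel primal loop that *gathers* from an array has an adjoint that *scatters* into the adjoint array, and repeated indices turn independent reads into conflicting writes.

```python
import numpy as np

# Primal loop: each iteration i READS u[idx[i]] (a gather) -- trivially parallel.
# Its adjoint must WRITE into ubar[idx[i]] (a scatter): iterations that share an
# index now race, which is why adjoint loops are not automatically parallel.
n = 5
u = np.array([1.0, 2.0, 3.0, 4.0])
idx = np.array([0, 1, 1, 3, 3])          # repeated indices -> adjoint conflicts
w = np.array([2.0, 1.0, 4.0, 0.5, 3.0])

# primal: y[i] = w[i] * u[idx[i]]
y = w * u[idx]

# adjoint of the gather: accumulate ubar[idx[i]] += w[i] * ybar[i]
ybar = np.ones(n)
ubar = np.zeros_like(u)
np.add.at(ubar, idx, w * ybar)           # unbuffered scatter-add handles repeats

print(ubar)   # [2.0, 5.0, 0.0, 3.5]
```

A loop whose indices are provably unique has a race-free adjoint (the first approach above); otherwise the data access must be transformed to make the adjoint writes safe (the second approach).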
Adjoint computations are limited by their memory footprint, particularly in unsteady
simulations, for which this work presents incomplete checkpointing as a method to
reduce memory usage at the cost of a slight reduction in accuracy.
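One way to picture the memory/accuracy trade-off (a deliberately simplified sketch, not the thesis's exact algorithm): keep only every k-th forward state and, in the reverse sweep, substitute the nearest stored state for the missing intermediate states instead of storing or recomputing them all.

```python
# Simplified illustration of trading accuracy for memory in checkpointing:
# only every k-th forward state is kept, and the reverse sweep falls back to
# the nearest earlier checkpoint instead of the exact intermediate state.
def step(u):                       # toy time step: u_{n+1} = 0.99 * u_n + 0.01
    return 0.99 * u + 0.01

n_steps, k = 100, 10
u, store = 0.0, {0: 0.0}
for n in range(1, n_steps + 1):
    u = step(u)
    if n % k == 0:
        store[n] = u               # 11 states kept instead of 101

def approx_state(n):
    return store[(n // k) * k]     # nearest earlier checkpoint

# the exact u_95 would need extra storage or recomputation; the nearest
# checkpoint is close, so the adjoint only loses a little accuracy
exact_95 = store[90]
for _ in range(5):
    exact_95 = step(exact_95)
print(len(store), abs(approx_state(95) - exact_95))
```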
Finally, convergence of iterative linear solvers is discussed, which is especially relevant
on accelerator cards, where single precision floating point numbers are frequently
used and the choice of solvers is limited by the small memory size. Some problems that
are particular to adjoint computations are discussed.
European Union
A Hybrid MPI-OpenMP Parallel Implementation for pseudospectral simulations with application to Taylor-Couette Flow
A hybrid-parallel direct-numerical-simulation method with application to
turbulent Taylor-Couette flow is presented. The Navier-Stokes equations are
discretized in cylindrical coordinates with the spectral Fourier-Galerkin
method in the axial and azimuthal directions, and high-order finite differences
in the radial direction. Time is advanced by a second-order, semi-implicit
projection scheme, which requires the solution of five Helmholtz/Poisson
equations, avoids staggered grids and renders very small slip velocities.
Nonlinear terms are computed with the pseudospectral method. The code is
parallelized using a hybrid MPI-OpenMP strategy, which is simpler to implement,
reduces inter-node communications and is more efficient compared to a flat MPI
parallelization. A strong scaling study shows that the hybrid code maintains
very good scalability up to more than 20,000 processor cores, and thus makes it
possible to perform simulations at higher resolutions than previously feasible,
opening up the possibility of simulating turbulent Taylor-Couette flows at
Reynolds numbers up to . This allows hydrodynamic
turbulence in Keplerian flows to be probed in experimentally relevant regimes.
Comment: 30 pages, 11 figures
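The pseudospectral evaluation of nonlinear terms mentioned above can be illustrated in 1D (a hedged numpy sketch, unrelated to the actual cylindrical code): derivatives are taken in Fourier space, while products are formed pointwise in physical space.

```python
import numpy as np

# 1D illustration of the pseudospectral treatment of a nonlinear term:
# differentiate in Fourier space (multiply by ik), then form the product
# u * du/dx pointwise in physical space -- the "pseudo" part of the method.
N = 64
x = 2 * np.pi * np.arange(N) / N
u = np.sin(x)

k = np.fft.fftfreq(N, d=1.0 / N) * 1j        # spectral derivative operator ik
dudx = np.fft.ifft(k * np.fft.fft(u)).real   # du/dx = cos(x)

nonlinear = u * dudx                         # u du/dx = sin(x) cos(x)
err = np.max(np.abs(nonlinear - np.sin(x) * np.cos(x)))
print(err)   # spectrally accurate, near machine precision
```

In the actual solver the same principle is applied in the two Fourier directions (axial and azimuthal), with high-order finite differences handling the radial direction.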
Adjoint computations by algorithmic differentiation of a parallel solver for time-dependent PDEs
A computational fluid dynamics code is differentiated using algorithmic
differentiation (AD) in both tangent and adjoint modes. The two novelties of
the present approach are 1) the adjoint code is obtained by letting the AD tool
Tapenade invert the complete layer of message passing interface (MPI)
communications, and 2) the adjoint code integrates time-dependent, non-linear
and dissipative (hence physically irreversible) PDEs with an explicit time
integration loop running for ca. time steps. The approach relies on
using the Adjoinable MPI library to reverse the non-blocking communication
patterns in the original code, and by controlling the memory overhead induced
by the time-stepping loop with binomial checkpointing. A description of the
necessary code modifications is provided along with the validation of the
computed derivatives and a performance comparison of the tangent and adjoint
codes.
Comment: Submitted to Journal of Computational Science
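Binomial checkpointing trades recomputation for memory: by Griewank's classic result, s stored snapshots with each time step recomputed at most t times suffice to reverse C(s+t, s) steps. A small capacity calculator (an illustration of the combinatorial result, not the Tapenade/Adjoinable MPI implementation):

```python
from math import comb

# Capacity of binomial checkpointing (Griewank's revolve): with s snapshots in
# memory and each time step recomputed at most t times, a reverse sweep can
# cover eta(s, t) = C(s + t, s) forward steps.
def reversible_steps(s, t):
    return comb(s + t, s)

# e.g. 10 snapshots with at most 5 recomputations already cover 3003 steps
for t in range(1, 6):
    print(t, reversible_steps(10, t))
```

The memory overhead therefore grows only logarithmically with the number of time steps for a fixed recomputation factor, which is what makes long explicit time-integration loops adjoinable at all.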
Global convection in Earth's mantle: advanced numerical methods and extreme-scale simulations
The thermal convection of rock in Earth's mantle and the associated plate tectonics are modeled by nonlinear incompressible Stokes and energy equations. This dissertation focuses on the development of advanced, scalable linear and nonlinear solvers for numerical simulations of realistic instantaneous mantle flow, where several computational challenges must be overcome. The most notable challenges are the severe nonlinearity, heterogeneity, and anisotropy of the mantle's rheology, as well as a wide range of spatial scales and highly localized features.

Resolving the crucial small-scale features efficiently necessitates adaptive methods, while computational results greatly benefit from high accuracy per degree of freedom and local mass conservation. Consequently, the discretization of Earth's mantle is carried out by high-order finite elements on aggressively adaptively refined hexahedral meshes with a continuous, nodal velocity approximation and a discontinuous, modal pressure approximation. These velocity-pressure pairings yield optimal asymptotic convergence rates of the finite element approximation to the infinite-dimensional solution with decreasing mesh element size, are inf-sup stable on general, non-conforming hexahedral meshes with "hanging nodes," and have the advantage of preserving mass locally at the element level due to the discontinuous pressure.

However, because of the difficulties cited above and the desired accuracy, the large implicit systems to be solved are extremely poorly conditioned, and sophisticated linear and nonlinear solvers, including powerful preconditioning techniques, are required. The nonlinear Stokes system is solved using a grid continuation, inexact Newton-Krylov method. We measure the residual of the momentum equation in the H⁻¹-norm for backtracking line search to avoid overly conservative update steps that are significantly reduced from one.
The Newton linearization is augmented by a perturbation of a highly nonlinear term in the mantle's rheology, resulting in dramatically improved nonlinear convergence. We present a new Schur complement-based Stokes preconditioner, weighted BFBT, that exhibits robust fast convergence for Stokes problems with smooth but highly varying (up to 10 orders of magnitude) viscosities, optimal algorithmic scalability with respect to mesh refinement, and only a mild dependence on the polynomial order of high-order finite element discretizations. In addition, we derive theoretical eigenvalue bounds to prove spectral equivalence of our inverse Schur complement approximation. Finally, we present a parallel hybrid spectral-geometric-algebraic multigrid (HMG) method to approximate the inverses of the Stokes system's viscous block and variable-coefficient pressure Poisson operators within weighted BFBT. Building on the parallel scalability of HMG, our Stokes solver demonstrates excellent parallel scalability to 1.6 million CPU cores without sacrificing algorithmic optimality.
Computational Science, Engineering, and Mathematics
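The inexact Newton-Krylov structure with backtracking line search can be sketched on a toy algebraic system (a hedged illustration: the dissertation's H⁻¹ residual norm, grid continuation, and Krylov linear solves are replaced here by a plain Euclidean norm and an exact 2x2 solve).

```python
import numpy as np

# Newton iteration with Armijo-style backtracking line search, illustrating
# the solver skeleton described above on a toy system: the unit circle
# intersected with the line x0 = x1.
def F(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

def J(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

x = np.array([2.0, 0.5])
for _ in range(50):
    r = F(x)
    rnorm = np.linalg.norm(r)
    if rnorm < 1e-12:
        break                                # converged
    dx = np.linalg.solve(J(x), -r)           # (here exact) Newton step
    alpha = 1.0
    # backtrack until sufficient decrease of the residual norm
    while (np.linalg.norm(F(x + alpha * dx)) > (1.0 - 1e-4 * alpha) * rnorm
           and alpha > 1e-8):
        alpha *= 0.5
    x = x + alpha * dx

print(x)   # converges to (sqrt(2)/2, sqrt(2)/2)
```

In the full solver, the linear solve is an inner Krylov iteration preconditioned by weighted BFBT, and the line-search norm is chosen (H⁻¹) so that poorly scaled residual components do not force tiny steps.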
A high-performance open-source framework for multiphysics simulation and adjoint-based shape and topology optimization
The first part of this thesis presents the advances made in the Open-Source software SU2,
towards transforming it into a high-performance framework for design and optimization of
multiphysics problems. Through this work, and in collaboration with other authors, a tenfold
performance improvement was achieved for some problems. More importantly, problems that
had previously been impossible to solve in SU2 can now be used in numerical optimization
with shape or topology variables. Furthermore, it is now significantly simpler to study new
multiphysics applications, and to develop new numerical schemes taking advantage of modern
high-performance-computing systems.
In the second part of this thesis, these capabilities allowed the application of topology
optimization to medium-scale fluid-structure interaction problems, using high-fidelity models
(nonlinear elasticity and Reynolds-averaged Navier-Stokes equations), which had not been done
before in the literature. This showed that topology optimization can be used to target
aerodynamic objectives, by tailoring the interaction between fluid and structure. However,
it also made evident the limitations of density-based methods for this type of problem, in
particular, reliably converging to discrete solutions. This was overcome with new strategies
to both guarantee and accelerate (i.e. reduce the overall computational cost of) the
convergence to discrete solutions in fluid-structure interaction problems.
Open Access
Computing and Information Science (CIS)
Cornell University Courses of Study Vol. 97 2005/2006
Architecture of Advanced Numerical Analysis Systems
This unique open access book applies the functional OCaml programming language to numerical and computational applications in data science, engineering, and scientific computing. This book is based on the authors' first-hand experience building and maintaining Owl, an OCaml-based numerical computing library. You'll first learn the various components in a modern numerical computation library. Then, you will learn how these components are designed and built up and how to optimize their performance. After reading and using this book, you'll have the knowledge required to design and build real-world complex systems that effectively leverage the advantages of the OCaml functional programming language.
What You Will Learn:
- Optimize core operations based on N-dimensional arrays
- Design and implement an industry-level algorithmic differentiation module
- Implement mathematical optimization, regression, and deep neural network functionalities based on algorithmic differentiation
- Design and optimize a computation graph module, and understand the benefits it brings to the numerical computing library
- Accommodate the growing number of hardware accelerators (e.g. GPU, TPU) and execution backends (e.g. web browser, unikernel) of numerical computation
- Use the Zoo system for efficient scripting, code sharing, service deployment, and composition
- Design and implement a distributed computing engine to work with a numerical computing library, providing convenient APIs and high performance
Who This Book Is For: Those with prior programming experience, especially with the OCaml programming language, or with scientific computing experience who may be new to OCaml. Most importantly, it is for those who are eager to understand not only how to use something, but also how it is built up.
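The core of an algorithmic differentiation module like the one the book describes is a small tape of operations replayed in reverse. A minimal reverse-mode sketch (Owl's module is written in OCaml and far more complete; this standalone Python toy only mirrors the core idea):

```python
# A minimal reverse-mode AD "tape": each Var remembers its parents and the
# local partial derivatives; backward() replays the graph in reverse
# topological order, accumulating gradients.
class Var:
    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents     # pairs (parent_var, local_partial)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.val + other.val, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.val * other.val, ((self, other.val), (other, self.val)))

def backward(out):
    # reverse topological order ensures each node's gradient is complete
    # before it is pushed to its parents
    order, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for p, _ in v.parents:
                visit(p)
            order.append(v)
    visit(out)
    out.grad = 1.0
    for v in reversed(order):
        for parent, local in v.parents:
            parent.grad += local * v.grad

x, y = Var(3.0), Var(4.0)
z = x * y + x * x       # dz/dx = y + 2x = 10, dz/dy = x = 3
backward(z)
print(x.grad, y.grad)   # 10.0 3.0
```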
Resiliency in numerical algorithm design for extreme scale simulations
This work is based on the seminar titled ‘Resiliency in Numerical Algorithm Design for Extreme Scale Simulations’ held March 1–6, 2020, at Schloss Dagstuhl, which was attended by all the authors. Advanced supercomputing is characterized by very high computation speeds at the cost of involving an enormous amount of resources and costs. A typical large-scale computation running for 48 h on a system consuming 20 MW, as predicted for exascale systems, would consume a million kWh, corresponding to about 100k Euro in energy cost for executing 10²³ floating-point operations. It is clearly unacceptable to lose the whole computation if any of the several million parallel processes fails during the execution. Moreover, if a single operation suffers from a bit-flip error, should the whole computation be declared invalid? What about the notion of reproducibility itself: should this core paradigm of science be revised and refined for results that are obtained by large-scale simulation? Naive versions of conventional resilience techniques will not scale to the exascale regime: with a main memory footprint of tens of Petabytes, synchronously writing checkpoint data all the way to background storage at frequent intervals will create intolerable overheads in runtime and energy consumption. Forecasts show that the mean time between failures could be lower than the time to recover from such a checkpoint, so that large calculations at scale might not make any progress if robust alternatives are not investigated. More advanced resilience techniques must be devised. The key may lie in exploiting both advanced system features as well as specific application knowledge. Research will face two essential questions: (1) what are the reliability requirements for a particular computation and (2) how do we best design the algorithms and software to meet these requirements?
While the analysis of use cases can help understand the particular reliability requirements, the construction of remedies is currently wide open. One avenue would be to refine and improve on system- or application-level checkpointing and rollback strategies in case an error is detected. Developers might use fault notification interfaces and flexible runtime systems to respond to node failures in an application-dependent fashion. Novel numerical algorithms or more stochastic computational approaches may be required to meet accuracy requirements in the face of undetectable soft errors. These ideas constituted an essential topic of the seminar. The goal of this Dagstuhl Seminar was to bring together a diverse group of scientists with expertise in exascale computing to discuss novel ways to make applications resilient against detected and undetected faults. In particular, participants explored the role that algorithms and applications play in the holistic approach needed to tackle this challenge. This article gathers a broad range of perspectives on the role of algorithms, applications and systems in achieving resilience for extreme scale simulations. The ultimate goal is to spark novel ideas and encourage the development of concrete solutions for achieving such resilience holistically.
Peer Reviewed
Article signed by 36 authors: Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M. Ciorba, Nathan DeBardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N. Gansterer, Luc Giraud, Dominik Göddeke, Marco Heisig, Fabienne Jezequel, Nils Kohl, Xiaoye Sherry Li, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S. Quintana-Ortí, Francesco Rizzi, Ulrich Rüde, Martin Schulz, Fred Fung, Robert Speck, Linda Stals, Keita Teranishi, Samuel Thibault, Dominik Thönnes, Andreas Wagner and Barbara Wohlmuth
Postprint (author's final draft)
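The checkpoint-overhead argument above connects to the classic first-order result of Young (later refined by Daly): the checkpoint interval minimizing expected lost time is approximately sqrt(2·C·MTBF), where C is the cost of writing one checkpoint. A small sketch with illustrative numbers (not figures from the article):

```python
from math import sqrt

# Young's first-order optimal checkpoint interval:
#   tau_opt ~ sqrt(2 * C * MTBF)
# where C is the time to write one checkpoint and MTBF is the system-wide
# mean time between failures. All numbers below are illustrative.
def young_interval(checkpoint_cost_s, mtbf_s):
    return sqrt(2.0 * checkpoint_cost_s * mtbf_s)

C = 600.0              # 10 min per checkpoint write (optimistic at tens of PB)
mtbf = 4.0 * 3600.0    # system-wide MTBF of 4 hours
tau = young_interval(C, mtbf)
print(tau / 60.0)      # roughly 69 minutes between checkpoints
```

As the MTBF shrinks toward the recovery time itself, the optimal interval collapses and the machine spends most of its time checkpointing, which is exactly the failure mode the abstract warns about.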