
    A PETSc parallel-in-time solver based on MGRIT algorithm

    We address the development of a modular implementation of the MGRIT (MultiGrid-In-Time) algorithm for solving the linear and nonlinear systems that arise from the discretization of evolutionary models with a parallel-in-time approach, in the context of the PETSc (Portable, Extensible Toolkit for Scientific computing) library. Our aim is to make it possible to predict the performance gain achievable when using the MGRIT approach instead of the Time Stepping (TS) integrator. To this end, we analyze the performance parameters of the algorithm that provide, a priori, the best number of processing elements and grid levels to use when addressing the scaling of MGRIT, regarded as a parallel iterative algorithm proceeding along the time dimension.
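    The two-level cycle underlying MGRIT (FCF-relaxation plus a sequential coarse-grid correction) can be sketched for the scalar test ODE u' = lam*u with backward Euler. All names and parameters below (phi, m, the fixed iteration count) are illustrative choices for this sketch and do not correspond to the PETSc or XBraid API.

    ```python
    import numpy as np

    # Minimal two-level MGRIT sketch for u' = lam*u, u(0) = 1, discretized with
    # backward Euler u_{n+1} = phi * u_n.

    lam, T, nt = -1.0, 4.0, 64           # problem and fine time grid
    dt = T / nt
    m = 4                                # temporal coarsening factor
    phi = 1.0 / (1.0 - lam * dt)         # fine-grid propagator
    phi_c = 1.0 / (1.0 - lam * m * dt)   # rediscretized coarse propagator

    u = np.zeros(nt + 1)
    u[0] = 1.0                           # initial space-time guess
    cpts = np.arange(0, nt + 1, m)       # C-points (coarse time points)

    def f_relax(u):
        """Propagate from each C-point across the following F-points."""
        for c in range(0, nt, m):
            for i in range(c + 1, min(c + m, nt + 1)):
                u[i] = phi * u[i - 1]

    def c_relax(u):
        """Update each C-point from its immediate predecessor."""
        for c in range(m, nt + 1, m):
            u[c] = phi * u[c - 1]

    for it in range(10):                 # FCF-relaxation + coarse correction
        f_relax(u); c_relax(u); f_relax(u)
        r = np.zeros(len(cpts))          # fine residual restricted to C-points
        r[1:] = phi * u[cpts[1:] - 1] - u[cpts[1:]]
        e = np.zeros(len(cpts))          # coarse error equation, solved serially
        for k in range(1, len(cpts)):
            e[k] = phi_c * e[k - 1] + r[k]
        u[cpts] += e                     # correct C-points, then refresh F-points
        f_relax(u)
    ```

    At the fixed point the residual vanishes everywhere, so the iterate reproduces the sequential backward-Euler solution; the parallelism comes from the fact that each F-interval in `f_relax` is independent of the others.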

    Lecture 12: Recent Advances in Time Integration Methods and How They Can Enable Exascale Simulations

    To prepare for exascale systems, scientific simulations are growing in physical realism and thus complexity. This increase often results in additional and changing time scales. Time integration methods are critical to efficient solution of these multiphysics systems. Yet, many large-scale applications have not fully embraced modern time integration methods nor efficient software implementations. Hence, achieving temporal accuracy with new and complex simulations has proved challenging. We will overview recent advances in time integration methods, including additive IMEX methods, multirate methods, and parallel-in-time approaches, expected to help realize the potential of exascale systems on multiphysics simulations. Efficient execution of these methods relies, in turn, on efficient algebraic solvers, and we will discuss the relationships between integrators and solvers. In addition, an effective time integration approach is not complete without efficient software, and we will discuss effective software design approaches for time integrators and their uses in application codes. Lastly, examples demonstrating some of these new methods and their implementations will be presented. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-819501
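    One of the method families mentioned, additive IMEX schemes, can be illustrated with a first-order IMEX Euler step on a toy stiff problem. The split problem, rates, and step count below are invented for illustration; this is not an excerpt from any production integrator.

    ```python
    import numpy as np

    # First-order IMEX (implicit-explicit) Euler for u' = lam*u + g(t):
    # the stiff linear term is treated implicitly, the non-stiff forcing g
    # explicitly, so the step remains stable at a dt far beyond the explicit limit.

    lam = -1000.0                 # stiff decay rate
    g = np.cos                    # smooth, non-stiff forcing
    dt, nt = 1e-2, 100            # dt is ~5x the explicit stability limit 2/|lam|

    u, t = 0.0, 0.0
    for _ in range(nt):
        # solve (1 - dt*lam) u_new = u + dt*g(t): implicit in lam*u, explicit in g
        u = (u + dt * g(t)) / (1.0 - dt * lam)
        t += dt

    # the exact solution relaxes to the quasi-steady state
    # (-lam*cos(t) + sin(t)) / (lam**2 + 1), which u tracks closely
    ```

    A fully explicit Euler step would require dt below 2/|lam| = 0.002 to remain stable here, while a fully implicit treatment of a nonlinear g would require a nonlinear solve per step; the additive split avoids both costs.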

    A parallel implementation of a diagonalization-based parallel-in-time integrator

    We present and analyze a parallel implementation of a parallel-in-time method based on α-circulant preconditioned Richardson iterations. While many papers explore this new class of single-level, time-parallel integrators from various perspectives, performance results of actual parallel runs are still missing. This leaves a critical gap, because the efficiency and applicability of these methods rely heavily on actual parallel performance, with only limited guidance available from theoretical considerations. Moreover, challenges such as selecting good parameters, finding suitable communication strategies, and performing a fair comparison with sequential time-stepping methods are easily missed. In this paper, we first extend the original idea by using a collocation method of arbitrary order, which adds another level of parallelization in time. We derive an adaptive strategy that selects a new α-circulant preconditioner for each iteration at runtime, balancing convergence rates, round-off errors, and inexactness in the individual time-steps. After addressing these more theoretical challenges, we present an open-source space- and doubly-time-parallel implementation and evaluate its performance for two different test problems.
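    The core iteration can be sketched for the scalar Dahlquist problem with backward Euler. Here a fixed α stands in for the paper's adaptive per-iteration choice, and all names and sizes are illustrative; the point is that the preconditioner solve costs only one FFT/IFFT pair, which is what makes the time direction parallelizable.

    ```python
    import numpy as np

    # alpha-circulant preconditioned Richardson for the all-at-once
    # backward-Euler system of u' = lam*u, unknowns u_1..u_n.

    lam, T, n = -1.0, 1.0, 32
    dt = T / n
    a = 1.0 - lam * dt                   # diagonal entry of all-at-once matrix A
    u0, alpha = 1.0, 1e-4

    b = np.zeros(n)
    b[0] = u0                            # RHS carries the initial condition

    def apply_A(u):
        """(A u)_k = a*u_k - u_{k-1}: lower-bidiagonal all-at-once operator."""
        out = a * u.copy()
        out[1:] -= u[:-1]
        return out

    # C_alpha replaces A's zero top-right corner with -alpha, making it
    # alpha-circulant; with D = diag(alpha^(k/n)) and F the DFT, it factors as
    # C_alpha = D^{-1} F^{-1} diag(eigs) F D.
    k = np.arange(n)
    D = alpha ** (k / n)
    eigs = a - alpha ** (1.0 / n) * np.exp(-2j * np.pi * k / n)

    def solve_C(r):
        """Solve C_alpha x = r with one FFT/IFFT pair."""
        y = np.fft.fft(D * r) / eigs
        return np.real(np.fft.ifft(y) / D)

    u = np.zeros(n)
    for _ in range(8):                   # preconditioned Richardson iteration
        u = u + solve_C(b - apply_A(u))

    u_seq = u0 / a ** (1 + k)            # reference: sequential time stepping
    ```

    Since C differs from A only by the single corner entry of size α, the error propagator is a rank-one matrix with tiny norm for small α, which is why so few iterations suffice in this sketch; in practice α also interacts with round-off, motivating the paper's adaptive selection.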

    Multilevel Algebraic Approach for Performance Analysis of Parallel Algorithms

    In order to solve a problem in parallel we need to undertake the fundamental step of splitting the computational tasks into parts, i.e., decomposing the problem to be solved. An arbitrary decomposition does not necessarily lead to a parallel algorithm with the highest performance. This issue becomes even more important when complex parallel algorithms must be developed for hybrid or heterogeneous architectures. We present an innovative approach which starts from a decomposition of the problem into parts (sub-problems). These parts are regarded as elements of an algebraic structure and are related to each other according to a suitably defined dependency relationship. The main outcome of such a framework is the definition of a set of block matrices (dependency, decomposition, memory accesses, and execution) which highlight fundamental characteristics of the corresponding algorithm, such as inherent parallelism and sources of overhead. We provide a mathematical formulation of this approach, and we perform a feasibility analysis of the performance of a parallel algorithm in terms of its time complexity and scalability. We compare our results with standard expressions of speedup, efficiency, and overhead. Finally, we show how the multilevel structure of this framework eases the choice of the abstraction level (both for the problem decomposition and for the algorithm description) used to determine the granularity of the tasks within the performance analysis. This feature helps to better understand the mapping of parallel algorithms onto novel hybrid and heterogeneous architectures.
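    A toy version of the idea, reading inherent parallelism and an ideal speedup bound off a dependency matrix, can be sketched as follows. The matrix D, the unit-cost model, and the level computation are invented examples in the spirit of the abstract, not the paper's formal definitions.

    ```python
    import numpy as np

    # D[i, j] = 1 means part j must finish before part i can start.
    D = np.array([
        [0, 0, 0, 0],
        [1, 0, 0, 0],    # part 1 depends on part 0
        [1, 0, 0, 0],    # part 2 depends on part 0 (parallel with part 1)
        [0, 1, 1, 0],    # part 3 joins parts 1 and 2
    ])

    n = D.shape[0]
    level = np.zeros(n, dtype=int)       # longest dependency chain per part
    for i in range(n):                   # parts are listed in topological order
        deps = np.nonzero(D[i])[0]
        if deps.size:
            level[i] = level[deps].max() + 1

    t_serial = n                         # unit-cost serial time T_1
    t_critical = level.max() + 1         # critical-path time T_inf
    speedup_bound = t_serial / t_critical
    ```

    Parts sharing a level have no mutual dependencies and may execute concurrently; the critical-path length then bounds the achievable speedup regardless of how many processing elements are available.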

    Parallel-in-Time Integration of the Landau-Lifshitz-Gilbert Equation with the Parallel Full Approximation Scheme in Space and Time

    Speeding up computationally expensive problems, such as numerical simulations of large micromagnetic systems, requires efficient use of parallel computing infrastructures. While parallelism across space is commonly exploited in micromagnetics, this strategy performs poorly once a minimum number of degrees of freedom per core is reached. We use magnum.pi, a finite-element micromagnetic simulation software, to investigate the Parallel Full Approximation Scheme in Space and Time (PFASST) as a space- and time-parallel solver for the Landau-Lifshitz-Gilbert equation (LLG). Numerical experiments show that PFASST enables efficient parallel-in-time integration of the LLG, significantly improving the speedup gained from a given number of cores and allowing the code to scale beyond spatial limits.
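    PFASST refines the Parareal idea of coupling a cheap coarse propagator with an accurate fine one (adding SDC sweeps and FAS corrections across levels). A minimal plain-Parareal iteration for a scalar ODE, shown below, conveys that coarse/fine coupling; it is emphatically not the full PFASST algorithm, and all parameters are invented for illustration.

    ```python
    import numpy as np

    # Minimal Parareal sketch for u' = lam*u with backward Euler on both levels.

    lam, T, n = -1.0, 2.0, 10            # n coarse time slices
    dT = T / n
    mf = 20                              # fine substeps per slice

    def G(u):                            # coarse: one backward-Euler step
        return u / (1.0 - lam * dT)

    def F(u):                            # fine: mf backward-Euler substeps
        return u / (1.0 - lam * dT / mf) ** mf

    u = np.zeros(n + 1)
    u[0] = 1.0
    for i in range(n):                   # initial guess from a coarse sweep
        u[i + 1] = G(u[i])

    for k in range(n):                   # Parareal correction iterations
        Fu = F(u[:-1])                   # fine propagation of every slice
        Gu_old = G(u[:-1])               # (the F calls are the parallel part)
        for i in range(n):               # cheap sequential coarse sweep
            u[i + 1] = G(u[i]) + Fu[i] - Gu_old[i]
    ```

    Parareal reproduces the sequential fine solution exactly after at most n iterations; speedup requires convergence in far fewer, which is where the multilevel structure of PFASST improves on this two-level picture.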

    Proceedings of the YIC 2021 - VI ECCOMAS Young Investigators Conference

    The 6th ECCOMAS Young Investigators Conference (YIC 2021) will take place from July 7th through 9th, 2021 at Universitat Politècnica de València, Spain. The main objective is to bring together, in a relaxed environment, young students, researchers and professors from all areas related to computational science and engineering, as in the previous YIC conference series organized under the auspices of the European Community on Computational Methods in Applied Sciences (ECCOMAS). Participation of senior scientists sharing their knowledge and experience is thus critical for this event. YIC 2021 is organized at Universitat Politècnica de València by the Sociedad Española de Métodos Numéricos en Ingeniería (SEMNI) and the Sociedad Española de Matemática Aplicada (SEMA), and it is promoted by ECCOMAS. The main goal of the YIC 2021 conference is to provide a forum for presenting and discussing current state-of-the-art achievements in Computational Methods and Applied Sciences, including theoretical models, numerical methods, algorithmic strategies, and challenging engineering applications. Nadal Soriano, E.; Rodrigo Cardiel, C.; Martínez Casas, J. (2022). Proceedings of the YIC 2021 - VI ECCOMAS Young Investigators Conference. Editorial Universitat Politècnica de València. https://doi.org/10.4995/YIC2021.2021.15320

    Parallel-in-space-time, adaptive finite element framework for non-linear parabolic equations

    We present an adaptive methodology for the solution of (linear and) non-linear time-dependent problems that is especially tailored for massively parallel computations. The basic concept is to solve for large blocks of space-time unknowns instead of marching sequentially in time. The methodology combines a computationally efficient implementation of a parallel-in-space-time finite element solver with a posteriori space-time error estimates and a parallel mesh generator. While we focus on spatial adaptivity in this work, the methodology enables simultaneous adaptivity in both the space and time domains. We explore this basic concept in the context of a variety of time-steppers, including Θ-schemes and Backward Difference Formulas. We specifically illustrate this framework with applications involving time-dependent linear, quasi-linear and semi-linear diffusion equations. We focus on investigating how the coupled space-time refinement indicators for this class of problems affect spatial adaptivity. Finally, we show good scaling behavior up to 150,000 processors on the NCSA Blue Waters machine. This conceptually simple methodology enables scaling on next-generation multi-core machines by simultaneously solving for a large number of time-steps, and reduces computational overhead by locally refining spatial blocks that can track localized features. This methodology also opens up the possibility of efficiently incorporating adjoint equations for error estimators and inverse problems.
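    One of the time-steppers named above, the Θ-scheme, amounts to a single linear solve per step. The dense finite-difference sketch below (grid sizes and tolerances invented, independent of the paper's adaptive FEM framework) illustrates it on the 1D heat equation.

    ```python
    import numpy as np

    # Theta-scheme for u_t = u_xx on (0, 1) with homogeneous Dirichlet BCs:
    # (I - theta*dt*A) u_new = (I + (1 - theta)*dt*A) u_old,
    # theta = 0 explicit Euler, 1/2 Crank-Nicolson, 1 backward Euler.

    nx, nt = 64, 100
    dx, dt = 1.0 / (nx + 1), 1e-4
    x = np.linspace(dx, 1.0 - dx, nx)

    # standard second-order finite-difference Laplacian
    A = (np.diag(-2.0 * np.ones(nx)) + np.diag(np.ones(nx - 1), 1)
         + np.diag(np.ones(nx - 1), -1)) / dx**2

    def theta_step(u, theta):
        """Advance one step of the theta-scheme by a dense linear solve."""
        lhs = np.eye(nx) - theta * dt * A
        rhs = (np.eye(nx) + (1.0 - theta) * dt * A) @ u
        return np.linalg.solve(lhs, rhs)

    u = np.sin(np.pi * x)               # smooth initial condition
    for _ in range(nt):
        u = theta_step(u, theta=0.5)    # theta = 0.5: Crank-Nicolson

    # for this mode the heat equation has the closed-form solution
    exact = np.exp(-np.pi**2 * nt * dt) * np.sin(np.pi * x)
    ```

    In the block methodology above, many such steps are assembled into one space-time system rather than executed one after another as done here.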

    Space-time block preconditioning for incompressible flow

    Parallel-in-time methods have become increasingly popular in the simulation of time-dependent numerical PDEs, allowing for the efficient use of additional MPI processes when spatial parallelism saturates. Most methods treat the solution and parallelism in space and time separately. In contrast, all-at-once methods solve the full space-time system directly, largely treating time as simply another spatial dimension. All-at-once methods offer a number of benefits over separate treatment of space and time, most notably significantly increased parallelism and faster time-to-solution (when applicable). However, the development of fast, scalable all-at-once methods has largely been limited to time-dependent (advection-)diffusion problems. This paper introduces the concept of space-time block preconditioning for the all-at-once solution of incompressible flow. By extending well-known concepts of spatial block preconditioning to the space-time setting, we develop a block preconditioner whose application requires the solution of a space-time (advection-)diffusion equation in the velocity block, coupled with a pressure Schur complement approximation consisting of independent spatial solves at each time-step, and a space-time matrix-vector multiplication. The new method is tested on four classical models in incompressible flow. Results indicate perfect scalability under refinement of the spatial and temporal mesh spacing, perfect scalability in nonlinear Picard iteration counts when applied to a nonlinear Navier-Stokes problem, and minimal overhead in the number of preconditioner applications compared with sequential time-stepping.
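    The "all-at-once" system such preconditioners target is easy to exhibit for a toy heat equation: block-diagonal single-step matrices coupled by an identity block subdiagonal. In this sketch a direct dense solve stands in for the paper's preconditioned Krylov method, and all sizes are illustrative.

    ```python
    import numpy as np

    # All-at-once space-time system for backward Euler applied to u' = A u,
    # with A a tiny 1D heat Laplacian.

    nx, nt = 16, 8
    dx, dt = 1.0 / (nx + 1), 0.01
    A = (np.diag(-2.0 * np.ones(nx)) + np.diag(np.ones(nx - 1), 1)
         + np.diag(np.ones(nx - 1), -1)) / dx**2
    L = np.eye(nx) - dt * A              # single backward-Euler step matrix

    # space-time operator: block diagonal of L, block subdiagonal of -I
    S = np.diag(np.ones(nt - 1), -1)     # temporal shift matrix
    K = np.kron(np.eye(nt), L) - np.kron(S, np.eye(nx))

    u0 = np.sin(np.pi * np.linspace(dx, 1 - dx, nx))
    b = np.zeros(nt * nx)
    b[:nx] = u0                          # RHS carries the initial condition

    u_allatonce = np.linalg.solve(K, b)  # solve every time step at once

    # reference: sequential time stepping
    u, u_seq = u0.copy(), []
    for _ in range(nt):
        u = np.linalg.solve(L, u)
        u_seq.append(u)
    u_seq = np.concatenate(u_seq)
    ```

    Both routes solve the identical linear system; the payoff of the all-at-once form is that an iterative solver for K can exploit parallelism across all nt time blocks simultaneously, provided an effective space-time preconditioner such as the one introduced in the paper.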