    Local time stepping on high performance computing architectures: mitigating CFL bottlenecks for large-scale wave propagation

    Modeling problems that require the simulation of hyperbolic PDEs (wave equations) on large heterogeneous domains face many potential bottlenecks. We attack this problem with two techniques: the massively parallel capabilities of graphics processing units (GPUs), and local time stepping (LTS) to mitigate CFL bottlenecks on multiscale meshes. Many modern supercomputing centers are installing GPUs because of their high performance, and extending existing seismic wave-propagation software to use GPUs is vital to give application scientists the highest possible performance. In addition to this architectural optimization, LTS schemes avoid the performance losses incurred on meshes with localized areas of refinement. Coupled with the GPU performance optimizations, the derivation and implementation of a Newmark LTS scheme enables next-generation performance for real-world applications. Included in this implementation is work addressing the load-balancing problem inherent to multi-level LTS schemes, enabling scalability to hundreds to thousands of CPUs and GPUs. Together, the GPU, LTS, and scaling optimizations accelerate existing applications by a factor of 30 or more, and enable modeling scenarios previously made infeasible by the cost of standard explicit time-stepping schemes.
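
    To make the CFL bottleneck concrete, here is a minimal sketch, assuming a simple element-wise CFL rule and a power-of-two rate hierarchy (neither is a detail taken from the abstract), of how the elements of a multiscale mesh might be grouped into LTS levels. The function name, the toy mesh, and the parameter values are illustrative.

```python
import numpy as np

def assign_lts_levels(h, c, cfl=0.5):
    """Assign each element to a local time-stepping level so that its local
    step dt_coarse / 2**level respects an element-wise CFL bound.

    h : element sizes,  c : local wave speeds (both illustrative inputs).
    """
    dt_elem = cfl * h / c                          # per-element stable step (assumed CFL rule)
    dt_coarse = dt_elem.max()                      # step taken by the least restrictive elements
    levels = np.ceil(np.log2(dt_coarse / dt_elem)).astype(int)
    return levels, dt_coarse

# Toy multiscale mesh: a large coarse region plus a small refined patch whose
# tiny elements would otherwise force a ~20x smaller global time-step.
h = np.concatenate([np.full(10_000, 1.0), np.full(50, 0.05)])
c = np.ones_like(h)
levels, dt_coarse = assign_lts_levels(h, c)
print("coarse dt:", dt_coarse)
print("elements per level:", np.bincount(levels))
```

    The per-level element counts are exactly the kind of uneven workload that the load-balancing work mentioned in the abstract must distribute across CPUs and GPUs.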

    Group implicit concurrent algorithms in nonlinear structural dynamics

    During the 1970s and 1980s, considerable effort was devoted to developing efficient and reliable time-stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problem are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of application are those which accurately integrate the low-frequency content of the response without requiring resolution of the high-frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in algorithm development in recent years has been the advent of parallel computers with multiprocessing capabilities, so this work is mainly concerned with the development of parallel algorithms in the area of structural dynamics. A primary objective is to devise unconditionally stable and accurate time-stepping procedures which lend themselves to efficient implementation on concurrent machines. Some features of the new computer architectures are summarized, and a brief survey of current efforts in the area is presented. A new class of concurrent procedures, Group Implicit (GI) algorithms, is introduced and analyzed. Numerical simulations show that GI algorithms hold considerable promise for application on coarse-grain as well as medium-grain parallel computers.
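
    As a rough illustration of the group-implicit idea, here is a minimal sketch, assuming a partitioning into independent groups that each take an unconditionally stable implicit Newmark step, with duplicated interface degrees of freedom then reconciled by a mass-weighted average. The tiny two-group bar, the matrices, and the averaging rule are invented for illustration and are not the formulation analyzed in the thesis.

```python
import numpy as np

def newmark_step(M, K, u, v, a, dt, beta=0.25, gamma=0.5):
    """One implicit (average-acceleration) Newmark step for M a + K u = 0."""
    u_pred = u + dt * v + dt**2 * (0.5 - beta) * a
    v_pred = v + dt * (1.0 - gamma) * a
    a_new = np.linalg.solve(M + beta * dt**2 * K, -K @ u_pred)
    return u_pred + beta * dt**2 * a_new, v_pred + gamma * dt * a_new, a_new

def group_implicit_step(groups, dt):
    """Each group advances independently (one group per processor in practice);
    duplicated interface DOFs are then reconciled by a mass-weighted average."""
    for g in groups:                                   # embarrassingly parallel loop
        g["u"], g["v"], g["a"] = newmark_step(g["M"], g["K"], g["u"], g["v"], g["a"], dt)
    shared = {}                                        # global DOF id -> copies held by each group
    for g in groups:
        for loc, glob in g["interface"]:
            shared.setdefault(glob, []).append((g, loc, g["M"][loc, loc]))
    for copies in shared.values():
        total_mass = sum(m for _, _, m in copies)
        for field in ("u", "v", "a"):
            avg = sum(m * g[field][loc] for g, loc, m in copies) / total_mass
            for g, loc, _ in copies:
                g[field][loc] = avg

# Two-element bar split at the middle node, which is duplicated in both groups.
Ke = np.array([[1.0, -1.0], [-1.0, 1.0]])
def make_group(u0, interface):
    return {"M": 0.5 * np.eye(2), "K": Ke.copy(), "u": np.array(u0),
            "v": np.zeros(2), "a": np.zeros(2), "interface": interface}
A = make_group([0.0, 0.1], interface=[(1, 1)])   # local DOF 1 is global node 1
B = make_group([0.1, 0.0], interface=[(0, 1)])   # local DOF 0 is global node 1
for _ in range(10):
    group_implicit_step([A, B], dt=0.5)
print(A["u"], B["u"])
```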

    Dynamic earthquake rupture modelled with an unstructured 3-D spectral element method applied to the 2011 M9 Tohoku earthquake

    An important goal of computational seismology is to simulate dynamic earthquake rupture and strong ground motion in realistic models that include crustal heterogeneities and complex fault geometries. To accomplish this, we incorporate dynamic rupture modelling capabilities in a spectral element solver on unstructured meshes, the 3-D open source code SPECFEM3D, and employ state-of-the-art software for the generation of unstructured meshes of hexahedral elements. These tools provide high flexibility in representing fault systems with complex geometries, including faults with branches and non-planar faults. The domain size is extended with progressive mesh coarsening to maintain an accurate resolution of the static field. Our implementation of dynamic rupture does not affect the parallel scalability of the code. We verify our implementation by comparing our results to those of two finite element codes on benchmark problems including branched faults. Finally, we present a preliminary dynamic rupture model of the 2011 Mw 9.0 Tohoku earthquake including a non-planar plate interface with heterogeneous frictional properties and initial stresses. Our simulation reproduces qualitatively the depth-dependent frequency content of the source and the large slip close to the trench observed for this earthquake.
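
    Dynamic rupture models of this kind close the problem with a fault friction law; a linear slip-weakening law is a common choice in the community benchmarks referred to above. The sketch below is a generic illustration of such a law with placeholder parameter values, not the specific heterogeneous parameterization used for the Tohoku model.

```python
import numpy as np

def slip_weakening_strength(slip, sigma_n, mu_s=0.6, mu_d=0.3, d_c=0.4):
    """Fault shear strength under linear slip-weakening friction.

    The friction coefficient drops linearly from the static value mu_s to the
    dynamic value mu_d over a critical slip distance d_c; all values here are
    placeholders, not parameters of the Tohoku model.
    slip    : accumulated fault slip [m]
    sigma_n : effective normal stress, compression positive [Pa]
    """
    mu = np.where(slip < d_c, mu_s - (mu_s - mu_d) * slip / d_c, mu_d)
    return mu * sigma_n

# Strength drop on a single fault patch as slip accumulates.
slip = np.linspace(0.0, 1.0, 6)
print(slip_weakening_strength(slip, sigma_n=50e6))
```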

    Large-scale structural analysis: The structural analyst, the CSM Testbed and the NAS System

    The Computational Structural Mechanics (CSM) activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM Testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM Testbed methods development environment is presented, and some numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.

    Improving programmability and performance for scientific applications

    As hardware and software technology scale towards new limits, modern compute machines can tackle increasingly challenging problems. While the size and complexity of both the problems and their solutions increase, the programming methodologies must remain at a level that can be understood by programmers and scientists alike. In our work, this problem arises when developing an optimized framework to best exploit the semantic properties of a finite-element solver. To address it, we explore programming and runtime models which decouple algorithmic complexity, parallelism concerns, and hardware mapping. We build upon these frameworks to exploit domain-specific semantics through high-level transformations and modifications, obtaining performance through algorithmic and runtime optimizations. We first discuss optimizations performed on a computational mechanics solver using a novel coupling technique for multi-time-scale methods on discrete finite element domains. We exploit domain semantics using a high-level dynamic runtime scheme that reorders and balances workloads to greatly improve runtime performance. The framework presented automatically chooses a near-optimal coupling solution and runs a work-stealing parallel executor to run effectively on multi-core systems. In the latter part of this work, we focus on the Concurrent Collections (CnC) parallel programming model to bridge the gap between performance and programmability. Because challenging problems in various domains, not limited to computational mechanics, require both domain expertise and programming prowess, there is a need to separate these concerns. This thesis describes methods and techniques for obtaining scalable performance with CnC programming while limiting the programming burden. These high-level techniques are presented for two high-performance applications, a hydrodynamics code and a multigrid solver.
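
    For readers unfamiliar with work stealing, the following is a toy sketch of the balancing idea the abstract alludes to: each worker pops tasks from its own queue and, when that runs dry, steals from another worker's queue. It is not the thesis' runtime and not the CnC API; a real scheduler would use per-deque or lock-free structures rather than the single global lock used here for brevity.

```python
import random
import threading
from collections import deque

class WorkStealingExecutor:
    """Minimal work-stealing sketch: each worker owns a deque, pops from its
    own tail, and steals from another worker's head when it runs out of work."""

    def __init__(self, n_workers=4):
        self.queues = [deque() for _ in range(n_workers)]
        self.lock = threading.Lock()     # one global lock keeps the sketch simple

    def submit(self, task, worker=0):
        self.queues[worker].append(task)

    def _worker(self, wid):
        while True:
            with self.lock:
                if self.queues[wid]:
                    task = self.queues[wid].pop()            # own tail (LIFO)
                else:
                    victims = [q for q in self.queues if q]
                    if not victims:
                        return                               # no work left anywhere
                    task = random.choice(victims).popleft()  # steal from a victim's head
            task()

    def run(self):
        threads = [threading.Thread(target=self._worker, args=(w,))
                   for w in range(len(self.queues))]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

# All the work starts on worker 0; the other workers steal to balance the load.
ex = WorkStealingExecutor()
for i in range(16):
    ex.submit(lambda i=i: print("task", i), worker=0)
ex.run()
```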

    Convergence analysis of energy conserving explicit local time-stepping methods for the wave equation

    Local adaptivity and mesh refinement are key to the efficient simulation of wave phenomena in heterogeneous media or complex geometry. Locally refined meshes, however, dictate a small time-step everywhere, with a crippling effect on any explicit time-marching method. In [18] a leap-frog (LF) based explicit local time-stepping (LTS) method was proposed, which overcomes the severe bottleneck due to a few small elements by taking small time-steps in the locally refined region and larger steps elsewhere. Here a rigorous convergence proof is presented for the fully discrete LTS-LF method when combined with a standard conforming finite element method (FEM) in space. Numerical results further illustrate the usefulness of the LTS-LF Galerkin FEM in the presence of corner singularities.
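
    Written from memory rather than taken from the paper, the sketch below illustrates the two-rate structure of a leap-frog LTS step for a semi-discrete wave equation u'' = -A u: the "fine" degrees of freedom selected by a diagonal 0/1 matrix P are advanced with p substeps of size dt/p while the coarse contribution is held fixed, and with P = I and p = 1 the update reduces to standard leap-frog. The matrices, the splitting, and all parameters are toy assumptions; consult the paper (and [18]) for the precise energy-conserving formulation.

```python
import numpy as np

def lts_leapfrog_step(A, P, u_curr, u_prev, dt, p):
    """One coarse step of a leap-frog local time-stepping update for u'' = -A u.
    P marks the fine DOFs (diagonal 0/1); they are advanced with p substeps of
    size dt/p while the coarse part of the force is frozen at u_curr."""
    tau = dt / p
    I = np.eye(len(u_curr))
    coarse_force = A @ ((I - P) @ u_curr)              # frozen coarse contribution
    z_prev, z = u_curr, u_curr - 0.5 * tau**2 * (coarse_force + A @ (P @ u_curr))
    for _ in range(p - 1):                             # leap-frog substeps for the fine part
        z_prev, z = z, 2 * z - z_prev - tau**2 * (coarse_force + A @ (P @ z))
    return 2 * z - u_prev                              # solution at the next coarse time level

# Toy 1-D problem: uniform discrete Laplacian, with a patch of DOFs treated as "fine".
n = 60
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
P = np.diag(((np.arange(n) >= 25) & (np.arange(n) < 35)).astype(float))
u_prev = np.exp(-0.1 * (np.arange(n) - 15.0) ** 2)     # smooth initial pulse
u_curr = u_prev.copy()                                  # (approximately) zero initial velocity
for _ in range(100):
    u_curr, u_prev = lts_leapfrog_step(A, P, u_curr, u_prev, dt=0.5, p=4), u_curr
print("max |u| after 100 coarse steps:", np.abs(u_curr).max())
```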

    Uncertainty Quantification by MLMC and Local Time-stepping For Wave Propagation

    Because of their robustness, efficiency, and non-intrusiveness, Monte Carlo methods are probably the most popular approach in uncertainty quantification for computing expected values of quantities of interest (QoIs). Multilevel Monte Carlo (MLMC) methods significantly reduce the computational cost by distributing the sampling across a hierarchy of discretizations and allocating most samples to the coarser grids. For time-dependent problems, spatial coarsening typically permits a larger time-step. Geometric constraints, however, may impede uniform coarsening, forcing some elements to remain small across all levels. If explicit time-stepping is used, the time-step on each level is then dictated by its smallest element for numerical stability. Hence, the increasingly stringent CFL condition on the time-step on coarser levels significantly reduces the advantages of the multilevel approach. By adapting the time-step to the locally refined elements on each level, local time-stepping (LTS) methods restore the efficiency of MLMC even in the presence of complex geometry, without sacrificing explicitness and inherent parallelism.
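
    For context, here is a minimal sketch of the plain MLMC estimator the abstract builds on: the expectation on the finest level is written as a telescoping sum of level corrections, with most samples drawn on the cheap coarse levels. The qoi function is a synthetic stand-in; in the setting of the abstract each evaluation would be a wave-propagation solve, with LTS used on levels whose meshes retain small elements.

```python
import numpy as np

rng = np.random.default_rng(0)

def qoi(level, omega):
    """Stand-in for the quantity of interest computed on discretization `level`
    for random input `omega` (a cheap synthetic model whose bias decays with
    level; a real run would solve the wave equation here)."""
    return np.sin(omega) + 2.0 ** (-level) * np.cos(3.0 * omega)

def mlmc_estimate(samples_per_level):
    """Plain MLMC telescoping estimator:
        E[Q_L] ~ E[Q_0] + sum_{l=1..L} E[Q_l - Q_{l-1}],
    each term estimated independently; a correction term evaluates both levels
    with the *same* random input so that its variance is small."""
    estimate = 0.0
    for level, n in enumerate(samples_per_level):
        omegas = rng.uniform(0.0, 2.0 * np.pi, size=n)
        if level == 0:
            estimate += np.mean(qoi(0, omegas))
        else:
            estimate += np.mean(qoi(level, omegas) - qoi(level - 1, omegas))
    return estimate

# Most samples on the coarse levels, only a handful on the expensive fine ones.
print(mlmc_estimate([4000, 1000, 250, 60]))
```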