99 research outputs found

    Extensible Component Based Architecture for FLASH, A Massively Parallel, Multiphysics Simulation Code

    FLASH is a publicly available high performance application code which has evolved into a modular, extensible software system from a collection of unconnected legacy codes. FLASH has been successful because its capabilities have been driven by the needs of scientific applications, without compromising maintainability, performance, and usability. In its newest incarnation, FLASH3 consists of inter-operable modules that can be combined to generate different applications. The FLASH architecture allows arbitrarily many alternative implementations of its components to co-exist and interchange with each other, resulting in greater flexibility. Further, a simple and elegant mechanism exists for customization of code functionality without the need to modify the core implementation of the source. A built-in unit test framework providing verifiability, combined with a rigorous software maintenance process, allows the code to operate simultaneously in the dual mode of production and development. In this paper we describe the FLASH3 architecture, with emphasis on solutions to the more challenging conflicts arising from solver complexity, portable performance requirements, and legacy codes. We also include results from user surveys conducted in 2005 and 2007, which highlight the success of the code. Comment: 33 pages, 7 figures; revised paper submitted to Parallel Computin

    Parallel block structured adaptive mesh refinement on graphics processing units.

    Block-structured adaptive mesh refinement is a technique that can be used when solving partial differential equations to reduce the number of zones necessary to achieve the required accuracy in areas of interest. These areas (shock fronts, material interfaces, etc.) are recursively covered with finer mesh patches that are grouped into a hierarchy of refinement levels. Despite the potential for large savings in computational requirements and memory usage without a corresponding reduction in accuracy, AMR adds overhead in managing the mesh hierarchy, adding complex communication and data movement requirements to a simulation. In this paper, we describe the design and implementation of a native GPU-based AMR library, including: the classes used to manage data on a mesh patch, the routines used for transferring data between GPUs on different nodes, and the data-parallel operators developed to coarsen and refine mesh data. We validate the performance and accuracy of our implementation using three test problems and two architectures: an eight-node cluster, and over four thousand nodes of Oak Ridge National Laboratory’s Titan supercomputer. Our GPU-based AMR hydrodynamics code performs up to 4.87x faster than the CPU-based implementation, and has been scaled to over four thousand GPUs using a combination of MPI and CUDA

    Resident block-structured adaptive mesh refinement on thousands of graphics processing units

    Block-structured adaptive mesh refinement (AMR) is a technique that can be used when solving partial differential equations to reduce the number of cells necessary to achieve the required accuracy in areas of interest. These areas (shock fronts, material interfaces, etc.) are recursively covered with finer mesh patches that are grouped into a hierarchy of refinement levels. Despite the potential for large savings in computational requirements and memory usage without a corresponding reduction in accuracy, AMR adds overhead in managing the mesh hierarchy, adding complex communication and data movement requirements to a simulation. In this paper, we describe the design and implementation of a resident GPU-based AMR library, including: the classes used to manage data on a mesh patch, the routines used for transferring data between GPUs on different nodes, and the data-parallel operators developed to coarsen and refine mesh data. We validate the performance and accuracy of our implementation using three test problems and two architectures: an 8 node cluster, and 4,196 nodes of Oak Ridge National Laboratory’s Titan supercomputer. Our GPU-based AMR hydrodynamics code performs up to 4.87x faster than the CPU-based implementation, and is scalable on 4,196 K20x GPUs using a combination of MPI and CUDA
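The coarsen and refine operators mentioned in the abstract above can be illustrated with a minimal serial sketch. This is not the library's implementation or API: the function names `coarsen` and `refine`, the refinement ratio of 2, and the choice of simple averaging and piecewise-constant copying are all assumptions made for illustration. Each output cell depends only on a fixed set of input cells, which is the property that makes such operators natural candidates for data-parallel execution.

```python
# Illustrative sketch only: NOT the API of the GPU AMR library described above.
# All names (coarsen, refine, ratio) are hypothetical.

def coarsen(fine, ratio=2):
    """Average each ratio x ratio block of fine cells into one coarse cell."""
    ny, nx = len(fine), len(fine[0])
    assert ny % ratio == 0 and nx % ratio == 0, "patch must divide evenly"
    coarse = []
    for j in range(0, ny, ratio):
        row = []
        for i in range(0, nx, ratio):
            # Gather the fine cells that lie under this coarse cell.
            block = [fine[j + dj][i + di]
                     for dj in range(ratio) for di in range(ratio)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def refine(coarse, ratio=2):
    """Copy each coarse cell value into its ratio x ratio fine children."""
    fine = []
    for row in coarse:
        # Repeat every value `ratio` times along x ...
        fine_row = [value for value in row for _ in range(ratio)]
        # ... and repeat the resulting row `ratio` times along y.
        for _ in range(ratio):
            fine.append(list(fine_row))
    return fine
```

With equal-sized cells, averaging keeps the mean of the field unchanged, so refining a patch and coarsening it again returns the original coarse values. Production codes typically use higher-order interpolation for refinement.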

    Fast, accurate solutions for curvilinear earthquake faults and anelastic strain

    Imaging the anelastic deformation within the crust and lithosphere using surface geophysical data remains a significant challenge in part due to the wide range of physical processes operating at different depths and to various levels of localization that they embody. Models of Earth's elastic properties from seismological imaging combined with geodetic modeling may form the basis of comprehensive rheological models of Earth's interior. However, representing the structural complexity of faults and shear zones in numerical models of deformation still constitutes a major difficulty. Here, we present numerical techniques for high-precision models of deformation and stress around both curvilinear faults and volumes undergoing anelastic (irreversible) strain in a heterogeneous elastic half-space. To that end, we enhance the software Gamra to model triangular and rectangular fault patches and tetrahedral and cuboidal strain volumes. This affords a means of rapid and accurate calculations of elasto-static Green's functions for localized (e.g., faulting) and distributed (e.g., viscoelastic) deformation in Earth's crust and lithosphere. We demonstrate the correctness of the method with analytic tests, and we illustrate its practical performance by solving for coseismic and postseismic deformation following the 2015 Mw 7.8 Gorkha, Nepal earthquake to extremely high precision

    Hybrid finite difference/finite element immersed boundary method

    The immersed boundary method is an approach to fluid-structure interaction that uses a Lagrangian description of the structural deformations, stresses, and forces along with an Eulerian description of the momentum, viscosity, and incompressibility of the fluid-structure system. The original immersed boundary methods described immersed elastic structures using systems of flexible fibers, and even now, most immersed boundary methods still require Lagrangian meshes that are finer than the Eulerian grid. This work introduces a coupling scheme for the immersed boundary method to link the Lagrangian and Eulerian variables that facilitates independent spatial discretizations for the structure and background grid. This approach employs a finite element discretization of the structure while retaining a finite difference scheme for the Eulerian variables. We apply this method to benchmark problems involving elastic, rigid, and actively contracting structures, including an idealized model of the left ventricle of the heart. Our tests include cases in which, for a fixed Eulerian grid spacing, coarser Lagrangian structural meshes yield discretization errors that are as much as several orders of magnitude smaller than errors obtained using finer structural meshes. The Lagrangian-Eulerian coupling approach developed in this work enables the effective use of these coarse structural meshes with the immersed boundary method. This work also contrasts two different weak forms of the equations, one of which is demonstrated to be more effective for the coarse structural discretizations facilitated by our coupling approach

    Towards scalable adaptive mesh refinement on future parallel architectures

    In the march towards exascale, supercomputer architectures are undergoing a significant change. Limited by power consumption and heat dissipation, future supercomputers are likely to be built around a lower-power many-core model. This shift in supercomputer design will require sweeping code changes in order to take advantage of the highly-parallel architectures. Evolving or rewriting legacy applications to perform well on these machines is a significant challenge. Mini-applications, small computer programs that represent the performance characteristics of some larger application, can be used to investigate new programming models and improve the performance of the legacy application by proxy. These applications, being both easy to modify and representative, are essential for establishing a path to move legacy applications into the exascale era. The focus of the work presented in this thesis is the design, development and employment of a new mini-application, CleverLeaf, for shock hydrodynamics with block-structured adaptive mesh refinement (AMR). We report on the development of CleverLeaf, and show how the fresh start provided by a mini-application can be used to develop an application that is flexible, accurate, and easy to employ in the investigation of exascale architectures. We also detail the development of the first reported resident parallel block-structured AMR library for Graphics Processing Units (GPUs). Extending the SAMRAI library using the CUDA programming model, we develop datatypes that store data only in GPU memory, as well as the necessary operators for moving and interpolating data on an adaptive mesh. We show that executing AMR simulations on a GPU is up to 4.8x faster than a CPU, and demonstrate scalability on over 4,000 nodes using a combination of CUDA and MPI. Finally, we show how mini-applications can be employed to improve the performance of production applications on existing parallel architectures by selecting the optimal application configuration. Using CleverLeaf, we identify the most appropriate configurations on three contemporary supercomputer architectures. Selecting the best parameters for our application can reduce run-time by up to 82% and reduce memory usage by up to 32%

    Quasi-static image-based immersed boundary-finite element model of human left ventricle in diastole

    SUMMARY: Finite stress and strain analyses of the heart provide insight into the biomechanics of myocardial function and dysfunction. Herein, we describe progress toward dynamic patient-specific models of the left ventricle using an immersed boundary (IB) method with a finite element (FE) structural mechanics model. We use a structure-based hyperelastic strain-energy function to describe the passive mechanics of the ventricular myocardium, a realistic anatomical geometry reconstructed from clinical magnetic resonance images of a healthy human heart, and a rule-based fiber architecture. Numerical predictions of this IB/FE model are compared with results obtained by a commercial FE solver. We demonstrate that the IB/FE model yields results that are in good agreement with those of the conventional FE model under diastolic loading conditions, and the predictions of the LV model using either numerical method are shown to be consistent with previous computational and experimental data. These results are among the first to analyze the stress and strain predictions of IB models of ventricular mechanics, and they serve both to verify the IB/FE simulation framework and to validate the IB/FE model. Moreover, this work represents an important step toward using such models for fully dynamic fluid–structure interaction simulations of the heart

    Dynamic finite-strain modelling of the human left ventricle in health and disease using an immersed boundary-finite element method

    Detailed models of the biomechanics of the heart are important both for developing improved interventions for patients with heart disease and also for patient risk stratification and treatment planning. For instance, stress distributions in the heart affect cardiac remodelling, but such distributions are not presently accessible in patients. Biomechanical models of the heart offer detailed three-dimensional deformation, stress and strain fields that can supplement conventional clinical data. In this work, we introduce dynamic computational models of the human left ventricle (LV) that are derived from clinical imaging data obtained from a healthy subject and from a patient with a myocardial infarction (MI). Both models incorporate a detailed invariant-based orthotropic description of the passive elasticity of the ventricular myocardium along with a detailed biophysical model of active tension generation in the ventricular muscle. These constitutive models are employed within a dynamic simulation framework that accounts for the inertia of the ventricular muscle and the blood that is based on an immersed boundary (IB) method with a finite element description of the structural mechanics. The geometry of the models is based on data obtained non-invasively by cardiac magnetic resonance (CMR). CMR imaging data are also used to estimate the parameters of the passive and active constitutive models, which are determined so that the simulated end-diastolic and end-systolic volumes agree with the corresponding volumes determined from the CMR imaging studies. Using these models, we simulate LV dynamics from end-diastole to end-systole. The results of our simulations are shown to be in good agreement with subject-specific CMR-derived strain measurements and also with earlier clinical studies on human LV strain distributions

    Exploring performance data with boxfish

    The growth in size and complexity of scaling applications and the systems on which they run poses challenges in analyzing and improving their overall performance. With metrics coming from thousands or millions of processes, visualization techniques are necessary to make sense of the increasing amount of data. To aid the process of exploration and understanding, we announce the initial release of Boxfish, an extensible tool for manipulating and visualizing data pertaining to application behavior. Combining and visually presenting data and knowledge from multiple domains, such as the application's communication patterns and the hardware's network configuration and routing policies, can yield the insight necessary to discover the underlying causes of observed behavior. Boxfish allows users to query, filter and project data across these domains to create interactive, linked visualizations