30 research outputs found

    Scalable hierarchical parallel algorithm for the solution of super large-scale sparse linear equations

    A parallel linear-equation solver capable of effectively using 1000+ processors has become the bottleneck of large-scale implicit engineering simulations. In this paper, we present a new hierarchical parallel master-slave-structural iterative algorithm for the solution of super large-scale sparse linear equations on a distributed-memory computer cluster. By alternately performing global equilibrium computation and local relaxation, our proposed algorithm reaches the specified accuracy requirement in a few iterative steps. Moreover, each set/slave-processor communicates mainly with its nearest neighbors, and the data transferred between sets/slave-processors and the master is always far below the set-neighbor communication. The corresponding algorithm for implicit finite element analysis has been implemented based on the MPI library, and a super large 2-dimensional square system of triangle-lattice truss structure under random static loads is simulated with over one billion degrees of freedom and up to 2001 processors on the "Exploration 100" cluster at Tsinghua University. The numerical experiments demonstrate that this algorithm has excellent parallel efficiency and high scalability, and it may have broad application in other implicit simulations. Comment: 23 pages, 9 figures, 1 table

    Computationally-Optimized Bone Mechanical Modeling from High-Resolution Structural Images

    Image-based mechanical modeling of the complex micro-structure of human bone has shown promise as a non-invasive method for characterizing bone strength and fracture risk in vivo. In particular, elastic moduli obtained from image-derived micro-finite element (μFE) simulations have been shown to correlate well with results obtained by mechanical testing of cadaveric bone. However, most existing large-scale finite-element simulation programs require significant computing resources, which hampers their use in common laboratory and clinical environments. In this work, we theoretically derive and computationally evaluate the resources needed to perform such simulations (in terms of computer memory and computation time), which are dependent on the number of finite elements in the image-derived bone model. A detailed description of our approach is provided, which is specifically optimized for μFE modeling of the complex three-dimensional architecture of trabecular bone. Our implementation includes domain decomposition for parallel computing, a novel stopping criterion, and a system for speeding up convergence by pre-iterating on coarser grids. The performance of the system is demonstrated on a machine with dual quad-core Xeon 3.16 GHz CPUs equipped with 40 GB of RAM. Models of distal tibia derived from 3D in-vivo MR images in a patient comprising 200,000 elements required less than 30 seconds to converge (and 40 MB RAM). To illustrate the system's potential for large-scale μFE simulations, axial stiffness was estimated from high-resolution micro-CT images of a voxel array of 90 million elements comprising the human proximal femur in seven hours of CPU time. In conclusion, the system described should enable image-based finite-element bone simulations in practical computation times on high-end desktop computers with applications to laboratory studies and clinical imaging.

    Influence of Vertical Trabeculae on the Compressive Strength of the Human Vertebra

    Vertebral strength, a key etiologic factor of osteoporotic fracture, may be affected by the relative amount of vertically oriented trabeculae. To better understand this issue, we performed experimental compression testing, high-resolution micro–computed tomography (µCT), and micro–finite-element analysis on 16 elderly human thoracic ninth (T9) whole vertebral bodies (ages 77.5 ± 10.1 years). Individual trabeculae segmentation of the µCT images was used to classify the trabeculae by their orientation. We found that the bone volume fraction (BV/TV) of just the vertical trabeculae accounted for substantially more of the observed variation in measured vertebral strength than did the bone volume fraction of all trabeculae (r² = 0.83 versus 0.59, p < .005). The bone volume fraction of the oblique or horizontal trabeculae was not associated with vertebral strength. Finite-element analysis indicated that removal of the cortical shell did not appreciably alter these trends; it also revealed that the major load paths occur through parallel columns of vertically oriented bone. Taken together, these findings suggest that variation in vertebral strength across individuals is due primarily to variations in the bone volume fraction of vertical trabeculae. The vertical tissue fraction, a new bone quality parameter that we introduced to reflect these findings, was both a significant predictor of vertebral strength alone (r² = 0.81) and after accounting for variations in total bone volume fraction in multiple regression (total R² = 0.93). We conclude that the vertical tissue fraction is a potentially powerful microarchitectural determinant of vertebral strength. © 2011 American Society for Bone and Mineral Research

    Performance Portable Solid Mechanics via Matrix-Free p-Multigrid

    Finite element analysis of solid mechanics is a foundational tool of modern engineering, with low-order finite element methods and assembled sparse matrices representing the industry standard for implicit analysis. We use performance models and numerical experiments to demonstrate that high-order methods greatly reduce the costs to reach engineering tolerances while enabling effective use of GPUs. We demonstrate the reliability, efficiency, and scalability of matrix-free p-multigrid methods with algebraic multigrid coarse solvers through large deformation hyperelastic simulations of multiscale structures. We investigate accuracy, cost, and execution time on multi-node CPU and GPU systems for moderate to large models using AMD MI250X (OLCF Crusher), NVIDIA A100 (NERSC Perlmutter), and V100 (LLNL Lassen and OLCF Summit), resulting in order of magnitude efficiency improvements over a broad range of model properties and scales. We discuss efficient matrix-free representation of Jacobians and demonstrate how automatic differentiation enables rapid development of nonlinear material models without impacting debuggability and workflows targeting GPUs.

    A computational assessment of the independent contribution of changes in canine trabecular bone volume fraction and microarchitecture to increased bone strength with suppression of bone turnover

    This study addressed the effects of changes in trabecular microarchitecture induced by suppressed bone turnover—including changes to the remodeling space—on the trabecular bone strength–volume fraction characteristics independent of changes in tissue material properties. Twenty female beagle dogs, aged 1–2 years, were treated daily with either oral saline (n=10 control) or high doses of oral risedronate (0.5 mg/kg/day, n=10 suppressed) for a period of 1 year, the latter designed (and confirmed) to substantially suppress bone turnover. High-resolution micro-CT-based finite element models (18-μm voxel size) of canine trabecular bone cores (n=2 per vertebral body) extracted from the T-10 vertebrae were analyzed in both compressive and torsional loading cases. The same tissue-level material properties were used in all models, thus providing measures of tissue-normalized strength due only to changes in the microarchitecture. Suppressed bone turnover resulted in more plate-like architecture with a thicker and more dense trabecular structure, but the relationship between the microarchitectural parameters and volume fraction was unaltered (p>0.05). Though the suppressed group had a greater tissue-normalized strength as compared to the control group (p0.13) or torsion (p>0.09). In this high-density, non-osteoporotic animal model, the increases in tissue-normalized strength seen with suppression of bone turnover were entirely commensurate with increases in bone volume fraction and thus, no evidence of microarchitecture-related or “stress-riser” effects which may disproportionately affect strength were found

    Effects of suppression of bone turnover on cortical and trabecular load sharing in the canine vertebral body

    The relative biomechanical effects of antiresorptive treatment on cortical thickness vs. trabecular bone microarchitecture in the spine are not well understood. To address this, T-10 vertebral bodies were analyzed from skeletally mature female beagle dogs that had been treated with oral saline (n=8 control) or a high dose of oral risedronate (0.5 mg/kg/day, n=9 RIS-suppressed) for 1 year. Two linearly elastic finite element models (36-μm voxel size) were generated for each vertebral body—a whole-vertebra model and a trabecular-compartment model—and subjected to uniform compressive loading. Tissue-level material properties were kept constant to isolate the effects of changes in microstructure alone. Suppression of bone turnover resulted in increased stiffness of the whole vertebra (20.9%, p=0.02) and the trabecular compartment (26.0%, p=0.01), while the computed stiffness of the cortical shell (difference between whole-vertebra and trabecular-compartment stiffnesses, 11.7%, p=0.15) was statistically unaltered. Regression analyses indicated subtle but significant changes in the relative structural roles of the cortical shell and the trabecular compartment. Despite higher average cortical shell thickness in RIS-suppressed vertebrae (23.1%, p=0.002), the maximum load taken by the shell for a given value of shell mass fraction was lower (p=0.005) for the RIS-suppressed group. Taken together, our results suggest that—in this canine model—the overall changes in the compressive stiffness of the vertebral body due to suppression of bone turnover were attributable more to the changes in the trabecular compartment than in the cortical shell. Such biomechanical studies provide unique insight into higher-scale effects such as the biomechanical responses of the whole vertebra.

    Progress Towards Petascale Applications in Biology: Status in 2006

    Petascale computing is currently a common topic of discussion in the high performance computing community. Biological applications, particularly protein folding, are often given as examples of the need for petascale computing. There are at present biological applications that scale to execution rates of approximately 55 teraflops on a special-purpose supercomputer and 2.2 teraflops on a general-purpose supercomputer. In comparison, Qbox, a molecular dynamics code used to model metals, has an achieved performance of 207.3 teraflops. It may be useful to increase the extent to which operation rates and total calculations are reported in discussion of biological applications, and use total operations (integer and floating point combined) rather than (or in addition to) floating point operations as the unit of measure. Increased reporting of such metrics will enable better tracking of progress as the research community strives for the insights that will be enabled by petascale computing. This research was supported in part by the Indiana Genomics Initiative and the Indiana Metabolomics and Cytomics Initiative. The Indiana Genomics Initiative of Indiana University and the Indiana Metabolomics and Cytomics Initiative of Indiana University are supported in part by Lilly Endowment, Inc. The authors also wish to thank IBM, Inc. for support via Shared University Research Grants and partnerships via IU’s relationship as an IBM Life Sciences Institute of Innovation. Indiana University also thanks the TeraGrid partners; IU’s participation in the TeraGrid is funded by National Science Foundation grant numbers 0338618, 0504075, and 0451237. The early development of this paper was supported by a Fulbright Senior Scholars award from the Council for International Exchange of Scholars (CIES) and the United States Department of State to Dr. Craig A. Stewart; Matthias Mueller and the Technische Universität Dresden were hosts. Many reviewers contributed to the improvement of the ideas expressed in this paper and are gratefully appreciated; Thom Dunning, Robert Germain, Chris Mueller, Jim Phillips, Richard Repasky, Ralph Roskies, and Allan Snavely are thanked particularly for their insights.