8 research outputs found

    Scalable Simulation of Realistic Volume Fraction Red Blood Cell Flows through Vascular Networks

    High-resolution blood flow simulations have potential for developing a better understanding of biophysical phenomena at the microscale, such as vasodilation, vasoconstriction and overall vascular resistance. To this end, we present a scalable platform for the simulation of red blood cell (RBC) flows through complex capillaries by modeling the physical system as a viscous fluid with immersed deformable particles. We describe a parallel boundary integral equation solver for general elliptic partial differential equations, which we apply to Stokes flow through blood vessels. We also detail a parallel collision-avoiding algorithm to ensure that RBCs and the blood vessel remain contact-free. We have scaled our code on Stampede2 at the Texas Advanced Computing Center up to 34,816 cores. Our largest simulation enforces a contact-free state between four billion surface elements and solves for three billion degrees of freedom on one million RBCs and a blood vessel composed of two million patches.

    Scalable Parallel Delaunay Image-to-Mesh Conversion for Shared and Distributed Memory Architectures

    Mesh generation is an essential component of many engineering applications. The ability to generate meshes in parallel is critical for the scalability of the entire Finite Element Method (FEM) pipeline. However, parallel mesh generation applications belong to the broader class of adaptive and irregular problems, and are among the most complex, challenging, and labor-intensive to develop and maintain. In this thesis, we summarize several years of progress on a novel framework for highly scalable, guaranteed-quality mesh generation for finite element analysis in three dimensions. We studied and developed parallel mesh generation algorithms on both shared and distributed memory architectures. We present a novel two-level parallel tetrahedral mesh generation framework capable of delivering and sustaining close to 6000 concurrent work units (cores). We achieve this by leveraging concurrency at two different granularity levels, using a hybrid message-passing and multi-threaded execution model that suits the hardware hierarchy of distributed memory clusters. An end-user productivity and scalability study performed on up to 6000 cores indicated very good end-user productivity, with about 300 million tetrahedra per second and a weak scaling speedup of about 3600. Both results suggest that, compared to the best previous algorithm, we have improved performance, measured in speed (elements per second), by more than 7000 times while using about 180 times more CPUs, for geometries that are many orders of magnitude more complex.

    Development and Application of Numerical Methods in Biomolecular Solvation

    This work addresses the development of fast summation methods for long-range particle interactions and their application to problems in biomolecular solvation, which describes the interaction of proteins or other biomolecules with their solvent environment. At the core of this work are treecodes, tree-based fast summation methods which, for N particles, reduce the cost of computing particle interactions from O(N^2) to O(N log N). Background on fast summation methods and treecodes in particular, as well as several treecode improvements developed in the early stages of this work, are presented. Building on treecodes, dual tree traversal (DTT) methods are another class of tree-based fast summation methods which reduce the cost of computing particle interactions for N particles to O(N). The primary result of this work is the development of an O(N) dual tree traversal fast summation method based on barycentric Lagrange polynomial interpolation (BLDTT). This method is implemented to run across multiple GPU compute nodes in the software package BaryTree. Across different problem sizes, particle distributions, geometries, and interaction kernels, the BLDTT shows consistently better performance than the previously developed barycentric Lagrange treecode (BLTC). The first major biomolecular solvation application of fast summation methods presented is to the Poisson–Boltzmann implicit solvent model, and in particular, the treecode-accelerated boundary integral Poisson–Boltzmann solver (TABI-PB). The work on TABI-PB consists of three primary projects and an application. The first project investigates the impact of various biomolecular surface meshing codes on TABI-PB and integrates the NanoShaper software into the package, resulting in significantly better performance. Second, a node patch method for discretizing the system of integral equations is introduced to replace the previous centroid collocation scheme, resulting in faster convergence of solvation energies. Third, a new version of TABI-PB with GPU acceleration based on the BLDTT is developed, further improving scalability. An application investigating the binding of biomolecular complexes is undertaken using the previous Taylor treecode-based version of TABI-PB. In addition to these projects, work performed over the course of this thesis integrated TABI-PB into the popular Adaptive Poisson–Boltzmann Solver (APBS) developed at Pacific Northwest National Laboratory. The second major application of fast summation methods is to the 3D reference interaction site model (3D-RISM), a statistical-mechanics-based continuum solvation model. This work applies cluster-particle Taylor expansion treecodes to treat long-range asymptotic Coulomb-like potentials in 3D-RISM, and results in significant speedups and improved scalability for the 3D-RISM package implemented in AmberTools. Additionally, preliminary work on specialized GPU-accelerated treecodes based on BaryTree for 3D-RISM long-range asymptotic functions is presented.
    PhD, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/168120/1/lwwilson_1.pd
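
    The abstract above names barycentric Lagrange polynomial interpolation as the building block of its fast summation methods. As a rough illustration only, the sketch below evaluates a one-dimensional barycentric Lagrange interpolant at Chebyshev points of the second kind. It is not taken from BaryTree: the function names, the choice of nodes, and the sample kernel 1/(x + 3) are all assumptions made for this example.

```python
import math


def chebyshev_points(n):
    """Return the n + 1 Chebyshev points of the second kind on [-1, 1]."""
    return [math.cos(k * math.pi / n) for k in range(n + 1)]


def barycentric_weights(n):
    """Barycentric weights for those points: (-1)^k, halved at both endpoints."""
    weights = [(-1.0) ** k for k in range(n + 1)]
    weights[0] *= 0.5
    weights[n] *= 0.5
    return weights


def barycentric_interpolate(x, nodes, weights, values):
    """Evaluate the interpolant through (nodes, values) at x.

    Uses the second barycentric form:
        p(x) = sum_k(w_k f_k / (x - x_k)) / sum_k(w_k / (x - x_k)).
    """
    numerator = 0.0
    denominator = 0.0
    for node, weight, value in zip(nodes, weights, values):
        if x == node:
            # At a node the formula is singular; the interpolant equals the data.
            return value
        term = weight / (x - node)
        numerator += term * value
        denominator += term
    return numerator / denominator


def sample_kernel(x):
    """Hypothetical smooth, Coulomb-like kernel with its singularity outside [-1, 1]."""
    return 1.0 / (x + 3.0)


# Interpolate the sample kernel with degree 16 and measure the worst error
# on a uniform grid of evaluation points in [-1, 1].
degree = 16
nodes = chebyshev_points(degree)
weights = barycentric_weights(degree)
values = [sample_kernel(node) for node in nodes]
grid = [-1.0 + 2.0 * i / 200 for i in range(201)]
max_error = max(
    abs(barycentric_interpolate(x, nodes, weights, values) - sample_kernel(x))
    for x in grid
)
```

    In a treecode, an interpolant of this kind stands in for the interaction kernel between well-separated clusters, so the kernel is evaluated only at a small set of interpolation points. The one-dimensional case here is meant to show the formula, not the tree traversal.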