17 research outputs found

    Performance Enhancements for the Lattice-Boltzmann Solver in the LAVA Framework

    Performance enhancements in NASA's recently developed Lattice Boltzmann solver within the Launch Ascent and Vehicle Aerodynamics (LAVA) framework are presented. Two key developments are highlighted. First, a coarse-fine interface treatment that discretely conserves mass and momentum has been implemented, verified, and validated. Second, code optimizations targeting improved serial and parallel performance are presented. For a simple turbulent Taylor-Green Vortex problem, we demonstrate a 2.3 times speedup over the baseline code on a single Skylake-SP node containing 40 physical cores, and a 2.14 times speedup on 64 nodes containing 2560 physical cores. In addition, we show that the optimizations enable the code to scale almost perfectly to 20480 physical cores, where, including ghost cells, the problem size was 10 billion cells.

    A Scalable and Modular Software Architecture for Finite Elements on Hierarchical Hybrid Grids

    In this article, a new generic higher-order finite-element framework for massively parallel simulations is presented. The modular software architecture is carefully designed to exploit the resources of modern and future supercomputers. Combining an unstructured topology with structured grid refinement facilitates high geometric adaptability and matrix-free multigrid implementations with excellent performance. Different abstraction levels and fully distributed data structures additionally ensure high flexibility, extensibility, and scalability. The software concepts support sophisticated load balancing and flexibly combining finite element spaces. Example scenarios with coupled systems of PDEs show the applicability of the concepts to performing geophysical simulations. Comment: Preprint of an article submitted to International Journal of Parallel, Emergent and Distributed Systems (Taylor & Francis)

    Dynamic Load Balancing Techniques for Particulate Flow Simulations

    Parallel multiphysics simulations often suffer from load imbalances originating from the coupling of algorithms with spatially and temporally varying workloads. It is thus desirable to minimize these imbalances to reduce the time to solution and to better utilize the available hardware resources. Taking particulate flows as an illustrative example application, we present and evaluate load balancing techniques that tackle this challenging task. This involves a load estimation step in which the currently generated workload is predicted. We describe in detail how such a workload estimator can be developed. In a second step, load distribution strategies like space-filling curves or graph partitioning are applied to dynamically distribute the load among the available processes. To compare and analyze their performance, we apply these techniques to a benchmark scenario and observe a reduction of the load imbalances by almost a factor of four. This results in a decrease of the overall runtime by 14% for space-filling curves.
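
The two-step idea in the abstract above (estimate a workload per block, then distribute blocks along a space-filling curve) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the Morton (Z-order) curve is just one possible space-filling curve, the per-block weights stand in for the output of a workload estimator, and the names `morton_key`, `partition_by_curve`, and `imbalance` are hypothetical.

```python
from typing import List, Sequence, Tuple

Block = Tuple[Tuple[int, int], float]  # ((x, y) grid index, estimated weight)


def morton_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y to get a 2D Morton (Z-order) index."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def partition_by_curve(blocks: Sequence[Block], num_procs: int) -> List[int]:
    """Assign each block to a process by cutting the curve into equal-weight pieces.

    The weights are assumed to come from some workload estimator (hypothetical here).
    """
    total = sum(weight for _, weight in blocks)
    if total <= 0 or num_procs <= 0:
        raise ValueError("need positive total weight and at least one process")

    # Visit blocks in curve order so that each process gets a contiguous segment.
    order = sorted(range(len(blocks)), key=lambda i: morton_key(*blocks[i][0]))

    owners = [0] * len(blocks)
    running = 0.0
    for i in order:
        weight = blocks[i][1]
        # Use the midpoint of the block's weight interval to pick its owner.
        midpoint = running + 0.5 * weight
        owners[i] = min(int(midpoint * num_procs / total), num_procs - 1)
        running += weight
    return owners


def imbalance(blocks: Sequence[Block], owners: Sequence[int], num_procs: int) -> float:
    """Ratio of maximum to average process load (1.0 means perfectly balanced)."""
    loads = [0.0] * num_procs
    for (_, weight), proc in zip(blocks, owners):
        loads[proc] += weight
    return max(loads) / (sum(loads) / num_procs)


if __name__ == "__main__":
    # A 4x4 grid of blocks with uniform weight, distributed over 4 processes.
    grid = [((x, y), 1.0) for x in range(4) for y in range(4)]
    assignment = partition_by_curve(grid, 4)
    print(assignment, imbalance(grid, assignment, 4))
```

In a dynamic setting, a sketch like this would be rerun whenever the estimated weights drift, so that the cut points along the curve move with the workload. Graph partitioning, the other strategy named in the abstract, is not covered here.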