23 research outputs found

    Best practices for HPM-assisted performance engineering on modern multicore processors

    Many tools and libraries employ hardware performance monitoring (HPM) on modern processors, and using this data for performance assessment and as a starting point for code optimizations is very popular. However, such data is only useful if it is interpreted with care, and if the right metrics are chosen for the right purpose. We demonstrate the sensible use of hardware performance counters in the context of a structured performance engineering approach for applications in computational science. Typical performance patterns and their respective metric signatures are defined, and some of them are illustrated using case studies. Although these generic concepts do not depend on specific tools or environments, we restrict ourselves to modern x86-based multicore processors and use the likwid-perfctr tool under the Linux OS. Comment: 10 pages, 2 figures

    Eigen-AD: Algorithmic Differentiation of the Eigen Library

    In this work we present useful techniques and possible enhancements when applying an Algorithmic Differentiation (AD) tool to the linear algebra library Eigen, using our in-house AD by overloading (AD-O) tool dco/c++ as a case study. After outlining performance and feasibility issues when calculating derivatives for the official Eigen release, we propose Eigen-AD, which enables different optimization options for an AD-O tool by providing add-on modules for Eigen. The range of features includes a better handling of expression templates for general performance improvements, as well as implementations of symbolically derived expressions for calculating derivatives of certain core operations. The software design allows an AD-O tool to provide specializations to automatically include symbolic operations and thereby keep the look and feel of plain AD by overloading. As a showcase, dco/c++ is provided with such a module and its significant performance improvements are validated by benchmarks. Comment: Updated with accepted version for ICCS 2020 conference proceedings. The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-50371-0_51. See v1 for the original, extended preprint. 14 pages, 7 figures

    Scalable Simulation of Realistic Volume Fraction Red Blood Cell Flows through Vascular Networks

    High-resolution blood flow simulations have potential for developing a better understanding of biophysical phenomena at the microscale, such as vasodilation, vasoconstriction and overall vascular resistance. To this end, we present a scalable platform for the simulation of red blood cell (RBC) flows through complex capillaries by modeling the physical system as a viscous fluid with immersed deformable particles. We describe a parallel boundary integral equation solver for general elliptic partial differential equations, which we apply to Stokes flow through blood vessels. We also detail a parallel collision-avoiding algorithm to ensure RBCs and the blood vessel remain contact-free. We have scaled our code on Stampede2 at the Texas Advanced Computing Center up to 34,816 cores. Our largest simulation enforces a contact-free state between four billion surface elements and solves for three billion degrees of freedom on one million RBCs and a blood vessel composed of two million patches.

    Massively parallel rigid body dynamics simulations
