4 research outputs found

    Flare: Flexible In-Network Allreduce

    The allreduce operation is one of the most commonly used communication routines in distributed applications. To improve its bandwidth and to reduce network traffic, this operation can be accelerated by offloading it to network switches, which aggregate the data received from the hosts and send the aggregated result back to them. However, existing solutions provide limited customization opportunities and might provide suboptimal performance when dealing with custom operators and data types, with sparse data, or when reproducibility of the aggregation is a concern. To deal with these problems, in this work we design a flexible programmable switch using PsPIN, a RISC-V architecture implementing the sPIN programming model, as a building block. We then design, model, and analyze different algorithms for executing the aggregation on this architecture, showing performance improvements compared to state-of-the-art approaches.
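
    The reproducibility concern mentioned in this abstract can be illustrated with a minimal sketch. It assumes a sum allreduce over one double-precision value per host; the function name `allreduce_sum` and the sample values are hypothetical and are not taken from the paper. Because floating-point addition is not associative, the order in which contributions are aggregated can change the result that every host receives.

```python
import math

def allreduce_sum(host_values, order):
    """Sum one value per host in the given arrival order and return
    the list of results that each host would receive back."""
    total = 0.0
    for host in order:
        total += host_values[host]
    # Every host gets the same aggregated value.
    return [total] * len(host_values)

# Hypothetical contributions from four hosts.
host_values = [1e16, 1.0, -1e16, 1.0]

# Same data, two different aggregation orders.
in_order = allreduce_sum(host_values, [0, 1, 2, 3])
reordered = allreduce_sum(host_values, [0, 2, 1, 3])

# Correctly rounded reference sum, independent of order.
exact = math.fsum(host_values)

print(in_order[0], reordered[0], exact)
```

    In this sketch the two orders return different sums for identical inputs, while `math.fsum` gives an order-independent reference. A reproducible aggregation would need either a fixed aggregation order or an order-independent summation scheme.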

    Effect of Heart Structure on Ventricular Fibrillation in the Rabbit: A Simulation Study

    Ventricular fibrillation (VF) is a lethal condition that affects millions worldwide. The mechanism underlying VF is unstable reentrant electrical waves rotating around lines called filaments. These complex spatio-temporal patterns can be studied using both experimental and numerical methods. Computer simulations provide unique insights, including high-resolution dynamics throughout the heart and systematic control of quantities such as fiber orientation and cellular kinetics, that are not feasible experimentally. Here we study filament dynamics using two bi-ventricular 3-D high-resolution rabbit heart geometries, one with detailed fine structure and another without fine structure. We studied filament dynamics using anisotropic and isotropic conductivities, and with four cellular action potential models with different recovery kinetics. Spiral wave dynamics observed in isotropic two-dimensional sheets were not predictive of the behavior in the whole heart. In 2-D the four cell models exhibited stable reentry, meandering spiral waves, and spiral-wave breakup. In the whole heart with fine structure, all simulation results exhibited complex dynamics reminiscent of fibrillation observed experimentally. In the whole heart without fine structure, anisotropy acted to destabilize filament dynamics, although the number of filaments was reduced compared to the heart with structure. In addition, in isotropic hearts without structure, the two cell models that exhibited meandering spiral waves in 2-D stabilized into figure-of-eight surface patterns. We also studied the sensitivity of filament dynamics to computer system configuration and initial conditions. After long simulation times, different macroscopic results sometimes occurred across different system configurations, likely due to a lack of bitwise reproducibility. The study conclusions were insensitive to initial condition perturbations; however, the exact number of filaments over time and their trends were altered by these changes. In summary, we present the following new results. First, we provide a new cell model that resembles the surface patterns of VF in the rabbit heart both qualitatively and quantitatively. Second, filament dynamics in the whole heart cannot be predicted from spiral wave dynamics in 2-D, and we identified anisotropy as one destabilizing factor. Third, the exact dynamics of filaments are sensitive to a variety of factors, so we suggest caution in their interpretation and their quantitative analyses.

    On the scalability of CFD tool for supersonic jet flow configurations

    New regulations are imposing noise emission limits on the aviation industry, which is pushing researchers and engineers to invest effort in studying aeroacoustic phenomena. Following this trend, an in-house computational fluid dynamics tool is built to reproduce high-fidelity results of supersonic jet flows for aeroacoustic analogy applications. The solver is written using the large eddy simulation formulation, which is discretized using a finite difference approach and an explicit time integration. Numerical simulations of supersonic jet flows are very expensive and demand efficient high-performance computing. Therefore, non-blocking message passing interface protocols and parallel input/output features are implemented in the code in order to perform simulations that demand up to one billion grid points. The present work addresses the evaluation of code improvements along with the computational performance of the solver running on a computer with a theoretical peak performance of 2.727 PFlops. Different mesh configurations, whose size varies from a few hundred thousand to approximately one billion grid points, are evaluated in the present paper. Calculations are performed using different workloads in order to assess the strong and weak scalability of the parallel computational tool. Moreover, validation results of a realistic flow condition are also presented in the current work.

    On the Reproducibility of MPI Reduction Operations
