158 research outputs found

    ACiS: smart switches with application-level acceleration

    Network performance has contributed fundamentally to the growth of supercomputing over the past decades. In parallel, High Performance Computing (HPC) peak performance has depended first on ever faster and denser CPUs, and then on increasing density alone. As operating frequency, and now feature size, have levelled off, two new approaches are becoming central to achieving higher net performance: configurability and integration. Configurability enables hardware to map to the application, as well as vice versa. Integration enables system components that have generally been single function (e.g., a network to transport data) to have additional functionality, e.g., also to operate on that data. More generally, integration enables compute-everywhere: not just in the CPU and accelerator, but also in the network and, more specifically, the communication switches. In this thesis, we propose four novel methods of enhancing HPC performance through Advanced Computing in the Switch (ACiS). More specifically, we propose various flexible and application-aware accelerators that can be embedded into or attached to existing communication switches to improve the performance and scalability of HPC and Machine Learning (ML) applications. We follow a modular design discipline by introducing composable plugins that successively add ACiS capabilities. In the first work, we propose an inline accelerator for communication switches that supports user-definable collective operations. MPI collective operations can often be performance killers in HPC applications; we seek to solve this bottleneck by offloading them to reconfigurable hardware within the switch itself. We also introduce a novel mechanism that enables the hardware to support MPI communicators of arbitrary shape and that is scalable to very large systems. In the second work, we propose a look-aside accelerator for communication switches that is capable of processing packets at line-rate. 
This method addresses functions that require loops and state. The proposed in-switch accelerator is based on a RISC-V compatible Coarse Grained Reconfigurable Array (CGRA). To facilitate usability, we have developed a framework that compiles user-provided C/C++ code to appropriate back-end instructions for configuring the accelerator. In the third work, we extend ACiS to support fused collectives and the combining of collectives with map operations. We observe that there is an opportunity to fuse communication (collectives) with computation. Since the computation can vary for different applications, ACiS support should be programmable in this method. In the fourth work, we propose that switches with ACiS support can control and manage the execution of applications, i.e., that the switch be an active device with decision-making capabilities. Switches have a central view of the network; they can collect telemetry information and monitor application behavior and then use this information for control, decision-making, and coordination of nodes. We evaluate the feasibility of ACiS through extensive RTL-based simulation as well as deployment in an open-access cloud infrastructure. Using this simulation framework, with a Graph Convolutional Network (GCN) application as a case study, an average speedup of 3.4x across five real-world datasets is achieved on 24 nodes compared to a CPU cluster without ACiS capabilities.

    Numerical simulation of combustion instability: flame thickening and boundary conditions

    Combustion-driven instabilities are a significant barrier to progress in many engineering devices of immense practical relevance; for example, next-generation gas turbines designed to minimise pollutant emissions are susceptible to thermoacoustic instabilities. Numerical simulations of such reactive systems must balance a dynamic interplay between cost, complexity, and retention of system physics. To this end, new computational tools relevant to Large Eddy Simulation (LES) of compressible, reactive flows are proposed and evaluated. High-order flow solvers are susceptible to spurious noise generation at boundaries, which can be very detrimental for combustion simulations. Therefore, Navier-Stokes Characteristic Boundary Conditions are also reviewed and an extension to axisymmetric configurations is proposed. Limitations and lingering open questions in the field are highlighted. A modified Artificially Thickened Flame (ATF) model coupled with a novel dynamic formulation is shown to preserve flame-turbulence interaction across a wide range of canonical configurations. The approach does not require efficiency functions, which can be difficult to determine, impact accuracy and have limited regimes of validity. The method is supplemented with novel reverse transforms and scaling laws for relevant post-processing from the thickened to the unthickened state. This is implemented in a wider Adaptive Mesh Refinement (AMR) context to deliver a unified LES-AMR-ATF framework. The model is validated in a range of test cases, showing noticeable improvements over conventional LES alternatives. The proposed modifications allow meaningful inferences about flame structure that conventionally may have been restricted to the domain of Direct Numerical Simulation. This allows studying the changes in small-scale flow and scalar topologies during flame-flame interaction. 
The approach is applied to a dual flame burner setup, where simulations show that inclusion of a neighbouring burner increases compressive flow topologies compared to a lone flame. This may favour convex scalar structures that are potentially responsible for the increase in counter-normal flame-flame interactions observed in experiments.

    Machine Learning and Its Application to Reacting Flows

    This open access book introduces and explains machine learning (ML) algorithms and techniques developed for statistical inferences on a complex process or system and their applications to simulations of chemically reacting turbulent flows. These two fields, ML and turbulent combustion, each have a large body of work and knowledge of their own, and this book brings them together and explains the complexities and challenges involved in applying ML techniques to simulate and study reacting flows. This matters for the world’s total primary energy supply (TPES), since more than 90% of this supply comes from combustion technologies and the effects of combustion on the environment are non-negligible. Although alternative technologies based on renewable energies are emerging, their share of the TPES is currently less than 5%, and a complete paradigm shift would be needed to replace combustion sources. Whether this is practical or not is an entirely different question, and the answer depends on the respondent. However, a pragmatic analysis suggests that the combustion share of the TPES is likely to be more than 70% even by 2070. Hence, it is prudent to take advantage of ML techniques to improve combustion sciences and technologies so that efficient and “greener” combustion systems that are friendlier to the environment can be designed. The book covers the current state of the art in these two topics and outlines the challenges involved and the merits and drawbacks of using ML for turbulent combustion simulations, including avenues that can be explored to overcome the challenges. The required mathematical equations and background are discussed with ample references for readers to find further detail if they wish. This book is unique in that no other book offers similar coverage of topics, ranging from big data analysis and machine learning algorithms to their applications in combustion science and system design for energy generation.

    Remnants of compact binary mergers and next-generation numerical relativity codes

    Numerical relativity (NR) simulations are crucial for studying the coalescence of compact binaries. Based on NR data, we produce a model for the mass and spin of the remnant black hole (BH) for the coalescence of black hole-neutron star systems, discussing its crucial role in gravitational wave (GW) modeling and in the parameter estimation of the two signals GW200105 and GW200115. In the context of binary neutron star merger simulations, we perform the first systematic study comparing results obtained with various neutrino treatments, the presence of turbulent viscosity and different grid resolutions. We find that the time of BH formation after merger is heavily affected by grid resolution and turbulent viscosity. An early BH formation limits matter ejection from the accretion disc, as the BH swallows a significant portion of it. Our results indicate that more reliable kilonova light curves are obtained only if the various ejecta components are present. Moreover, robust r-process nucleosynthesis yields require inclusion of both neutrino emission and reabsorption in simulations. Advanced neutrino schemes and turbulent viscosity in simulations resolved beyond current standards appear necessary for reliable astrophysical predictions. To carry out computationally demanding simulations of growing complexity, next-generation NR codes that can efficiently leverage the latest pre-exascale many-core and heterogeneous infrastructures are required. To this end we develop GR-Athena++, a new dynamical spacetime solver built on top of Athena++, that shows high-order convergence properties and excellent parallel scalability up to O(10^5) cores in full 3D binary black hole (BBH) merger simulations. Finally we present GR-AthenaK, the first performance-portable spacetime solver, obtained by refactoring GR-Athena++ with the Kokkos programming model. We demonstrate the correctness and convergence properties of GR-AthenaK with BBH runs on GPUs. 
GR-AthenaK shows a speedup of ∼50 on one GPU compared to GR-Athena++ on a single CPU core.

    Classical and reactive molecular dynamics: Principles and applications in combustion and energy systems

    Molecular dynamics (MD) has evolved into a ubiquitous, versatile and powerful computational method for fundamental research in science branches such as biology, chemistry, biomedicine and physics over the past 60 years. Powered by rapidly advancing supercomputing technologies in recent decades, MD has entered the engineering domain as a first-principle predictive method for material properties and physicochemical processes, and even as a design tool. Such developments have far-reaching consequences, and are covered for the first time in the present paper, with a focus on MD for combustion and energy systems encompassing topics like gas/liquid/solid fuel oxidation, pyrolysis, catalytic combustion, heterogeneous combustion, electrochemistry, nanoparticle synthesis, heat transfer, phase change, and fluid mechanics. First, the theoretical framework of the MD methodology is described systematically, covering both classical and reactive MD. The emphasis is on the development of the reactive force field (ReaxFF) MD, which enables chemical reactions to be simulated within the MD framework, utilizing quantum chemistry calculations and/or experimental data for the force field training. Second, details of the numerical methods, boundary conditions, post-processing and computational costs of MD simulations are provided. This is followed by a critical review of selected applications of classical and reactive MD methods in combustion and energy systems. It is demonstrated that the ReaxFF MD has been successfully deployed to gain fundamental insights into pyrolysis and/or oxidation of gas/liquid/solid fuels, revealing detailed energy changes and chemical pathways. Moreover, the complex physico-chemical dynamic processes in catalytic reactions, soot formation, and flame synthesis of nanoparticles are made plainly visible from an atomistic perspective. Flow, heat transfer and phase change phenomena are also scrutinized by MD simulations. 
Unprecedented details of nanoscale processes such as droplet collision, fuel droplet evaporation, and CO2 capture and storage under subcritical and supercritical conditions are examined at the atomic level. Finally, the outlook for atomistic simulations of combustion and energy systems is discussed in the context of emerging computing platforms, machine learning and multiscale modelling.

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    12th International Conference on Geographic Information Science: GIScience 2023, September 12–15, 2023, Leeds, UK

    No abstract available

    Two decades of Martini: Better beads, broader scope

    The Martini model, a coarse-grained force field for molecular dynamics simulations, has been around for nearly two decades. Originally developed for lipid-based systems by the groups of Marrink and Tieleman, the Martini model has over the years been extended as a community effort to the current level of a general-purpose force field. Apart from the obvious benefit of a reduction in computational cost, the popularity of the model is largely due to the systematic yet intuitive building-block approach that underlies the model, as well as the open nature of the development and its continuous validation. The easy implementation in the widely used Gromacs software suite has also been instrumental. Since its conception in 2002, the Martini model underwent a gradual refinement of the bead interactions and a widening scope of applications. In this review, we look back at this development, culminating with the release of the Martini 3 version in 2021. The power of the model is illustrated with key examples of recent important findings in biological and material sciences enabled with Martini, as well as examples from areas where coarse-grained resolution is essential, namely high-throughput applications, systems with large complexity, and simulations approaching the scale of whole cells. This article is categorized under: Software > Molecular Modeling; Molecular and Statistical Mechanics > Molecular Dynamics and Monte-Carlo Methods; Structure and Mechanism > Computational Materials Science; Structure and Mechanism > Computational Biochemistry and Biophysics.

    Task-based Runtime Optimizations Towards High Performance Computing Applications

    The last decades have witnessed a rapid improvement of computational capabilities in high-performance computing (HPC) platforms thanks to hardware technology scaling. HPC architectures benefit from mainstream hardware advances, with many-core systems, deep hierarchical memory subsystems, non-uniform memory access, and an ever-increasing gap between computational power and memory bandwidth. This has necessitated continuous adaptations across the software stack to maintain high hardware utilization. In this HPC landscape of potentially million-way parallelism, task-based programming models associated with dynamic runtime systems are becoming more popular; they foster developers’ productivity at extreme scale by abstracting the underlying hardware complexity. In this context, this dissertation highlights how a software bundle powered by a task-based programming model can address the heterogeneous workloads engendered by HPC applications, i.e., data redistribution, geospatial modeling and 3D unstructured mesh deformation here. Data redistribution aims to reshuffle data to optimize some objective for an algorithm. The objective can be multi-dimensional, such as improving computational load balance or decreasing communication volume or cost, with the ultimate goal of increasing efficiency and therefore reducing the time-to-solution for the algorithm. Geostatistical modeling, one of the prime motivating applications for exascale computing, is a technique for predicting desired quantities from geographically distributed data, based on statistical models and optimization of parameters. Meshing the deformable contour of moving 3D bodies is an expensive operation that can cause huge computational challenges in fluid-structure interaction (FSI) applications. 
Therefore, in this dissertation, Redistribute-PaRSEC, ExaGeoStat-PaRSEC and HiCMA-PaRSEC are proposed to efficiently tackle these respective HPC applications at extreme scale, and they are evaluated on multiple HPC clusters, including AMD-based, Intel-based and Arm-based CPU systems and an IBM-based multi-GPU system. This multidisciplinary work emphasizes the need for runtime systems to go beyond their primary responsibility of task scheduling on massively parallel hardware systems in order to serve next-generation scientific applications.

    Review of Particle Physics

    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 2,143 new measurements from 709 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. Particle properties and search limits are listed in Summary Tables. We give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 120 reviews are many that are new or heavily revised, including a new review on Machine Learning, and one on Spectroscopy of Light Meson Resonances. The Review is divided into two volumes. Volume 1 includes the Summary Tables and 97 review articles. Volume 2 consists of the Particle Listings and also contains 23 reviews that address specific aspects of the data presented in the Listings. The complete Review (both volumes) is published online on the website of the Particle Data Group (pdg.lbl.gov) and in a journal. Volume 1 is available in print as the PDG Book. A Particle Physics Booklet with the Summary Tables and essential tables, figures, and equations from selected review articles is available in print, as a web version optimized for use on phones, and as an Android app. Funding: United States Department of Energy (DOE) DE-AC02-05CH11231; Government of Japan (Ministry of Education, Culture, Sports, Science and Technology); Istituto Nazionale di Fisica Nucleare (INFN); Physical Society of Japan (JPS); European Laboratory for Particle Physics (CERN).