635 research outputs found

    Learning-based runtime management of energy-efficient and reliable many-core systems

    No full text
    This paper highlights and demonstrates our research works to date addressing the energy-efficiency and reliability challenges of many-core systems through intelligent runtime management algorithms. The algorithms are implemented through cross-layer interactions between the three layers: application, runtime and hardware, forming our core theme of working together. The annotated application tasks communicate the performance, energy or reliability requirements to the runtime. With such requirements, the runtime exercises the hardware through various control knobs and gets the feedback of these controls through the performance monitors. The aim is to learn the best possible hardware controls during runtime to achieve energy-efficiency and improved reliability, while meeting the specified application requirements

    ATK-ForceField: A New Generation Molecular Dynamics Software Package

    Full text link
    ATK-ForceField is a software package for atomistic simulations using classical interatomic potentials. It is implemented as a part of the Atomistix ToolKit (ATK), which is a Python programming environment that makes it easy to create and analyze both standard and highly customized simulations. This paper will focus on the atomic interaction potentials, molecular dynamics, and geometry optimization features of the software, however, many more advanced modeling features are available. The implementation details of these algorithms and their computational performance will be shown. We present three illustrative examples of the types of calculations that are possible with ATK-ForceField: modeling thermal transport properties in a silicon germanium crystal, vapor deposition of selenium molecules on a selenium surface, and a simulation of creep in a copper polycrystal.Comment: 28 pages, 9 figure

    Invasive compute balancing for applications with shared and hybrid parallelization

    Get PDF
    This is the author manuscript. The final version is available from the publisher via the DOI in this record.Achieving high scalability with dynamically adaptive algorithms in high-performance computing (HPC) is a non-trivial task. The invasive paradigm using compute migration represents an efficient alternative to classical data migration approaches for such algorithms in HPC. We present a core-distribution scheduler which realizes the migration of computational power by distributing the cores depending on the requirements specified by one or more parallel program instances. We validate our approach with different benchmark suites for simulations with artificial workload as well as applications based on dynamically adaptive shallow water simulations, and investigate concurrently executed adaptivity parameter studies on realistic Tsunami simulations. The invasive approach results in significantly faster overall execution times and higher hardware utilization than alternative approaches. A dynamic resource management is therefore mandatory for a more efficient execution of scenarios similar to our simulations, e.g. several Tsunami simulations in urgent computing, to overcome strong scalability challenges in the area of HPC. The optimizations obtained by invasive migration of cores can be generalized to similar classes of algorithms with dynamic resource requirements.This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre ”Invasive Computing” (SFB/TR 89)

    A Parallel Iterative Method for Computing Molecular Absorption Spectra

    Full text link
    We describe a fast parallel iterative method for computing molecular absorption spectra within TDDFT linear response and using the LCAO method. We use a local basis of "dominant products" to parametrize the space of orbital products that occur in the LCAO approach. In this basis, the dynamical polarizability is computed iteratively within an appropriate Krylov subspace. The iterative procedure uses a a matrix-free GMRES method to determine the (interacting) density response. The resulting code is about one order of magnitude faster than our previous full-matrix method. This acceleration makes the speed of our TDDFT code comparable with codes based on Casida's equation. The implementation of our method uses hybrid MPI and OpenMP parallelization in which load balancing and memory access are optimized. To validate our approach and to establish benchmarks, we compute spectra of large molecules on various types of parallel machines. The methods developed here are fairly general and we believe they will find useful applications in molecular physics/chemistry, even for problems that are beyond TDDFT, such as organic semiconductors, particularly in photovoltaics.Comment: 20 pages, 17 figures, 3 table

    IllinoisGRMHD: An Open-Source, User-Friendly GRMHD Code for Dynamical Spacetimes

    Get PDF
    In the extreme violence of merger and mass accretion, compact objects like black holes and neutron stars are thought to launch some of the most luminous outbursts of electromagnetic and gravitational wave energy in the Universe. Modeling these systems realistically is a central problem in theoretical astrophysics, but has proven extremely challenging, requiring the development of numerical relativity codes that solve Einstein's equations for the spacetime, coupled to the equations of general relativistic (ideal) magnetohydrodynamics (GRMHD) for the magnetized fluids. Over the past decade, the Illinois Numerical Relativity (ILNR) Group's dynamical spacetime GRMHD code has proven itself as a robust and reliable tool for theoretical modeling of such GRMHD phenomena. However, the code was written "by experts and for experts" of the code, with a steep learning curve that would severely hinder community adoption if it were open-sourced. Here we present IllinoisGRMHD, which is an open-source, highly-extensible rewrite of the original closed-source GRMHD code of the ILNR Group. Reducing the learning curve was the primary focus of this rewrite, with the goal of facilitating community involvement in the code's use and development, as well as the minimization of human effort in generating new science. IllinoisGRMHD also saves computer time, generating roundoff-precision identical output to the original code on adaptive-mesh grids, but nearly twice as fast at scales of hundreds to thousands of cores.Comment: 37 pages, 6 figures, single column. Matches published versio

    Daubechies Wavelets for Linear Scaling Density Functional Theory

    Get PDF
    We demonstrate that Daubechies wavelets can be used to construct a minimal set of optimized localized contracted basis functions in which the Kohn-Sham orbitals can be represented with an arbitrarily high, controllable precision. Ground state energies and the forces acting on the ions can be calculated in this basis with the same accuracy as if they were calculated directly in a Daubechies wavelets basis, provided that the amplitude of these contracted basis functions is sufficiently small on the surface of the localization region, which is guaranteed by the optimization procedure described in this work. This approach reduces the computational costs of DFT calculations, and can be combined with sparse matrix algebra to obtain linear scaling with respect to the number of electrons in the system. Calculations on systems of 10,000 atoms or more thus become feasible in a systematic basis set with moderate computational resources. Further computational savings can be achieved by exploiting the similarity of the contracted basis functions for closely related environments, e.g. in geometry optimizations or combined calculations of neutral and charged systems
    corecore