Learning-based runtime management of energy-efficient and reliable many-core systems
This paper highlights and demonstrates our research to date on the energy-efficiency and reliability challenges of many-core systems, which we address through intelligent runtime management algorithms. The algorithms are implemented through cross-layer interactions between three layers (application, runtime and hardware), forming our core theme of working together. Annotated application tasks communicate their performance, energy or reliability requirements to the runtime. Given these requirements, the runtime exercises the hardware through various control knobs and receives feedback on these controls from the performance monitors. The aim is to learn the best possible hardware controls at runtime to achieve energy efficiency and improved reliability while meeting the specified application requirements.
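The feedback loop described above (choose a control knob setting, observe the monitors, learn which setting meets the requirement at the lowest energy) can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a plain epsilon-greedy bandit, and the frequency list, the toy power model in `simulate_epoch`, and the penalty value are all assumptions made for illustration.

```python
import random

# Hypothetical control knob: selectable core frequencies in GHz (an assumption).
FREQS_GHZ = [0.6, 1.0, 1.4, 1.8]


def simulate_epoch(freq_ghz, work_gcycles=1.0):
    """Toy stand-in for the hardware monitors: returns (time_s, energy_j)."""
    time_s = work_gcycles / freq_ghz
    power_w = 0.5 + 1.0 * freq_ghz ** 3  # static + dynamic power, toy model
    return time_s, power_w * time_s


def learn_frequency(deadline_s, epochs=500, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the lowest-energy frequency meeting a deadline."""
    rng = random.Random(seed)
    value = [0.0] * len(FREQS_GHZ)  # running mean reward per knob setting
    count = [0] * len(FREQS_GHZ)
    for _ in range(epochs):
        if rng.random() < epsilon:
            a = rng.randrange(len(FREQS_GHZ))  # explore a random setting
        else:
            a = max(range(len(FREQS_GHZ)), key=lambda i: value[i])  # exploit
        time_s, energy_j = simulate_epoch(FREQS_GHZ[a])
        # Reward is negative energy; a missed deadline gets a large penalty.
        reward = -energy_j if time_s <= deadline_s else -100.0
        count[a] += 1
        value[a] += (reward - value[a]) / count[a]
    best = max(range(len(FREQS_GHZ)), key=lambda i: value[i])
    return FREQS_GHZ[best]


if __name__ == "__main__":
    print(learn_frequency(deadline_s=0.8))
```

In this toy model the application requirement is a per-epoch deadline; a real runtime would read the requirement from the task annotations and the time and energy from hardware counters rather than from a simulated function.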
ATK-ForceField: A New Generation Molecular Dynamics Software Package
ATK-ForceField is a software package for atomistic simulations using
classical interatomic potentials. It is implemented as a part of the Atomistix
ToolKit (ATK), which is a Python programming environment that makes it easy to
create and analyze both standard and highly customized simulations. This paper
will focus on the atomic interaction potentials, molecular dynamics, and
geometry optimization features of the software; however, many more advanced
modeling features are available. The implementation details of these algorithms
and their computational performance will be shown. We present three
illustrative examples of the types of calculations that are possible with
ATK-ForceField: modeling thermal transport properties in a silicon germanium
crystal, vapor deposition of selenium molecules on a selenium surface, and a
simulation of creep in a copper polycrystal. Comment: 28 pages, 9 figures
Invasive compute balancing for applications with shared and hybrid parallelization
This is the author manuscript. The final version is available from the publisher via the DOI in this record. Achieving high scalability with dynamically adaptive algorithms in high-performance computing (HPC) is a non-trivial task. The invasive paradigm using compute migration represents an efficient alternative to classical data migration approaches for such algorithms in HPC. We present a core-distribution scheduler which realizes the migration of computational power by distributing the cores depending on the requirements specified by one or more parallel program instances. We validate our approach with different benchmark suites for simulations with artificial workload as well as applications based on dynamically adaptive shallow water simulations, and investigate concurrently executed adaptivity parameter studies on realistic tsunami simulations. The invasive approach results in significantly faster overall execution times and higher hardware utilization than alternative approaches. Dynamic resource management is therefore mandatory for a more efficient execution of scenarios similar to our simulations, e.g. several tsunami simulations in urgent computing, to overcome strong scalability challenges in the area of HPC. The optimizations obtained by invasive migration of cores can be generalized to similar classes of algorithms with dynamic resource requirements.
This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre "Invasive Computing" (SFB/TR 89).
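The idea of distributing cores according to the requirements reported by several program instances can be sketched as follows. This is only an illustrative assumption about one possible policy (proportional shares with a one-core minimum, rounded by largest remainder), not the scheduler described in the paper; the function name `distribute_cores` and the integer workload weights are hypothetical.

```python
def distribute_cores(total_cores, workloads):
    """Split cores among instances in proportion to integer workload weights.

    Every instance keeps at least one core; leftover cores from rounding go
    to the instances with the largest remainders.
    """
    n = len(workloads)
    if n == 0:
        return []
    if total_cores < n:
        raise ValueError("need at least one core per instance")
    if any(w < 0 for w in workloads):
        raise ValueError("workloads must be non-negative")
    weights = list(workloads)
    if sum(weights) == 0:
        weights = [1] * n  # no information: fall back to an equal split
    total = sum(weights)
    spare = total_cores - n  # cores left after the one-core minimum
    # Integer arithmetic keeps the rounding exact.
    cores = [1 + spare * w // total for w in weights]
    remainders = [spare * w % total for w in weights]
    leftover = total_cores - sum(cores)
    order = sorted(range(n), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        cores[i] += 1
    return cores


if __name__ == "__main__":
    # Three instances; the third reports twice the workload of the others.
    print(distribute_cores(8, [1, 1, 2]))
```

Calling the function again whenever an instance reports a new workload would shift cores toward the instances that currently need them, which is the effect the abstract attributes to compute migration.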
A Parallel Iterative Method for Computing Molecular Absorption Spectra
We describe a fast parallel iterative method for computing molecular
absorption spectra within TDDFT linear response and using the LCAO method. We
use a local basis of "dominant products" to parametrize the space of orbital
products that occur in the LCAO approach. In this basis, the dynamical
polarizability is computed iteratively within an appropriate Krylov subspace.
The iterative procedure uses a matrix-free GMRES method to determine the
(interacting) density response. The resulting code is about one order of
magnitude faster than our previous full-matrix method. This acceleration makes
the speed of our TDDFT code comparable with codes based on Casida's equation.
The implementation of our method uses hybrid MPI and OpenMP parallelization in
which load balancing and memory access are optimized. To validate our approach
and to establish benchmarks, we compute spectra of large molecules on various
types of parallel machines.
The methods developed here are fairly general and we believe they will find
useful applications in molecular physics/chemistry, even for problems that are
beyond TDDFT, such as organic semiconductors, particularly in photovoltaics. Comment: 20 pages, 17 figures, 3 tables
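To make "matrix-free GMRES" concrete, here is a minimal sketch of unrestarted GMRES in which the operator is available only as a function that applies it to a vector, so the matrix is never stored. It is a generic textbook formulation, not the paper's implementation; the tridiagonal test operator `apply_toy_operator` is an assumption standing in for the response operator, and a nonsingular operator is assumed.

```python
import math


def gmres(apply_a, b, tol=1e-10, max_iter=50):
    """Unrestarted GMRES for A x = b, with A given only through apply_a(v)."""
    n = len(b)
    beta = math.sqrt(sum(x * x for x in b))
    if beta == 0.0:
        return [0.0] * n
    basis = [[x / beta for x in b]]  # orthonormal Krylov basis vectors
    r_cols = []                      # columns of the triangular factor R
    rotations = []                   # Givens rotations (c, s)
    g = [beta]                       # rotated right-hand side
    for k in range(min(max_iter, n)):
        w = apply_a(basis[k])        # the only place the operator is used
        h = []
        for v in basis:              # modified Gram-Schmidt orthogonalization
            hij = sum(wi * vi for wi, vi in zip(w, v))
            h.append(hij)
            w = [wi - hij * vi for wi, vi in zip(w, v)]
        h_next = math.sqrt(sum(wi * wi for wi in w))
        for i, (c, s) in enumerate(rotations):  # apply earlier rotations
            h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
        r = math.hypot(h[k], h_next)  # new rotation zeroes out h_next
        c, s = h[k] / r, h_next / r
        rotations.append((c, s))
        h[k] = r
        g.append(-s * g[k])          # |g[k + 1]| is the current residual norm
        g[k] = c * g[k]
        r_cols.append(h)
        if abs(g[k + 1]) <= tol * beta or h_next == 0.0:
            break
        basis.append([wi / h_next for wi in w])
    m = len(r_cols)
    y = [0.0] * m
    for i in reversed(range(m)):     # back substitution for R y = g
        acc = g[i] - sum(r_cols[j][i] * y[j] for j in range(i + 1, m))
        y[i] = acc / r_cols[i][i]
    return [sum(y[j] * basis[j][i] for j in range(m)) for i in range(n)]


def apply_toy_operator(v):
    """Toy operator (assumption): tridiagonal stencil 2.5*v[i] - v[i-1] - v[i+1]."""
    n = len(v)
    return [2.5 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


if __name__ == "__main__":
    rhs = [1.0] * 6
    x = gmres(apply_toy_operator, rhs)
    print(max(abs(a - r) for a, r in zip(apply_toy_operator(x), rhs)))
```

Because the solver touches the operator only through `apply_a`, the cost per iteration is one operator application plus the orthogonalization, which is what makes the approach attractive when the full matrix is too large to form.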
IllinoisGRMHD: An Open-Source, User-Friendly GRMHD Code for Dynamical Spacetimes
In the extreme violence of merger and mass accretion, compact objects like
black holes and neutron stars are thought to launch some of the most luminous
outbursts of electromagnetic and gravitational wave energy in the Universe.
Modeling these systems realistically is a central problem in theoretical
astrophysics, but has proven extremely challenging, requiring the development
of numerical relativity codes that solve Einstein's equations for the
spacetime, coupled to the equations of general relativistic (ideal)
magnetohydrodynamics (GRMHD) for the magnetized fluids. Over the past decade,
the Illinois Numerical Relativity (ILNR) Group's dynamical spacetime GRMHD code
has proven itself as a robust and reliable tool for theoretical modeling of
such GRMHD phenomena. However, the code was written "by experts and for
experts" of the code, with a steep learning curve that would severely hinder
community adoption if it were open-sourced. Here we present IllinoisGRMHD,
which is an open-source, highly-extensible rewrite of the original
closed-source GRMHD code of the ILNR Group. Reducing the learning curve was the
primary focus of this rewrite, with the goal of facilitating community
involvement in the code's use and development, as well as the minimization of
human effort in generating new science. IllinoisGRMHD also saves computer time,
generating roundoff-precision identical output to the original code on
adaptive-mesh grids, but nearly twice as fast at scales of hundreds to
thousands of cores. Comment: 37 pages, 6 figures, single column. Matches published version
Daubechies Wavelets for Linear Scaling Density Functional Theory
We demonstrate that Daubechies wavelets can be used to construct a minimal
set of optimized localized contracted basis functions in which the Kohn-Sham
orbitals can be represented with an arbitrarily high, controllable precision.
Ground state energies and the forces acting on the ions can be calculated in
this basis with the same accuracy as if they were calculated directly in a
Daubechies wavelet basis, provided that the amplitude of these contracted
basis functions is sufficiently small on the surface of the localization
region, which is guaranteed by the optimization procedure described in this
work. This approach reduces the computational costs of DFT calculations, and
can be combined with sparse matrix algebra to obtain linear scaling with
respect to the number of electrons in the system. Calculations on systems of
10,000 atoms or more thus become feasible in a systematic basis set with
moderate computational resources. Further computational savings can be achieved
by exploiting the similarity of the contracted basis functions for closely
related environments, e.g. in geometry optimizations or combined calculations
of neutral and charged systems.