A Parallel Mesh-Adaptive Framework for Hyperbolic Conservation Laws
We report on the development of a computational framework for the parallel,
mesh-adaptive solution of systems of hyperbolic conservation laws like the
time-dependent Euler equations in compressible gas dynamics or
Magneto-Hydrodynamics (MHD) and similar models in plasma physics. Local mesh
refinement is realized by the recursive bisection of grid blocks along each
spatial dimension. Implemented numerical schemes include standard
finite differences as well as shock-capturing central schemes, both in
connection with Runge-Kutta-type integrators. Parallel execution is achieved
through a configurable hybrid of POSIX-multi-threading and MPI-distribution
with dynamic load balancing. One-, two-, and three-dimensional test computations
for the Euler equations have been carried out and show good parallel scaling
behavior. The Racoon framework is currently used to study the formation of
singularities in plasmas and fluids.
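The local mesh refinement described above, recursive bisection of grid blocks along each spatial dimension, can be sketched as follows. This is an illustrative toy, not the Racoon framework's actual data structures; the names `Block` and `refine` are assumptions.

```python
# Hypothetical sketch of local mesh refinement by recursive bisection of
# grid blocks along each spatial dimension. Bisecting along all d axes
# at once yields 2^d child blocks at the next refinement level.
from dataclasses import dataclass
from itertools import product

@dataclass
class Block:
    origin: tuple   # lower corner of the block in physical coordinates
    size: tuple     # extent of the block along each dimension
    level: int = 0  # refinement level

def refine(block):
    """Bisect a block along every spatial dimension, yielding 2**d children."""
    dim = len(block.origin)
    half = tuple(s / 2 for s in block.size)
    children = []
    for offsets in product((0, 1), repeat=dim):
        origin = tuple(o + off * h
                       for o, off, h in zip(block.origin, offsets, half))
        children.append(Block(origin, half, block.level + 1))
    return children

# A 2-D root block splits into 4 children; a 3-D block would give 8.
root = Block(origin=(0.0, 0.0), size=(1.0, 1.0))
children = refine(root)
```

Repeating `refine` on selected children produces the locally refined block hierarchy; in a parallel code the resulting blocks are the units distributed among threads and MPI ranks.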
Million Atom Electronic Structure and Device Calculations on Peta-Scale Computers
Semiconductor devices are scaled down to the level at which their constituent
materials can no longer be considered continuous. To account for atomistic
randomness, surface effects and quantum mechanical effects, an atomistic
modeling approach needs to be pursued. The Nanoelectronic Modeling Tool (NEMO
3-D) has satisfied this requirement by including empirical
tight-binding models and considering strain to successfully
simulate various semiconductor material systems. Computationally, however, NEMO
3-D needs significant improvements to utilize the increasing supply of processors.
This paper introduces the new modeling tool, OMEN 3-D, and discusses the major
computational improvements, the 3-D domain decomposition and the multi-level
parallelism. As a featured application, a full 3-D parallelized
Schr\"odinger-Poisson solver and its application to calculate the bandstructure
of doped phosphorus(P) layer in silicon is demonstrated. Impurity
bands due to the donor ion potentials are computed.Comment: 4 pages, 6 figures, IEEE proceedings of the 13th International
Workshop on Computational Electronics, Tsinghua University, Beijing, May
27-29 200
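The 3-D domain decomposition named above can be illustrated with a small index-arithmetic sketch: the simulation domain is split into a Cartesian grid of sub-domains, one per process. The function names and the block-distribution scheme here are assumptions for illustration, not OMEN 3-D's actual decomposition code.

```python
# Hypothetical sketch of a 3-D domain decomposition: each process in a
# (px, py, pz) process grid owns one rectangular sub-box of the
# (nx, ny, nz) simulation domain, with any remainder spread over the
# first n % p processes along each axis.

def local_range(n, p, coord):
    """Half-open index range [lo, hi) owned along one axis by process
    `coord` out of `p` processes."""
    base, rem = divmod(n, p)
    lo = coord * base + min(coord, rem)
    hi = lo + base + (1 if coord < rem else 0)
    return lo, hi

def subdomain(grid, procs, coords):
    """Per-axis index ranges of the sub-box owned by process `coords`."""
    return tuple(local_range(n, p, c)
                 for n, p, c in zip(grid, procs, coords))

# A 100x100x100 domain on a 4x2x2 process grid: process (0, 0, 0)
# owns a 25x50x50 box.
box = subdomain((100, 100, 100), (4, 2, 2), (0, 0, 0))
```

Because every process can compute every other process's ranges from the same arithmetic, neighbor lookup for halo exchange needs no communication; in an MPI code this pairs naturally with a Cartesian communicator.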
Reproducibility, accuracy and performance of the Feltor code and library on parallel computer architectures
Feltor is a modular and free scientific software package. It allows
developing platform independent code that runs on a variety of parallel
computer architectures ranging from laptop CPUs to multi-GPU distributed memory
systems. Feltor consists of both a numerical library and a collection of
application codes built on top of the library. Its main targets are two- and
three-dimensional drift- and gyro-fluid simulations with discontinuous Galerkin
methods as the main numerical discretization technique. We observe that
numerical simulations of a recently developed gyro-fluid model produce
non-deterministic results in parallel computations. First, we show how we
restore accuracy and bitwise reproducibility algorithmically and
programmatically. In particular, we adopt an implementation of the exactly
rounded dot product based on long accumulators, which avoids accuracy losses
especially in parallel applications. However, reproducibility and accuracy
alone fail to indicate correct simulation behaviour. In fact, in the physical
model slightly different initial conditions lead to vastly different end
states. This behaviour translates to its numerical representation. Pointwise
convergence, even in principle, becomes impossible for long simulation times.
In a second part, we explore important performance tuning considerations. We
identify latency and memory bandwidth as the main performance indicators of our
routines. Based on these, we propose a parallel performance model that predicts
the execution time of algorithms implemented in Feltor and test our model on a
selection of parallel hardware architectures. We are able to predict the
execution time with a relative error of less than 25% for problem sizes between
0.1 and 1000 MB. Finally, we find that the product of latency and bandwidth
gives a minimum array size per compute node to achieve a scaling efficiency
above 50% for both strong and weak scaling.
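The reproducibility problem described above can be demonstrated in a few lines: floating-point addition is not associative, so a parallel reduction whose operand order varies between runs produces results that differ in the last bits, while an exactly rounded sum is independent of ordering. This sketch uses Python's `math.fsum` (exactly rounded in CPython) as a stand-in for the long-accumulator dot product adopted by Feltor; it is an illustration of the principle, not Feltor's implementation.

```python
# Non-associativity of floating-point addition: summation order changes
# the rounded result, which is why non-deterministic parallel reduction
# orders break bitwise reproducibility. An exactly rounded sum does not
# depend on the order of its operands.
import math
import random

random.seed(0)
# Values spanning many orders of magnitude, where cancellation and
# rounding are most visible.
data = [random.uniform(-1, 1) * 10 ** random.randint(0, 12)
        for _ in range(10_000)]

def naive_sum(xs):
    s = 0.0
    for x in xs:
        s += x  # rounding error accumulates and depends on order
    return s

shuffled = data[:]
random.shuffle(shuffled)  # stands in for a different reduction order

# The naive sums generally differ in the trailing bits between orderings,
# while the exactly rounded sums are bitwise identical for any ordering:
assert math.fsum(data) == math.fsum(shuffled)
```

A long accumulator achieves the same order-independence in a way that parallelizes: each thread accumulates exactly into a wide fixed-point register, and the registers are merged without any rounding until the final conversion back to floating point.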
Advancing nanoelectronic device modeling through peta-scale computing and deployment on nanoHUB
Recent improvements to existing HPC codes NEMO 3-D and OMEN, combined with access to peta-scale computing resources, have enabled realistic device engineering simulations that were previously infeasible. NEMO 3-D can now simulate 1 billion atom systems and, using 3-D spatial decomposition, scale to 32,768 cores. Simulation time for the band structure of an experimental P doped Si quantum computing device fell from 40 minutes to 1 minute. OMEN can perform fully quantum mechanical transport calculations for real-world UTB FETs on 147,456 cores in roughly 5 minutes. Both of these tools power simulation engines on nanoHUB, giving the community access to previously unavailable research capabilities.
Runko: Modern multi-physics toolbox for simulating plasma
Runko is a new open-source plasma simulation framework implemented in C++ and
Python. It is designed to function as an easy-to-extend general toolbox for
simulating astrophysical plasmas with different theoretical and numerical
models. Computationally intensive low-level "kernels" are written in modern
C++14 taking advantage of polymorphic classes, multiple inheritance, and
template metaprogramming. High-level functionality is operated with Python3
scripts. This hybrid program design ensures fast code and ease of use. The
framework has a modular object-oriented design that allows the user to easily
add new numerical algorithms to the system. The code can be run on various
computing platforms ranging from laptops (shared-memory systems) to massively
parallel supercomputer architectures (distributed-memory systems). The
framework also supports heterogeneous multi-physics simulations in which
different physical solvers can be combined and run simultaneously. Here we
report on the first results from the framework's relativistic particle-in-cell
(PIC) module. Using the PIC module, we simulate decaying relativistic kinetic
turbulence in suddenly stirred magnetically-dominated pair plasma. We show that
the resulting particle distribution can be separated into a thermal part that
forms the turbulent cascade and into a separate decoupled non-thermal particle
population that acts as an energy sink for the system.
Comment: 17 pages, 6 figures. Comments welcome! Code available from https://github.com/natj/runk
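The modular multi-physics design described above, in which different physical solvers share a common interface and are stepped together, can be sketched as follows. The class and method names are illustrative assumptions, not Runko's real API (whose kernels live in C++14, with Python3 driving them).

```python
# Illustrative sketch of a modular multi-physics driver: each physics
# module implements a common step() interface, and the driver advances
# all registered modules through the same sequence of time steps.

class Solver:
    """Common interface every physics module implements."""
    def step(self, dt):
        raise NotImplementedError

class PicSolver(Solver):
    def __init__(self):
        self.time = 0.0
    def step(self, dt):
        # ... push particles, deposit currents (omitted) ...
        self.time += dt

class FieldSolver(Solver):
    def __init__(self):
        self.time = 0.0
    def step(self, dt):
        # ... update electromagnetic fields (omitted) ...
        self.time += dt

def run(solvers, dt, nsteps):
    """Advance all registered physics modules in lockstep."""
    for _ in range(nsteps):
        for s in solvers:
            s.step(dt)

modules = [PicSolver(), FieldSolver()]
run(modules, dt=0.1, nsteps=10)
```

Adding a new numerical algorithm then amounts to writing one more subclass of the common interface and registering it with the driver, which is the extensibility property the abstract emphasizes.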
Roadmap on Electronic Structure Codes in the Exascale Era
Electronic structure calculations have been instrumental in providing many
important insights into a range of physical and chemical properties of various
molecular and solid-state systems. Their importance to various fields,
including materials science, chemical sciences, computational chemistry and
device physics, is underscored by the large fraction of available public
supercomputing resources devoted to these calculations. As we enter the
exascale era, exciting new opportunities to increase simulation numbers, sizes,
and accuracies present themselves. In order to realize these promises, the
community of electronic structure software developers will however first have
to tackle a number of challenges pertaining to the efficient use of new
architectures that will rely heavily on massive parallelism and hardware
accelerators. This roadmap provides a broad overview of the state-of-the-art in
electronic structure calculations and of the various new directions being
pursued by the community. It covers 14 electronic structure codes, presenting
their current status, their development priorities over the next five years,
and their plans towards tackling the challenges and leveraging the
opportunities presented by the advent of exascale computing.
Comment: Submitted as a roadmap article to Modelling and Simulation in Materials Science and Engineering; Address any correspondence to Vikram Gavini ([email protected]) and Danny Perez ([email protected]