Large scale ab initio calculations based on three levels of parallelization
We suggest and implement a parallelization scheme based on an efficient
multiband eigenvalue solver, the locally optimal block preconditioned
conjugate gradient (LOBPCG) method, and an optimized three-dimensional (3D)
fast Fourier transform (FFT) in the ab initio plane-wave code ABINIT. In
addition to the standard data partitioning over processors corresponding to
different k-points, we introduce data partitioning with respect to blocks of
bands as well as spatial partitioning in the Fourier space of coefficients over
the plane waves basis set used in ABINIT. This k-points-multiband-FFT
parallelization avoids any collective communication over the whole set of
processors, relying instead on one-dimensional communications only. For a single
k-point, super-linear scaling is achieved for up to 100 processors due to an
extensive use of hardware-optimized BLAS, LAPACK, and ScaLAPACK routines,
mainly in the LOBPCG routine. We observe good performance up to 200 processors.
With 10 k-points our three-way data partitioning results in linear scaling up
to 1000 processors for a practical system used for testing.
Comment: 8 pages, 5 figures. Accepted to Computational Materials Science
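The block eigensolver named above is available outside ABINIT as well. As a minimal sketch (assuming SciPy's `scipy.sparse.linalg.lobpcg` as a stand-in, with a toy diagonal operator in place of the plane-wave Hamiltonian), the band-block iteration looks like:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lobpcg

n = 100
# Toy Hermitian operator with well-separated eigenvalues 1, 2, ..., n;
# an illustrative stand-in for a plane-wave Hamiltonian, not ABINIT's setup.
A = diags(np.arange(1.0, n + 1.0)).tocsc()

rng = np.random.default_rng(0)
X = rng.standard_normal((n, 4))  # a block of 4 trial vectors (the "bands")

# Iterate toward the 4 smallest eigenpairs; a preconditioner M would be
# supplied here in a real calculation.
eigvals, eigvecs = lobpcg(A, X, largest=False, tol=1e-8, maxiter=500)
```

Because the whole block of bands is updated at once, the inner products and dense updates map onto BLAS/LAPACK calls, which is what the abstract exploits on each processor.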
Computational fluid dynamics research at the United Technologies Research Center requiring supercomputers
An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement for and use of various levels of computers, including supercomputers, in the CFD activities are described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed, with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary-layer interaction and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated-flow modeling are presented. A brief discussion of the 3-tier scientific computing environment, in which the researcher has access to workstations, mid-size computers, and supercomputers, is also presented.
ASCOT: solving the kinetic equation of minority particle species in tokamak plasmas
A comprehensive description of methods suitable for solving the kinetic
equation for fast ions and impurity species in tokamak plasmas using a Monte
Carlo approach is presented. The described methods include Hamiltonian
orbit-following in particle and guiding center phase space, test particle or
guiding center solution of the kinetic equation applying stochastic
differential equations in the presence of Coulomb collisions, neoclassical
tearing modes and Alfv\'en eigenmodes as electromagnetic perturbations relevant
to fast ions, together with plasma flow and atomic reactions relevant to
impurity studies. Applying the methods, a complete reimplementation of the
well-established minority species code ASCOT is carried out as a response both
to the increase in computing power during the last twenty years and to the
weakly structured growth of the code, which has made implementation of
additional models impractical. In addition, a benchmark between the previous
code and the reimplementation is performed, showing good agreement between the codes.
Comment: 13 pages, 9 figures, submitted to Computer Physics Communications
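The stochastic-differential-equation solution of the kinetic equation can be illustrated with a one-dimensional toy collision operator: an Ornstein-Uhlenbeck velocity process integrated with the Euler-Maruyama scheme (the drag and diffusion coefficients below are illustrative numbers, not ASCOT's collision physics):

```python
import numpy as np

def relax_velocities(n_particles=20000, nu=1.0, D=0.5, dt=0.01, t_end=8.0, seed=1):
    """Euler-Maruyama integration of dv = -nu*v dt + sqrt(2*D) dW for an
    ensemble of test particles (a toy stand-in for Coulomb drag/diffusion)."""
    rng = np.random.default_rng(seed)
    v = 3.0 * rng.standard_normal(n_particles)  # start far from equilibrium
    for _ in range(int(t_end / dt)):
        v += -nu * v * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal(n_particles)
    return v

v = relax_velocities()
```

After several relaxation times the ensemble samples the stationary distribution, whose variance is D/nu; following many independent markers in parallel like this is the Monte Carlo strategy the abstract describes.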
Computational Particle Physics for Event Generators and Data Analysis
High-energy physics data analysis relies heavily on the comparison between
experimental and simulated data as stressed lately by the Higgs search at LHC
and the recent identification of a Higgs-like new boson. The first link in the
full simulation chain is the event generation both for background and for
expected signals. Nowadays event generators are based on the automatic
computation of the matrix element, or amplitude, for each process of interest.
Moreover, recent analysis techniques based on the matrix-element likelihood
method assign to every event a probability of belonging to each of a given set
of possible processes. This method, originally used for the top-quark mass
measurement, is computationally intensive but has shown its power at the LHC in
extracting the new boson signal from the background.
Serving both needs, the automatic calculation of matrix elements is therefore
more than ever of prime importance for particle physics. Initiated in the
eighties, the techniques have matured for the lowest-order (tree-level)
calculations, but they become complex and CPU-intensive when higher-order
calculations involving loop diagrams are necessary, as for QCD processes at the
LHC. New calculation techniques for next-to-leading order (NLO) have emerged,
making possible the generation of processes with many final-state particles (up
to 6). While NLO calculations are in many cases under control, though not yet
fully automatic, even higher-precision calculations involving processes at two
loops or more remain a big challenge.
After a short introduction to particle physics and to the related theoretical
framework, we will review some of the computing techniques that have been
developed to make these calculations automatic. The main available packages and
some of the most important applications for simulation and data analysis, in
particular at the LHC, will also be summarized.
Comment: 19 pages, 11 figures. Proceedings of CCP (Conference on Computational Physics), Oct. 2012, Osaka, Japan, in IOP Journal of Physics: Conference Series
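As a toy numerical illustration of the matrix-element likelihood idea (the densities and numbers below are illustrative stand-ins for normalized |M|^2 weights, not taken from any real analysis):

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical per-event densities standing in for normalized |M|^2 values.
def p_signal(x):      # narrow resonance near 125 (illustrative)
    return gauss(x, 125.0, 2.0)

def p_background(x):  # broad, smooth background (illustrative)
    return gauss(x, 100.0, 40.0)

def signal_probability(x, f_sig=0.1):
    """Probability that an event with observable x belongs to the signal
    process, given a prior signal fraction f_sig."""
    s = f_sig * p_signal(x)
    b = (1.0 - f_sig) * p_background(x)
    return s / (s + b)

events = np.array([90.0, 124.8, 125.3, 160.0])
probs = signal_probability(events)
```

Events near the resonance receive a high signal probability while those far away are assigned to the background, which is the per-event weighting the method uses to extract a small signal.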
A sparse octree gravitational N-body code that runs entirely on the GPU processor
We present parallel algorithms for constructing and traversing sparse octrees
on graphics processing units (GPUs). The algorithms are based on parallel-scan
and sort methods. To test the performance and feasibility, we implemented them
in CUDA in the form of a gravitational tree-code which runs completely on the
GPU. (The code is publicly available at
http://castle.strw.leidenuniv.nl/software.html.) The tree construction and
traversal algorithms are portable to many-core devices that support the CUDA
or OpenCL programming languages. The gravitational tree-code outperforms
tuned CPU code during tree construction and shows an overall performance
improvement of more than a factor of 20, resulting in a processing rate of more
than 2.8 million particles per second.
Comment: Accepted version. Published in Journal of Computational Physics. 35 pages, 12 figures, single column
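The scan-and-sort construction can be sketched in serial NumPy (an illustration of the Morton-key idea, not the paper's CUDA implementation): each particle gets a Morton (Z-order) key by interleaving the bits of its quantized coordinates, and sorting the keys places the particles in the depth-first order of the octree cells.

```python
import numpy as np

def morton_keys(pos, bits=10):
    """Morton (Z-order) keys for positions in the unit cube [0, 1)^3:
    quantize each coordinate to `bits` bits, then interleave the bits."""
    q = (pos * (1 << bits)).astype(np.uint64)  # assumes 0 <= pos < 1
    keys = np.zeros(len(pos), dtype=np.uint64)
    for b in range(bits):
        for d in range(3):
            keys |= ((q[:, d] >> np.uint64(b)) & np.uint64(1)) << np.uint64(3 * b + d)
    return keys

rng = np.random.default_rng(0)
pos = rng.random((1000, 3))            # particles in the unit cube
order = np.argsort(morton_keys(pos))   # sorted order = octree traversal order
```

Consecutive particles in `order` share long key prefixes, so tree cells can then be identified by scanning the sorted keys, which is where the parallel-scan primitives come in.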
The impact of supercomputers on experimentation: A view from a national laboratory
The relative roles of large-scale scientific computers and physical experiments in several science and engineering disciplines are discussed. Increasing dependence on computers is shown to be motivated both by the rapid growth in computer speed and memory, which permits accurate numerical simulation of complex physical phenomena, and by the rapid reduction in the cost of performing a calculation, which makes computation an increasingly attractive complement to experimentation. Computer speed and memory requirements are presented for selected areas of such disciplines as fluid dynamics, aerodynamics, aerothermodynamics, chemistry, atmospheric sciences, astronomy, and astrophysics, together with some examples of the complementary nature of computation and experiment. Finally, the impact of the emerging role of computers in the technical disciplines is discussed in terms of both the requirements for experimentation and the attainment of previously inaccessible information on physical processes.
Ludwig: A parallel Lattice-Boltzmann code for complex fluids
This paper describes `Ludwig', a versatile code for the simulation of
Lattice-Boltzmann (LB) models in 3-D on cubic lattices. In fact `Ludwig' is not
a single code, but a set of codes that share certain common routines, such as
I/O and communications. If `Ludwig' is used as intended, a variety of complex
fluid models with different equilibrium free energies are simple to code, so
that the user may concentrate on the physics of the problem, rather than on
parallel computing issues. Thus far, `Ludwig''s main application has been to
symmetric binary fluid mixtures. We first explain the philosophy and structure
of `Ludwig', which is argued to be a very effective way of developing large
codes for academic consortia. Next we elaborate on some parallel implementation
issues such as parallel I/O, and the use of MPI to achieve full portability and
good efficiency on both MPP and SMP systems. Finally, we describe how to
implement generic solid boundaries, and look in detail at the particular case
of a symmetric binary fluid mixture near a solid wall. We present a novel
scheme for the thermodynamically consistent simulation of wetting phenomena, in
the presence of static and moving solid boundaries, and check its performance.
Comment: Submitted to Computer Physics Communications
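To give a flavour of what such a code iterates, here is a minimal serial D2Q9 BGK lattice-Boltzmann step in NumPy; `Ludwig' itself targets 3-D cubic lattices with free-energy-based models, so this 2-D sketch illustrates only the bare method:

```python
import numpy as np

# D2Q9 lattice: discrete velocities and quadrature weights.
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)

def equilibrium(rho, u):
    """Second-order equilibrium distribution for density rho, velocity u."""
    cu = np.einsum('qd,xyd->qxy', c, u)
    usq = np.einsum('xyd,xyd->xy', u, u)
    return rho * w[:, None, None] * (1.0 + 3.0 * cu + 4.5 * cu**2 - 1.5 * usq)

def lb_step(f, tau=0.8):
    """One BGK collision + periodic streaming step."""
    rho = f.sum(axis=0)
    u = np.einsum('qxy,qd->xyd', f, c) / rho[..., None]
    f += -(f - equilibrium(rho, u)) / tau              # collide
    for q, (cx, cy) in enumerate(c):                   # stream
        f[q] = np.roll(np.roll(f[q], cx, axis=0), cy, axis=1)
    return f

nx, ny = 32, 32
# Small density wave on a uniform background.
rho0 = 1.0 + 0.01 * np.cos(2 * np.pi * np.arange(nx) / nx)[:, None] * np.ones((1, ny))
f = equilibrium(rho0, np.zeros((nx, ny, 2)))
for _ in range(100):
    f = lb_step(f)
```

Mass and momentum are conserved by both collision and streaming. A parallel code such as `Ludwig' would decompose the lattice across MPI ranks and exchange boundary layers before streaming; the per-site arithmetic stays the same.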