BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
In cryo-electron microscopy (EM), molecular structures are determined from
large numbers of projection images of individual particles. To harness the full
power of this single-molecule information, we use the Bayesian inference of EM
(BioEM) formalism. By ranking structural models using posterior probabilities
calculated for individual images, BioEM in principle addresses the challenge of
working with highly dynamic or heterogeneous systems not easily handled in
traditional EM reconstruction. However, the calculation of these posteriors for
large numbers of particles and models is computationally demanding. Here we
present highly parallelized, GPU-accelerated computer software that performs
this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI
parallelization combined with both CPU and GPU computing. The resulting BioEM
software scales nearly ideally both on pure CPU and on CPU+GPU architectures,
thus enabling Bayesian analysis of tens of thousands of images in a reasonable
time. The general mathematical framework and robust algorithms are not limited
to cryo-electron microscopy but can be generalized to electron tomography and
other imaging experiments.
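The ranking idea can be illustrated schematically: score each candidate model by accumulating a log-likelihood over all particle images, then pick the model with the highest total. The sketch below assumes a plain Gaussian noise model; the actual BioEM posterior additionally integrates over orientations, offsets, and normalization parameters, which are omitted here.

```python
import numpy as np

def log_posterior(models, images, sigma=1.0):
    """Schematic BioEM-style scoring: for each model, accumulate the
    log-likelihood of every particle image under a Gaussian noise model.
    (A real BioEM run also marginalizes over orientations, in-plane
    offsets, and normalization; those nuisance parameters are omitted.)"""
    scores = np.zeros(len(models))
    for m, model in enumerate(models):
        # residual between each image and the model projection (broadcast)
        resid = images - model
        scores[m] = -0.5 * np.sum(resid**2) / sigma**2
    return scores

# toy example: two candidate "projections" scored against noisy copies of one
rng = np.random.default_rng(0)
true_model = np.ones((8, 8))
wrong_model = np.zeros((8, 8))
imgs = true_model + 0.1 * rng.standard_normal((100, 8, 8))
scores = log_posterior(np.array([true_model, wrong_model]), imgs)
best = int(np.argmax(scores))  # index of the highest-posterior model
```

In the GPU-accelerated setting, the per-(model, image) scores are independent and therefore map naturally onto massive parallelism, which is what makes the CUDA/OpenMP/MPI decomposition effective.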
A massively parallel semi-Lagrangian solver for the six-dimensional Vlasov-Poisson equation
This paper presents an optimized and scalable semi-Lagrangian solver for the
Vlasov-Poisson system in six-dimensional phase space. Grid-based solvers of the
Vlasov equation are known to give accurate results. At the same time, these
solvers face the curse of dimensionality, which results in very high memory
requirements and demands highly efficient parallelization schemes. In this
paper, we consider the 6d Vlasov-Poisson problem discretized
by a split-step semi-Lagrangian scheme, using successive 1d interpolations on
1d stripes of the 6d domain. Two parallelization paradigms are compared, a
remapping scheme and a classical domain decomposition approach applied to the
full 6d problem. From numerical experiments, the latter approach is found to be
superior in the massively parallel case in various respects. We address the
challenge of artificial time step restrictions due to the decomposition of the
domain by introducing a blocked one-sided communication scheme for the purely
electrostatic case and a rotating mesh for the case with a constant magnetic
field. In addition, we propose a pipelining scheme that efficiently hides the
cost of halo communication between neighboring processes behind useful
computation. Parallel scalability on up to 65k processes is demonstrated for
benchmark problems on a supercomputer.
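The building block of the split-step scheme is a 1d semi-Lagrangian update along one direction: trace the characteristics back over one time step and interpolate the distribution function at the departure points. A minimal sketch, using linear interpolation on a periodic grid (production solvers typically use higher-order interpolation and apply this along 1d stripes of the 6d domain):

```python
import numpy as np

def semi_lagrangian_1d(f, v, dt, dx):
    """One split-step update along a single dimension: trace characteristics
    back by v*dt and interpolate f at the departure points. Linear
    interpolation on a periodic grid is used here for simplicity."""
    n = f.size
    # departure points in grid units, wrapped periodically
    shift = v * dt / dx
    x_dep = (np.arange(n) - shift) % n
    i0 = np.floor(x_dep).astype(int)
    w = x_dep - i0
    i1 = (i0 + 1) % n
    return (1.0 - w) * f[i0] + w * f[i1]

# advect a bump by exactly one cell: the result is a cyclic shift of f
f = np.zeros(16)
f[4] = 1.0
f_new = semi_lagrangian_1d(f, v=1.0, dt=1.0, dx=1.0)
```

Because the interpolation stencil for a departure point may reach into a neighboring subdomain, the width of the halo region grows with `v*dt`, which is exactly the artificial time-step restriction the blocked one-sided communication scheme addresses.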
The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis
We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis, and protein structure prediction. Individual tools can be seamlessly chained into pipelines, allowing the user to process complex workflows conveniently without having to handle format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society, available at (click 'Start Toolkit').
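The chaining idea can be sketched abstractly: each step consumes the previous step's output, with any required format conversion handled by the framework rather than the user. All function names below are hypothetical placeholders for illustration, not the portal's actual API.

```python
def run_pipeline(data, steps):
    """Minimal sketch of tool chaining: each (converter, tool) pair first
    adapts the previous step's output format, then runs the tool. Names
    here are hypothetical placeholders, not the MIGenAS API."""
    for convert, tool in steps:
        data = tool(convert(data))
    return data

def search(seqs):
    # toy "similarity search": keep sequences containing a motif
    return [s for s in seqs if "AC" in s]

def align(seqs):
    # toy "alignment" stand-in: order sequences by length
    return sorted(seqs, key=len)

def identity(x):
    # trivial converter: formats already match
    return x

result = run_pipeline(["ACGT", "TTTT", "AC"],
                      [(identity, search), (identity, align)])
```

The value of centralizing the converters in the framework is that a user composing a new pipeline never touches intermediate file formats, which is the convenience the portal advertises.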
A Hybrid MPI-OpenMP Parallel Implementation for pseudospectral simulations with application to Taylor-Couette Flow
A hybrid-parallel direct-numerical-simulation method with application to
turbulent Taylor-Couette flow is presented. The Navier-Stokes equations are
discretized in cylindrical coordinates with the spectral Fourier-Galerkin
method in the axial and azimuthal directions, and high-order finite differences
in the radial direction. Time is advanced by a second-order, semi-implicit
projection scheme, which requires the solution of five Helmholtz/Poisson
equations, avoids staggered grids, and yields very small slip velocities.
Nonlinear terms are computed with the pseudospectral method. The code is
parallelized using a hybrid MPI-OpenMP strategy, which is simpler to
implement, reduces inter-node communication, and is more efficient than a flat
MPI parallelization. A strong scaling study shows that the hybrid code
maintains very good scalability up to more than 20,000 processor cores, thus
allowing simulations at higher resolutions than previously feasible and
opening up the possibility of simulating turbulent Taylor-Couette flows at
Reynolds numbers up to . This makes it possible to probe hydrodynamic
turbulence in Keplerian flows in experimentally relevant regimes.
(30 pages, 11 figures.)
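In the periodic Fourier-Galerkin directions, each of the Helmholtz/Poisson solves reduces to a pointwise division in spectral space. A minimal 1d sketch of that ingredient (the actual solver couples such spectral directions with high-order finite differences in the radial direction, which this toy version does not attempt):

```python
import numpy as np

def solve_helmholtz_fourier(f, lam, L=2 * np.pi):
    """Solve (d^2/dx^2 - lam) u = f on a periodic domain with a Fourier
    method: each mode decouples, so the solve is a division by the
    symbol -(k^2) - lam. lam > 0 keeps the operator invertible."""
    n = f.size
    k = np.fft.fftfreq(n, d=L / n) * 2 * np.pi  # wavenumbers
    u_hat = np.fft.fft(f) / (-(k**2) - lam)
    return np.real(np.fft.ifft(u_hat))

# verify against an analytic solution:
# u = sin(x) satisfies u'' - lam*u = -(1 + lam) * sin(x)
n, lam = 64, 2.0
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
f = -(1.0 + lam) * np.sin(x)
u = solve_helmholtz_fourier(f, lam)
err = np.max(np.abs(u - np.sin(x)))
```

Because the Fourier modes decouple, these solves parallelize cleanly over wavenumbers, which is one reason the hybrid MPI-OpenMP layout works well here.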
An Efficient Particle Tracking Algorithm for Large-Scale Parallel Pseudo-Spectral Simulations of Turbulence
Particle tracking in large-scale numerical simulations of turbulent flows
presents one of the major bottlenecks in parallel performance and scaling
efficiency. Here, we describe a particle tracking algorithm for large-scale
parallel pseudo-spectral simulations of turbulence which scales well up to
billions of tracer particles on modern high-performance computing
architectures. We summarize the standard parallel methods used to solve the
fluid equations in our hybrid MPI/OpenMP implementation. As the main focus, we
describe the implementation of the particle tracking algorithm and document its
computational performance. To address the extensive inter-process communication
required by particle tracking, we introduce a task-based approach to overlap
point-to-point communications with computations, thereby enabling improved
resource utilization. We characterize the computational cost as a function of
the number of particles tracked and compare it with the flow field computation,
showing that the cost of particle tracking is very small for typical
applications.
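The per-step tracer update can be sketched in 1d: interpolate the grid velocity to each particle position and advance the particles in time. This toy version uses linear interpolation and an explicit Euler step on a single process; the algorithm described above uses higher-order interpolation, a multi-stage integrator, and task-based overlap of the associated point-to-point communication with computation, none of which is shown here.

```python
import numpy as np

def advance_tracers(pos, vel_grid, dt, L):
    """Sketch of a tracer-update kernel: interpolate the grid velocity to
    each particle position (linear, periodic) and take an Euler step."""
    n = vel_grid.size
    dx = L / n
    x = (pos / dx) % n
    i0 = np.floor(x).astype(int)
    w = x - i0
    i1 = (i0 + 1) % n
    v = (1 - w) * vel_grid[i0] + w * vel_grid[i1]
    return (pos + dt * v) % L

# uniform velocity field: every tracer moves by v*dt
pos = np.array([0.0, 1.0, 2.5])
new = advance_tracers(pos, vel_grid=np.full(8, 0.5), dt=1.0, L=8.0)
```

In a domain-decomposed run, the interpolation stencil of a particle near a subdomain boundary needs velocity data owned by a neighboring rank, which is the communication the task-based approach hides behind computation.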
Spherically Symmetric Simulation with Boltzmann Neutrino Transport of Core Collapse and Post-Bounce Evolution of a 15 Solar Mass Star
We present a spherically symmetric, Newtonian core-collapse simulation of a
15 solar mass star with a 1.28 solar mass iron core. The time-, energy-, and
angle-dependent transport of electron neutrinos and antineutrinos was treated
with a new code which iteratively solves the Boltzmann equation and the
equations for neutrino number, energy and momentum to order O(v/c) in the
velocity v of the stellar medium. The supernova shock expands to a maximum
radius of 350 km instead of only about 240 km as in a comparable calculation
with multi-group flux-limited diffusion (MGFLD) by Bruenn, Mezzacappa, & Dineva
(1995). This may be explained by stronger neutrino heating due to the more
accurate transport in our model. Nevertheless, after 180 ms of expansion the
shock finally recedes to a radius around 250 km (compared to about 170 km in
the MGFLD run). The effect of an accurate neutrino transport is helpful, but
not large enough to cause an explosion of the considered 15 solar mass star.
Therefore postshock convection and/or an enhancement of the core neutrino
luminosity by convection or reduced neutrino opacities in the neutron star seem
necessary for neutrino-driven explosions of such stars. We find an electron
fraction Y_e > 0.5 in the neutrino-heated matter, which suggests that the
overproduction problem of neutron-rich nuclei with mass numbers around A = 90
in exploding models may be absent when a Boltzmann solver is used for the
electron neutrino and antineutrino transport.
(6 pages, LaTeX, 3 encapsulated PostScript figures; revised and shortened
version. Accepted by Astrophysical Journal Letters.)
Electron inertia effects in 3D hybrid-kinetic collisionless plasma turbulence
The effects of the electron inertia on the current sheets that are formed out
of kinetic turbulence are relevant to understand the importance of coherent
structures in turbulence and the nature of turbulence at the dissipation
scales. We investigate this problem by carrying out 3D hybrid-kinetic
Particle-in-Cell (PIC) simulations of decaying kinetic turbulence with our
CHIEF code. The main distinguishing feature of this code is an implementation
of the electron inertia without approximations. Our simulation results show
that the electron inertia plays an important role in regulating and limiting
the largest values of the current density in both real space and Fourier
space, in particular near and, unexpectedly, even above electron scales. In
addition, the electric field associated with the electron inertia dominates most
of the strongest current sheets. The electron inertia is thus important to
accurately describe the properties of current sheets formed in turbulence at
electron scales.
(34 pages, 10 figures; revised version. Published in Physics of Plasmas.)
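A core ingredient of any particle-in-cell code is the deposition of particle currents onto the grid. The schematic 1d cloud-in-cell version below illustrates only that generic scatter step; CHIEF itself is 3d, hybrid-kinetic, and includes the electron inertia terms discussed above, none of which appear in this sketch.

```python
import numpy as np

def deposit_current_cic(pos, vel, q, n_cells, dx):
    """Cloud-in-cell deposition of particle current onto a periodic 1d
    grid: each particle's contribution q*v is shared between its two
    neighboring nodes with linear weights."""
    j = np.zeros(n_cells)
    x = (pos / dx) % n_cells
    i0 = np.floor(x).astype(int)
    w = x - i0
    i1 = (i0 + 1) % n_cells
    # unbuffered in-place adds handle particles landing in the same cell
    np.add.at(j, i0, q * vel * (1 - w) / dx)
    np.add.at(j, i1, q * vel * w / dx)
    return j

# one particle sitting exactly on node 2 deposits all its current there
j = deposit_current_cic(np.array([2.0]), np.array([1.0]),
                        q=1.0, n_cells=8, dx=1.0)
```

The deposited current density is what feeds the field solve, so the current sheets analyzed in the paper are diagnosed from exactly this kind of grid quantity, in both real and Fourier space.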