Analyzing and Modeling the Performance of the HemeLB Lattice-Boltzmann Simulation Environment
We investigate the performance of the HemeLB lattice-Boltzmann simulator for
cerebrovascular blood flow, aimed at providing timely and clinically relevant
assistance to neurosurgeons. HemeLB is optimised for sparse geometries,
supports interactive use, and scales well to 32,768 cores for problems with ~81
million lattice sites. We obtain a maximum performance of 29.5 billion site
updates per second, with only an 11% slowdown for highly sparse problems (5%
fluid fraction). We present steering and visualisation performance measurements
and provide a model which allows users to predict the performance, thereby
determining how to run simulations with maximum accuracy within time
constraints.
Comment: Accepted by the Journal of Computational Science. 33 pages, 16 figures, 7 tables.
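The figures quoted in this abstract allow a rough back-of-envelope estimate of time per lattice update. The sketch below is not the performance model from the paper; it only divides the quoted numbers. It assumes that the peak rate of 29.5 billion site updates per second applies to the ~81 million site problem, and that the "11% slowdown" means an 11% lower update rate. Both are assumptions, and the function names are hypothetical.

```python
# Back-of-envelope arithmetic using only numbers quoted in the abstract.
# This is NOT the paper's performance model; names here are hypothetical.

SITES = 81e6            # ~81 million lattice sites (quoted)
PEAK_RATE = 29.5e9      # site updates per second (quoted maximum)
CORES = 32_768          # largest core count mentioned (quoted)
SPARSE_SLOWDOWN = 0.11  # quoted slowdown at 5% fluid fraction


def seconds_per_step(sites, rate):
    """Wall-clock time for one update of every lattice site."""
    return sites / rate


def steps_within(budget_seconds, sites, rate):
    """Whole number of time steps that fit into a wall-clock budget."""
    return int(budget_seconds * rate / sites)


def sparse_rate(rate, slowdown=SPARSE_SLOWDOWN):
    """Assumption: the slowdown is read as a proportional drop in rate."""
    return rate * (1.0 - slowdown)


if __name__ == "__main__":
    print("seconds per step:", seconds_per_step(SITES, PEAK_RATE))
    print("updates per core per second:", PEAK_RATE / CORES)
    print("steps in 60 s:", steps_within(60, SITES, PEAK_RATE))
    print("steps in 60 s (sparse):", steps_within(60, SITES, sparse_rate(PEAK_RATE)))
```

Under these assumptions, one step over the whole lattice takes a few milliseconds, which is the kind of estimate a user would need when trading resolution against a fixed time budget.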
Multi-Architecture Monte-Carlo (MC) Simulation of Soft Coarse-Grained Polymeric Materials: SOft coarse grained Monte-carlo Acceleration (SOMA)
Multi-component polymer systems are important for the development of new
materials because of their ability to phase-separate or self-assemble into
nano-structures. The Single-Chain-in-Mean-Field (SCMF) algorithm in conjunction
with a soft, coarse-grained polymer model is an established technique to
investigate these soft-matter systems. Here we present an implementation of
this method: SOft coarse grained Monte-carlo Acceleration (SOMA). It is
suitable to simulate large system sizes with up to billions of particles, yet
versatile enough to study properties of different kinds of molecular
architectures and interactions. We achieve efficient simulations by
using accelerators such as GPUs on both workstations and
supercomputers. The implementation remains flexible and maintainable because
it is written in a scientific programming language enhanced by
OpenACC pragmas for the accelerators. We present implementation details and
features of the program package, investigate the scalability of our
implementation SOMA, and discuss two applications, which cover system sizes
that are difficult to reach with other, common particle-based simulation
methods.
ParMooN - a modernized program package based on mapped finite elements
ParMooN is a program package for the numerical solution of elliptic and
parabolic partial differential equations. It inherits the distinct features of
its predecessor MooNMD [JM04]: strict decoupling of geometry and
finite element spaces, implementation of mapped finite elements as they are
defined in textbooks, and a geometric multigrid preconditioner
with the option to use different finite element spaces on different levels of
the multigrid hierarchy. After having presented some thoughts about in-house
research codes, this paper focuses on aspects of the parallelization for a
distributed memory environment, which is the main novelty of ParMooN.
Numerical studies, performed on compute servers, assess the efficiency of the
parallelized geometric multigrid preconditioner in comparison with some
parallel solvers that are available in the library PETSc. The results of
these studies give a first indication of whether the cumbersome implementation of
the parallelized geometric multigrid method was worthwhile or not.
Comment: partly supported by European Union (EU), Horizon 2020, Marie
Skłodowska-Curie Innovative Training Networks (ITN-EID), MIMESIS, grant
number 67571
Lattice Resistance and Peierls Stress in Finite-size Atomistic Dislocation Simulations
Atomistic computations of the Peierls stress in fcc metals are relatively
scarce. By way of contrast, there are many more atomistic computations for bcc
metals, as well as mixed discrete-continuum computations of the Peierls-Nabarro
type for fcc metals. One of the reasons for this is the low Peierls stresses in
fcc metals. Because atomistic computations of the Peierls stress take place in
finite simulation cells, image forces caused by boundaries must either be
relaxed or corrected for if system size independent results are to be obtained.
One of the approaches that has been developed for treating such boundary forces
is to compute them directly and subsequently subtract their effects, as
developed by V. B. Shenoy and R. Phillips [Phil. Mag. A, 76 (1997) 367]. That
work was primarily analytic, and limited to screw dislocations and special
symmetric geometries. We extend that work to edge and mixed dislocations, and
to arbitrary two-dimensional geometries, through a numerical finite element
computation. We also describe a method for estimating the boundary forces
directly on the basis of atomistic calculations. We apply these methods to the
numerical measurement of the Peierls stress and lattice resistance curves for a
model aluminum (fcc) system using an embedded-atom potential.
Comment: LaTeX 47 pages including 20 figures
Analysis of Incomplete Data and an Intrinsic-Dimension Helly Theorem
The analysis of incomplete data is a long-standing challenge in practical statistics. When, as is typical, data objects are represented by points in R^d, incomplete data objects correspond to affine subspaces (lines or Δ-flats). With this motivation we study the problem of finding the minimum intersection radius r(L) of a set of lines or Δ-flats L: the least r such that there is a ball of radius r intersecting every flat in L. Known algorithms for finding the minimum enclosing ball for a point set (or clustering by several balls) do not easily extend to higher dimensional flats, primarily because “distances” between flats do not satisfy the triangle inequality. In this paper we show how to restore geometry (i.e., a substitute for the triangle inequality) to the problem, through a new analog of Helly’s theorem. This “intrinsic-dimension” Helly theorem states: for any family L of Δ-dimensional convex sets in a Hilbert space, there exist Δ + 2 sets L' ⊆ L such that r(L) ≤ 2r(L'). Based upon this we present an algorithm that computes a (1+ε)-core set L' ⊆ L, |L'| = O(Δ^4/ε), such that the ball centered at a point c with radius (1+ε)r(L') intersects every element of L. The running time of the algorithm is O(n^(Δ+1) d poly(Δ/ε)). For the case of lines or line segments (Δ = 1), the (expected) running time of the algorithm can be improved to O(n d poly(1/ε)). We note that the size of the core set depends only on the dimension of the input objects and is independent of the input size n and the dimension d of the ambient space.
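To make the quantity r(L) concrete for lines (Δ = 1), the following minimal sketch computes, for a given candidate center c, the smallest radius at which a ball around c intersects every line. This is only an upper bound on r(L), which is the minimum of that value over all centers; it is not the core-set algorithm of the paper. The function names and the parametrisation of a line as a point plus a direction are assumptions made for illustration.

```python
import math


def dist_point_line(c, p, u):
    """Euclidean distance from point c to the line {p + t*u : t real}.

    c, p, u are equal-length sequences of floats; u must be nonzero.
    """
    w = [ci - pi for ci, pi in zip(c, p)]
    uu = sum(ui * ui for ui in u)
    # Parameter of the orthogonal projection of c onto the line.
    t = sum(wi * ui for wi, ui in zip(w, u)) / uu
    return math.sqrt(sum((wi - t * ui) ** 2 for wi, ui in zip(w, u)))


def radius_needed(c, lines):
    """Smallest radius of a ball centered at c that meets every line.

    For any c this value is at least r(L); r(L) is its minimum over c.
    """
    return max(dist_point_line(c, p, u) for p, u in lines)


# A record with one missing coordinate is an axis-parallel line:
# (x, 0, 0) with x unknown, and (0, y, 2) with y unknown.
lines = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 0.0, 2.0), (0.0, 1.0, 0.0)),
]

if __name__ == "__main__":
    print(radius_needed((0.0, 0.0, 1.0), lines))
    print(radius_needed((0.0, 0.0, 0.0), lines))
```

In this example the two lines are skew and lie at distance 2 from each other, so the midpoint of their common perpendicular is an optimal center.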