4,699 research outputs found
One machine, one minute, three billion tetrahedra
This paper presents a new scalable parallelization scheme to generate the 3D
Delaunay triangulation of a given set of points. Our first contribution is an
efficient serial implementation of the incremental Delaunay insertion
algorithm. A simple dedicated data structure, efficient point sorting, and an
optimized insertion algorithm allow us to accelerate reference implementations
by a factor of three. Our second contribution
is a multi-threaded version of the Delaunay kernel that is able to concurrently
insert vertices. Moore curve coordinates are used to partition the point set,
avoiding heavy synchronization overheads. Conflicts are managed by modifying
the partitions with a simple rescaling of the space-filling curve. The
performance of our implementation has been measured on three different
processors, an Intel Core i7, an Intel Xeon Phi and an AMD EPYC, on which we
have been able to compute 3 billion tetrahedra in 53 seconds. This corresponds
to a generation rate of over 55 million tetrahedra per second. We finally show
how this very efficient parallel Delaunay triangulation can be integrated into
a Delaunay refinement mesh generator that takes as input the triangulated
surface boundary of the volume to mesh.
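To illustrate the spatial-sorting idea, here is a minimal Python sketch that orders points along a space-filling curve before insertion; it uses Morton (Z-order) interleaving as a simpler stand-in for the Moore curve coordinates the paper describes, and all function names are ours, not the authors'.

```python
# Illustrative sketch: sort points along a space-filling curve before
# incremental Delaunay insertion. The paper uses Moore curve coordinates;
# Morton (Z-order) interleaving shown here is a simpler relative that
# conveys the same idea. Names and parameters are our own.

def morton_key(p, origin, scale, bits=10):
    """Interleave the bits of the quantized x, y, z coordinates."""
    key = 0
    coords = [int((c - o) * scale) & ((1 << bits) - 1)
              for c, o in zip(p, origin)]
    for b in range(bits):
        for axis in range(3):
            key |= ((coords[axis] >> b) & 1) << (3 * b + axis)
    return key

def spatial_sort(points):
    """Order points along the curve so that consecutive insertions are
    spatially close, which keeps Delaunay point location cheap."""
    xs, ys, zs = zip(*points)
    origin = (min(xs), min(ys), min(zs))
    extent = max(max(xs) - origin[0], max(ys) - origin[1],
                 max(zs) - origin[2]) or 1.0
    scale = ((1 << 10) - 1) / extent
    return sorted(points, key=lambda p: morton_key(p, origin, scale))
```

Consecutive points in this order land in nearby tetrahedra, so the walk performed during point location stays short; cutting the sorted array into contiguous chunks is also how a curve-based partition of the point set across threads can be obtained.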
Efficient data structures for masks on 2D grids
This article discusses various methods of representing and manipulating
arbitrary coverage information in two dimensions, with a focus on space- and
time-efficiency when processing such coverages, storing them on disk, and
transmitting them between computers. While these considerations were originally
motivated by the specific tasks of representing sky coverage and cross-matching
catalogues of astronomical surveys, they can be profitably applied in many
other situations as well.
Comment: accepted by A&A
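As a concrete, deliberately simplified example of the kind of representation the article weighs, the sketch below stores each grid row as a sorted list of half-open column intervals, so set operations reduce to interval merges. This is our illustration, not necessarily the scheme the article favours.

```python
# Minimal sketch of one mask representation: each grid row stored as
# sorted, disjoint half-open column intervals [start, end). This is an
# illustration of the general idea, not the article's data structure.

def row_union(a, b):
    """Union of two interval lists for one row, in O(n log n)."""
    merged = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def mask_union(mask_a, mask_b):
    """Union of two masks given as {row: interval list} dicts."""
    rows = set(mask_a) | set(mask_b)
    return {r: row_union(mask_a.get(r, []), mask_b.get(r, []))
            for r in rows}

# Example: two overlapping coverages on row 0
print(mask_union({0: [(2, 5)]}, {0: [(4, 9), (12, 14)]}))
# -> {0: [(2, 9), (12, 14)]}
```

A representation like this stays compact when coverage is made of large contiguous patches, which is the common case for survey footprints.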
Sequential Quasi-Monte Carlo
We derive and study SQMC (Sequential Quasi-Monte Carlo), a class of
algorithms obtained by introducing QMC point sets in particle filtering. SQMC
is related to, and may be seen as an extension of, the array-RQMC algorithm of
L'Ecuyer et al. (2006). The complexity of SQMC is $\mathcal{O}(N \log N)$,
where $N$ is the number of simulations at each iteration, and its error rate is
smaller than the Monte Carlo rate $\mathcal{O}_P(N^{-1/2})$. The only
requirement to implement SQMC is the ability to write the simulation of
particle $X_t^n$ given $X_{t-1}^n$ as a deterministic function of $X_{t-1}^n$
and a fixed number of uniform variates.
We show that SQMC is amenable to the same extensions as standard SMC, such as
forward smoothing, backward smoothing, unbiased likelihood evaluation, and so
on. In particular, SQMC may replace SMC within a PMCMC (particle Markov chain
Monte Carlo) algorithm. We establish several convergence results. We provide
numerical evidence that SQMC may significantly outperform SMC in practical
scenarios.
Comment: 55 pages, 10 figures (final version)
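The abstract's single implementation requirement can be made concrete with a toy model: below, a Gaussian random-walk transition is written as a deterministic function of the parent particle and one uniform variate via the inverse CDF. The model is our illustration, not one from the paper.

```python
# Minimal sketch of SQMC's implementation requirement: simulate particle
# x_t^n as a deterministic function of its parent x_{t-1}^n and a fixed
# number of uniform variates. The Gaussian random-walk model is our toy
# example; SQMC then feeds (R)QMC points in place of i.i.d. uniforms.
import numpy as np
from scipy.stats import norm

def transition(x_prev, u, sigma=1.0):
    """x_t = x_{t-1} + sigma * Phi^{-1}(u): one uniform per particle."""
    return x_prev + sigma * norm.ppf(u)

# With i.i.d. uniforms this is ordinary SMC propagation; replacing `u`
# with coordinates of a low-discrepancy point set gives the QMC variant.
rng = np.random.default_rng(0)
x = np.zeros(4)                            # four particles at time t-1
print(transition(x, rng.uniform(size=4)))  # propagated particles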
Reordering Rows for Better Compression: Beyond the Lexicographic Order
Sorting database tables before compressing them improves the compression
rate. Can we do better than the lexicographical order? For minimizing the
number of runs in a run-length encoding compression scheme, the best approaches
to row-ordering are derived from traveling salesman heuristics, although there
is a significant trade-off between running time and compression. A new
heuristic, Multiple Lists, a variant of Nearest Neighbor that trades
off compression for a major running-time speedup, is a good option for very
large tables. However, for some compression schemes, it is more important to
generate long runs rather than few runs. For this case, another novel
heuristic, Vortex, is promising. We find that we can improve run-length
encoding by up to a factor of 3, whereas we can improve prefix coding by up to
80%: these gains are on top of the gains due to lexicographically sorting the
table. In a few cases, we prove that the new row reordering is within 10% of
optimal at minimizing the runs of identical values within columns.
Comment: to appear in ACM TODS
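The objective all of these heuristics target can be stated in a few lines of code: count the runs of identical values down each column for a given row order. The sketch below (our own, with a toy table) also shows the lexicographic baseline the new heuristics are measured against.

```python
# Sketch of the objective the row-reordering heuristics minimize: the
# total number of runs of identical values down each column. Fewer runs
# means better run-length encoding.

def total_runs(table):
    """Count runs of identical values in every column of a row-major table."""
    if not table:
        return 0
    ncols = len(table[0])
    runs = ncols  # each column starts with one run
    for prev, row in zip(table, table[1:]):
        runs += sum(row[c] != prev[c] for c in range(ncols))
    return runs

rows = [("b", 1), ("a", 2), ("a", 1), ("b", 2)]
print(total_runs(rows))          # arbitrary order -> 7 runs
print(total_runs(sorted(rows)))  # lexicographic baseline -> 6 runs
```

Traveling-salesman-style heuristics search for an order whose total column-wise Hamming distance between consecutive rows, and hence run count, is lower than this baseline.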
Efficiency of linked cell algorithms
The linked cell list algorithm is an essential part of molecular simulation
software, both molecular dynamics and Monte Carlo. Though it scales linearly
with the number of particles, there has been constant interest in increasing
its efficiency, because a large part of the CPU time is spent identifying the
interacting particles. Several recent publications have proposed improvements to the
algorithm and investigated their efficiency by applying them to particular
setups. In this publication we develop a general method to evaluate the
efficiency of these algorithms, which is mostly independent of the parameters
of the simulation, and test it for a number of linked cell list algorithms. We
also propose a combination of linked cell reordering and interaction sorting
that shows good efficiency across a broad range of simulation setups.
Comment: Submitted to Computer Physics Communications on 22 December 2009,
still awaiting a referee report
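For reference, the sketch below gives the textbook form of the linked cell list that such improvements start from: particles are binned into cells no smaller than the cutoff, and interaction partners are sought only in the 27 neighbouring cells. It is a generic, non-periodic version, not any particular variant studied in the publication.

```python
# A minimal linked cell list in its textbook form: bin particles into
# cubic cells no smaller than the cutoff, then look for interaction
# partners only in the neighbouring cells. Generic sketch; the paper's
# reordering and interaction-sorting refinements are not shown.
import itertools
from collections import defaultdict

def neighbor_pairs(points, box, cutoff):
    """Yield index pairs (i, j), i < j, with distance < cutoff
    (non-periodic cubic box [0, box)^3 for simplicity)."""
    ncell = max(1, int(box / cutoff))
    side = box / ncell
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(min(int(c / side), ncell - 1) for c in p)].append(i)
    for cell, members in cells.items():
        for d in itertools.product((-1, 0, 1), repeat=3):
            other = tuple(c + o for c, o in zip(cell, d))
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and sum((a - b) ** 2 for a, b in
                                     zip(points[i], points[j])) < cutoff ** 2:
                        yield i, j

pts = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (2.5, 2.5, 2.5)]
print(list(neighbor_pairs(pts, box=3.0, cutoff=0.5)))  # -> [(0, 1)]
```

The cost the publication's evaluation method quantifies is visible here: most of the work is the distance checks against non-interacting particles in neighbouring cells, which cell reordering and interaction sorting aim to reduce.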