4 research outputs found
Parallel Working-Set Search Structures
In this paper we present two versions of a parallel working-set map on p
processors that supports searches, insertions and deletions. In both versions,
the total work of all operations when the map has size at least p is bounded by
the working-set bound, i.e., the cost of an item depends on how recently it was
accessed (for some linearization): accessing an item in the map with recency r
takes O(1+log r) work. In the simpler version each map operation has O((log
p)^2+log n) span (where n is the maximum size of the map). In the pipelined
version each map operation on an item with recency r has O((log p)^2+log r)
span. (Operations in parallel may have overlapping span; span is additive only
for operations in sequence.)
Both data structures are designed to be used by a dynamic multithreading
parallel program that at each step executes a unit-time instruction or makes a
data structure call. To achieve the stated bounds, the pipelined data structure
requires a weak-priority scheduler, which supports a limited form of 2-level
prioritization. At the end we explain how the results translate to practical
implementations using work-stealing schedulers.
To the best of our knowledge, this is the first parallel implementation of a
self-adjusting search structure where the cost of an operation adapts to the
access sequence. A corollary of the working-set bound is that it achieves work
static optimality: the total work is bounded by the access costs in an optimal
static search tree.
Comment: Authors' version of a paper accepted to SPAA 201
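To illustrate the working-set bound stated in the abstract above, here is a minimal sequential sketch in Python. It is not the parallel structure from the paper. It assumes that the recency r of an access is the number of distinct items accessed since the previous access to the same item, counting the item itself, and it charges exactly 1 + log2(r) per access, dropping the constants hidden by the O-notation. The function names `recencies` and `working_set_cost` are hypothetical.

```python
import math

def recencies(accesses):
    """Recency of each access under the assumed definition.

    First-time accesses have no recency and are reported as None.
    """
    order = []   # items from most to least recently accessed
    result = []
    for item in accesses:
        if item in order:
            r = order.index(item) + 1   # position in recency order, 1-based
            order.remove(item)
        else:
            r = None                    # item not seen before
        order.insert(0, item)           # item is now the most recent
        result.append(r)
    return result

def working_set_cost(accesses):
    """Sum of 1 + log2(r) over all repeat accesses (illustrative charge only)."""
    return sum(1 + math.log2(r) for r in recencies(accesses) if r is not None)

sequence = ["a", "b", "a", "c", "b", "a"]
print(recencies(sequence))
print(working_set_cost(sequence))
```

Under this charge, a sequence that keeps returning to a few recently used items costs little per access no matter how large the map is, which is the adaptivity the abstract describes.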
Memetic Multilevel Hypergraph Partitioning
Hypergraph partitioning has a wide range of important applications such as
VLSI design or scientific computing. With focus on solution quality, we develop
the first multilevel memetic algorithm to tackle the problem. Key components of
our contribution are new effective multilevel recombination and mutation
operations that provide a large amount of diversity. We perform a wide range of
experiments on a benchmark set containing instances from application areas such
as VLSI, SAT solving, social networks, and scientific computing. Compared to the
state-of-the-art hypergraph partitioning tools hMetis, PaToH, and KaHyPar, our
new algorithm computes the best result on almost all instances.
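The general shape of a memetic algorithm (an evolutionary loop combined with local search) can be sketched in Python as follows. This is a flat toy for hypergraph bipartitioning, not the multilevel algorithm of the paper: the recombination, mutation, and refinement steps below are simple stand-ins chosen for illustration, and all names (`cut`, `local_search`, `memetic_bipartition`, `max_block`) are hypothetical.

```python
import random

def cut(hyperedges, part):
    """Number of hyperedges whose vertices lie in more than one block."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) > 1)

def balanced(part, max_block):
    """True if neither block exceeds the allowed size."""
    return part.count(0) <= max_block and part.count(1) <= max_block

def local_search(hyperedges, part, max_block):
    """Greedy single-vertex moves; a stand-in for multilevel refinement."""
    part = list(part)
    improved = True
    while improved:
        improved = False
        for v in range(len(part)):
            before = cut(hyperedges, part)
            part[v] ^= 1                      # tentatively move v to the other block
            if balanced(part, max_block) and cut(hyperedges, part) < before:
                improved = True               # keep the improving move
            else:
                part[v] ^= 1                  # undo the move
    return part

def random_partition(n, rng):
    """Random bipartition with block sizes differing by at most one."""
    part = [0] * (n // 2) + [1] * (n - n // 2)
    rng.shuffle(part)
    return part

def memetic_bipartition(hyperedges, n, max_block, generations=50, pop_size=6, seed=0):
    rng = random.Random(seed)
    population = [local_search(hyperedges, random_partition(n, rng), max_block)
                  for _ in range(pop_size)]
    for _ in range(generations):
        p1, p2 = rng.sample(population, 2)
        # Recombination: keep assignments both parents agree on, randomize the rest.
        child = [a if a == b else rng.randint(0, 1) for a, b in zip(p1, p2)]
        # Mutation: flip one random vertex to add diversity.
        child[rng.randrange(n)] ^= 1
        if not balanced(child, max_block):
            continue                          # discard infeasible offspring
        child = local_search(hyperedges, child, max_block)
        worst = max(population, key=lambda p: cut(hyperedges, p))
        if cut(hyperedges, child) <= cut(hyperedges, worst):
            population[population.index(worst)] = child
    return min(population, key=lambda p: cut(hyperedges, p))

edges = [[0, 1, 2], [3, 4, 5], [2, 3]]
best = memetic_bipartition(edges, n=6, max_block=3)
print(best, cut(edges, best))
```

The population keeps several locally optimal partitions at once, so recombination can merge good regions from different individuals instead of restarting from scratch.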
Load-Balancing for Parallel Delaunay Triangulations
Computing the Delaunay triangulation (DT) of a given point set in R^D
is one of the fundamental operations in computational geometry.
Recently, Funke and Sanders (2017) presented a divide-and-conquer DT algorithm
that merges two partial triangulations by re-triangulating a small subset of
their vertices - the border vertices - and combining the three triangulations
efficiently via parallel hash table lookups. The input point division should
therefore yield roughly equal-sized partitions for good load-balancing and also
result in a small number of border vertices for fast merging. In this paper, we
present a novel divide-step based on partitioning the triangulation of a small
sample of the input points. In experiments on synthetic and real-world data
sets, we achieve nearly perfectly balanced partitions and small border
triangulations. This almost cuts running time in half compared to
non-data-sensitive division schemes on inputs exhibiting an exploitable
underlying structure.
Comment: Short version submitted to EuroPar 201
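The idea of dividing the input according to a small sample can be sketched in Python as below. This is only an illustration under loud assumptions: the paper partitions the triangulation of the sample, whereas this sketch replaces that step with a plain median split of the sample along the x-axis, and it assumes that every input point joins the block of its nearest sample point. The function name `divide_by_sample` and its parameters are hypothetical.

```python
import random

def divide_by_sample(points, k_sample, seed=0):
    """Split 2D points into two blocks guided by a random sample."""
    rng = random.Random(seed)
    sample = rng.sample(points, k_sample)

    # Stand-in for partitioning the sample triangulation (assumption):
    # sort the sample by x and split it at the median.
    ordered = sorted(sample)
    half = len(ordered) // 2
    block_of = {p: (0 if i < half else 1) for i, p in enumerate(ordered)}

    # Assumed assignment rule: each point follows its nearest sample point.
    blocks = ([], [])
    for p in points:
        nearest = min(sample,
                      key=lambda s: (s[0] - p[0]) ** 2 + (s[1] - p[1]) ** 2)
        blocks[block_of[nearest]].append(p)
    return blocks

gen = random.Random(1)
pts = [(gen.random(), gen.random()) for _ in range(200)]
left, right = divide_by_sample(pts, k_sample=20)
print(len(left), len(right))
```

Because the split is derived from a sample of the data, dense regions receive more sample points and the blocks tend to follow the input distribution instead of a fixed grid.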