Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs
Connected components and spanning forest are fundamental graph algorithms due
to their use in many important applications, such as graph clustering and image
segmentation. GPUs are an ideal platform for graph algorithms due to their high
peak performance and memory bandwidth. While there exist several GPU
connectivity algorithms in the literature, many design choices have not yet
been explored. In this paper, we explore various design choices in GPU
connectivity algorithms, including sampling, linking, and tree compression, for
both the static as well as the incremental setting. Our various design choices
lead to over 300 new GPU implementations of connectivity, many of which
outperform state-of-the-art. We present an experimental evaluation, and show
that we achieve an average speedup of 2.47x over existing static
algorithms. In the incremental setting, we achieve a throughput of up to 48.23
billion edges per second. Compared to state-of-the-art CPU implementations on a
72-core machine, we achieve a speedup of 8.26--14.51x for static connectivity
and 1.85--13.36x for incremental connectivity using a Tesla V100 GPU.
Work‐Efficient Parallel Union‐Find
The incremental graph connectivity (IGC) problem is to maintain a data structure that can quickly answer whether two given vertices in a graph are connected, while allowing more edges to be added to the graph. IGC is a fundamental problem and can be solved efficiently in the sequential setting using a solution to the classical union‐find problem. However, sequential solutions are not sufficient to handle modern‐day large, rapidly‐changing graphs where edge updates arrive at a very high rate. We present the first shared‐memory parallel data structure for union‐find (equivalently, IGC) that is both provably work‐efficient (i.e., performs no more work than the best sequential counterpart) and has polylogarithmic parallel depth. We also present a simpler algorithm with slightly worse theoretical properties, but which is easier to implement. Our experiments on large graph streams with various degree distributions show that the simpler algorithm has good practical performance, capable of processing hundreds of millions of edges per second using a 20‐core machine.
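For orientation, the sequential baseline mentioned in the abstract above (IGC via classical union‐find) can be sketched as follows. This is a minimal illustration, not the parallel data structure the paper presents; the class name `IncrementalConnectivity` and its method names are hypothetical, and union by size with path compression is one common choice among several.

```python
class IncrementalConnectivity:
    """Sequential union-find sketch for incremental graph connectivity.

    Hypothetical illustration only: not the paper's parallel algorithm.
    """

    def __init__(self, num_vertices):
        # Every vertex starts as the root of its own one-element tree.
        self.parent = list(range(num_vertices))
        self.size = [1] * num_vertices

    def find(self, x):
        # First pass: walk up to the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def add_edge(self, u, v):
        # Inserting edge (u, v) merges the two components (union by size).
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False  # already connected, nothing changes
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]
        return True

    def connected(self, u, v):
        # Two vertices are connected iff they share a root.
        return self.find(u) == self.find(v)
```

A typical use is to call `add_edge` for each arriving edge and `connected` for each query; both run in near-constant amortized time in this sequential setting.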
Parallel Batch-Dynamic Graph Connectivity
In this paper, we study batch parallel algorithms for the dynamic
connectivity problem, a fundamental problem that has received considerable
attention in the sequential setting. The most well known sequential algorithm
for dynamic connectivity is the elegant level-set algorithm of Holm, de
Lichtenberg and Thorup (HDT), which achieves $O(\log^2 n)$ amortized time per
edge insertion or deletion, and $O(\log n / \log \log n)$ time per query. We
design a parallel batch-dynamic connectivity algorithm that is work-efficient
with respect to the HDT algorithm for small batch sizes, and is asymptotically
faster when the average batch size is sufficiently large. Given a sequence of
batched updates, where $\Delta$ is the average batch size of all deletions, our
algorithm achieves $O(\log n \log(1 + n/\Delta))$ expected amortized work per
edge insertion and deletion and $O(\log^3 n)$ depth w.h.p. Our algorithm
answers a batch of $k$ connectivity queries in $O(k \log(1 + n/k))$ expected
work and $O(\log n)$ depth w.h.p. To the best of our knowledge, our algorithm
is the first parallel batch-dynamic algorithm for connectivity.
Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
On the genericity properties in networked estimation: Topology design and sensor placement
In this paper, we consider networked estimation of linear, discrete-time
dynamical systems monitored by a network of agents. In order to minimize the
power requirement at the (possibly, battery-operated) agents, we require that
the agents can exchange information with their neighbors only once per
dynamical system time-step, in contrast to consensus-based estimation where
the agents exchange information until they reach a consensus. It can be
verified that with this restriction on information exchange, measurement fusion
alone results in an unbounded estimation error at every such agent that does
not have an observable set of measurements in its neighborhood. To overcome
this challenge, state-estimate fusion has been proposed to recover the system
observability. However, we show that adding state-estimate fusion may not
recover observability when the system matrix is structured-rank ($S$-rank)
deficient.
In this context, we characterize the state-estimate fusion and measurement
fusion under both full $S$-rank and $S$-rank deficient system matrices.
Comment: submitted for IEEE journal publication