Optimized Block-Based Algorithms to Label Connected Components on GPUs

Author: Costantino Grana
Federico Bolelli
Stefano Allegretti
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

Connected Components Labeling (CCL) is a crucial step of several image processing and computer vision pipelines. Many efficient sequential strategies exist, among which one of the most effective is the use of a block-based mask to drastically cut the number of memory accesses. In the last decade, aided by the fast development of Graphics Processing Units (GPUs), many data-parallel CCL algorithms have been proposed alongside sequential ones. Applications that run entirely on the GPU can benefit from parallel implementations of CCL, which avoid expensive memory transfers between host and device. In this paper, two new eight-connectivity CCL algorithms are proposed, namely Block-based Union Find (BUF) and Block-based Komura Equivalence (BKE). These algorithms optimize existing GPU solutions by introducing a block-based approach. Extensions for three-dimensional datasets are also discussed. To produce a fair comparison with previously proposed alternatives, YACCLAB, a public CCL benchmarking framework, has been extended to also evaluate GPU algorithms. Moreover, three-dimensional datasets have been added to its collection. Experimental results on real cases and synthetically generated datasets demonstrate the superiority of the new proposals over the state of the art in both 2D and 3D scenarios.
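For orientation, the following is a minimal sequential sketch of eight-connectivity labeling with a union-find structure. It is not the BUF or BKE algorithm from the paper, and it is neither block-based nor GPU code; it only illustrates the baseline problem those algorithms solve. The function names (`find`, `union`, `label_components`) and the list-of-lists image format are assumptions made for this example.

```python
def find(parent, x):
    # Walk up to the root, halving the path on the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        # Keep the smaller index as root so the result is deterministic.
        if ra < rb:
            parent[rb] = ra
        else:
            parent[ra] = rb


def label_components(image):
    """Label non-zero pixels with 8-connectivity; background stays 0."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    parent = list(range(rows * cols))

    # Pass 1: merge each foreground pixel with its already-scanned
    # neighbours (left, upper-left, up, upper-right).
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc]:
                    union(parent, r * cols + c, nr * cols + nc)

    # Pass 2: map each root to a consecutive label, starting at 1.
    labels = [[0] * cols for _ in range(rows)]
    label_of_root = {}
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                root = find(parent, r * cols + c)
                if root not in label_of_root:
                    label_of_root[root] = len(label_of_root) + 1
                labels[r][c] = label_of_root[root]
    return labels
```

Calling `label_components` on a small binary image returns a grid of the same shape in which diagonally touching foreground pixels share a label.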

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

How to Extract the Geometry and Topology from Very Large 3D Segmentations

Author: Andres Bjoern
Hamprecht Fred A.
Koethe Ullrich
Kroeger Thorben
Publication date: 01/01/2010

Segmentation is often an essential intermediate step in image analysis. A volume segmentation characterizes the underlying volume image in terms of geometric information (segments, faces between segments, curves in which several faces meet) as well as a topology on these objects. Existing algorithms encode this information in designated data structures, but require that these data structures fit entirely in Random Access Memory (RAM). Today, 3D images with several billion voxels are acquired, e.g. in structural neurobiology. Since these large volumes can no longer be processed with existing methods, we present a new algorithm which performs geometry and topology extraction with a runtime linear in the number of voxels and log-linear in the number of faces and curves. The parallelizable algorithm proceeds in a block-wise fashion and constructs a consistent representation of the entire volume image on the hard drive, making the structure of very large volume segmentations accessible to image analysis. The parallelized C++ source code, free command line tools and MATLAB mex files are available from http://hci.iwr.uni-heidelberg.de/software.php

arXiv.org e-Print Archive

MPG.PuRe

Connected component identification and cluster update on GPU

Author: D. A. Bader
D. B. Kirk
D. Stauffer
J. E. Gentle
K. Binder
Martin Weigel
R. J. Baxter
T. H. Cormen
Publication venue: 'American Physical Society (APS)'
Publication date: 12/06/2011

Cluster identification tasks occur in a multitude of contexts in physics and engineering such as, for instance, cluster algorithms for simulating spin models, percolation simulations, segmentation problems in image processing, or network analysis. While it has been shown that graphics processing units (GPUs) can result in speedups of two to three orders of magnitude as compared to serial codes on CPUs for the case of local and thus naturally parallelized problems such as single-spin flip update simulations of spin models, the situation is considerably more complicated for the non-local problem of cluster or connected component identification. I discuss the suitability of different approaches of parallelization of cluster labeling and cluster update algorithms for calculations on GPU and compare to the performance of serial implementations.

Comment: 15 pages, 14 figures, one table, submitted to PR
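To make the non-local nature of cluster identification concrete, here is a small sketch of one commonly described data-parallel strategy, iterative minimum-label propagation on a square lattice with nearest-neighbour bonds. It is written as plain sequential Python that mimics a synchronous parallel sweep, and it is not claimed to be one of the approaches evaluated in the paper. The function name `propagate_labels` and the occupancy-grid input are assumptions for illustration.

```python
def propagate_labels(occupied):
    """Give every occupied site the smallest site index in its cluster.

    Unoccupied sites get -1. Each sweep reads only the labels of the
    previous sweep, as a synchronous parallel update would.
    """
    rows = len(occupied)
    cols = len(occupied[0]) if rows else 0
    labels = [
        [r * cols + c if occupied[r][c] else -1 for c in range(cols)]
        for r in range(rows)
    ]

    changed = True
    while changed:
        changed = False
        new_labels = [row[:] for row in labels]
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] < 0:
                    continue
                best = labels[r][c]
                # Nearest neighbours only (4-connectivity).
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] >= 0:
                        best = min(best, labels[nr][nc])
                if best != labels[r][c]:
                    new_labels[r][c] = best
                    changed = True
        labels = new_labels
    return labels
```

The number of sweeps grows with the longest path inside a cluster, since a label moves only one site per sweep. That dependence on cluster extent is what separates this problem from purely local updates.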

arXiv.org e-Print Archive

Crossref

Coventry University Pure Portal

Implicit Decomposition for Write-Efficient Connectivity Algorithms

Author: Ben-David Naama
Blelloch Guy E.
Fineman Jeremy T.
Gibbons Phillip B.
Gu Yan
McGuffey Charles
Shun Julian
Publication date: 07/10/2017

The future of main memory appears to lie in the direction of new technologies that provide strong capacity-to-performance ratios, but have write operations that are much more expensive than reads in terms of latency, bandwidth, and energy. Motivated by this trend, we propose sequential and parallel algorithms to solve graph connectivity problems using significantly fewer writes than conventional algorithms. Our primary algorithmic tool is the construction of an $o(n)$-sized "implicit decomposition" of a bounded-degree graph $G$ on $n$ nodes, which combined with read-only access to $G$ enables fast answers to connectivity and biconnectivity queries on $G$. The construction breaks the linear-write "barrier", resulting in costs that are asymptotically lower than conventional algorithms while adding only a modest cost to querying time. For general non-sparse graphs on $m$ edges, we also provide the first $o(m)$ writes and $O(m)$ operations parallel algorithms for connectivity and biconnectivity. These algorithms provide insight into how applications can efficiently process computations on large graphs in systems with read-write asymmetry.

arXiv.org e-Print Archive

Crossref

DSpace@MIT

An Algorithmic Framework for Labeling Road Maps

Author: A Gemsa
E Imhof
F Chirié
G Neyer
S Seibert
Publication date: 13/05/2016

Given an unlabeled road map, we consider, from an algorithmic perspective, the cartographic problem of placing non-overlapping road labels embedded in their roads. We first decompose the road network into logically coherent road sections, e.g., parts of roads between two junctions. Based on this decomposition, we present and implement a new and versatile framework for placing labels in road maps such that the number of labeled road sections is maximized. In an experimental evaluation with road maps of 11 major cities we show that our proposed labeling algorithm is fast in practice and reaches near-optimal solution quality, where optimal solutions are obtained by mixed-integer linear programming. In comparison to the standard OpenStreetMap renderer Mapnik, our algorithm labels 31% more road sections on average.

Comment: extended version of a paper to appear at GIScience 201

arXiv.org e-Print Archive

Crossref

OBDD-Based Representation of Interval Graphs

Author: B. Bollig
C. Meinel
D. Sawitzki
D. Sieling
D.B. Arnold
G.B. Mertzios
G.D. Hachtel
J.R. Burch
K. Hosaka
K. Meer
P. Woelfel
R. Bloem
R. Gentilini
R. Nunkesser
R.E. Bryant
T. Saitoh
Y.-T. Lai
Publication date: 01/01/2013

A graph $G = (V,E)$ can be described by the characteristic function of the edge set $\chi_E$, which maps a pair of binary encoded nodes to 1 iff the nodes are adjacent. Using Ordered Binary Decision Diagrams (OBDDs) to store $\chi_E$ can lead to a (hopefully) compact representation. Given the OBDD as an input, symbolic/implicit OBDD-based graph algorithms can solve optimization problems by mainly using functional operations, e.g. quantification or binary synthesis. While the OBDD representation size cannot be small in general, it can be provably small for special graph classes and then also lead to fast algorithms. In this paper, we show that the OBDD size of unit interval graphs is $O(|V| / \log |V|)$ and the OBDD size of interval graphs is $O(|V| \log |V|)$, which both improve a known result from Nunkesser and Woelfel (2009). Furthermore, we can show that using our variable order and node labeling for interval graphs the worst-case OBDD size is $\Omega(|V| \log |V|)$. We use the structure of the adjacency matrices to prove these bounds. This method may be of independent interest and can be applied to other graph classes. We also develop a maximum matching algorithm on unit interval graphs using $O(\log |V|)$ operations and a coloring algorithm for unit and general interval graphs using $O(\log^2 |V|)$ operations and evaluate the algorithms empirically.

Comment: 29 pages, accepted for 39th International Workshop on Graph-Theoretic Concepts 201
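As a rough illustration of the objects in this abstract, the sketch below builds the truth table of the edge-set characteristic function for a tiny graph and counts the inner nodes of its reduced OBDD. The count uses the standard characterization that the nodes for a variable correspond to the distinct subfunctions that still depend on it. The variable order (all bits of the first node, then all bits of the second, most significant bit first), the node numbering, and the function names `chi_E` and `obdd_size` are assumptions for this example, not the order or labeling used in the paper.

```python
from itertools import product


def chi_E(edges, num_bits):
    """Truth table of the edge-set characteristic function.

    Input variables are the bits of node x followed by the bits of
    node y, most significant bit first. Edges are undirected.
    """
    edge_set = {frozenset(e) for e in edges}
    table = []
    for bits in product((0, 1), repeat=2 * num_bits):
        x = int("".join(map(str, bits[:num_bits])), 2)
        y = int("".join(map(str, bits[num_bits:])), 2)
        table.append(1 if frozenset((x, y)) in edge_set else 0)
    return tuple(table)


def obdd_size(table):
    """Count inner nodes of the reduced OBDD for the table's variable order.

    After fixing the first variables, every subfunction is a contiguous
    chunk of the table. A chunk needs a node for the next variable only
    if its two halves differ, and equal chunks share one node.
    """
    total = 0
    width = len(table)
    while width >= 2:
        half = width // 2
        chunks = {table[i:i + width] for i in range(0, len(table), width)}
        total += sum(1 for chunk in chunks if chunk[:half] != chunk[half:])
        width = half
    return total


# Path 0 - 1 - 2 - 3, a unit interval graph on four nodes (2 bits each).
path_table = chi_E([(0, 1), (1, 2), (2, 3)], num_bits=2)
path_size = obdd_size(path_table)
```

This brute-force count is exponential in the number of bits, so it is only useful for checking small cases by hand. The bounds in the abstract concern how the size scales with $|V|$ under a well-chosen order.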

arXiv.org e-Print Archive

CiteSeerX

Crossref

Fast Computation of Small Cuts via Cycle Space Sampling

Author: Pritchard David
Thurimella Ramakrishna
Publication date: 21/07/2010

We describe a new sampling-based method to determine cuts in an undirected graph. For a graph (V, E), its cycle space is the family of all subsets of E that have even degree at each vertex. We prove that with high probability, sampling the cycle space identifies the cuts of a graph. This leads to simple new linear-time sequential algorithms for finding all cut edges and cut pairs (a set of 2 edges that form a cut) of a graph. In the model of distributed computing in a graph G=(V, E) with O(log V)-bit messages, our approach yields faster algorithms for several problems. The diameter of G is denoted by Diam, and the maximum degree by Delta. We obtain simple O(Diam)-time distributed algorithms to find all cut edges, 2-edge-connected components, and cut pairs, matching or improving upon previous time bounds. Under natural conditions these new algorithms are universally optimal, i.e. an Omega(Diam)-time lower bound holds on every graph. We obtain an O(Diam+Delta/log V)-time distributed algorithm for finding cut vertices; this is faster than the best previous algorithm when Delta, Diam = O(sqrt(V)). A simple extension of our work yields the first distributed algorithm with sub-linear time for 3-edge-connected components. The basic distributed algorithms are Monte Carlo, but they can be made Las Vegas without increasing the asymptotic complexity. In the model of parallel computing on the EREW PRAM our approach yields a simple algorithm with optimal time complexity O(log V) for finding cut pairs and 3-edge-connected components.

Comment: Previous version appeared in Proc. 35th ICALP, pages 145--160, 200
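The sequential cut-edge idea can be sketched as follows, under stated assumptions. Give every non-tree edge of a spanning tree a random bit string, and give each tree edge the XOR of the strings of the non-tree edges whose fundamental cycle passes through it. An edge lying on no cycle always ends up with the all-zero string, while any other edge is zero only with probability 2^-bits. The function name `cut_edges`, the edge-list input, the 64-bit default and the assumption of a connected graph on vertices 0..n-1 are choices made for this illustration and are not taken from the paper.

```python
import random


def cut_edges(n, edges, bits=64, seed=0):
    """Monte Carlo detection of cut edges in a connected undirected graph."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    # Build any spanning tree from vertex 0; parents are visited first.
    parent = [None] * n
    parent_edge = [None] * n
    visited = [False] * n
    visited[0] = True
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, idx in adj[u]:
            if not visited[v]:
                visited[v] = True
                parent[v], parent_edge[v] = u, idx
                stack.append(v)
    tree = {idx for idx in parent_edge if idx is not None}

    # Random label on each non-tree edge, accumulated at both endpoints.
    phi = [0] * len(edges)
    acc = [0] * n
    for idx, (u, v) in enumerate(edges):
        if idx not in tree:
            phi[idx] = rng.getrandbits(bits)
            acc[u] ^= phi[idx]
            acc[v] ^= phi[idx]

    # Children before parents: the XOR over a subtree is exactly the XOR
    # of non-tree edges leaving it, which is the label of its parent edge.
    for v in reversed(order):
        if parent[v] is not None:
            phi[parent_edge[v]] = acc[v]
            acc[parent[v]] ^= acc[v]

    return [edges[i] for i in range(len(edges)) if phi[i] == 0]
```

With 64-bit labels a false positive is extremely unlikely, though, as the abstract notes for the basic algorithms, the method is Monte Carlo and not exact.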

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne