
    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly available real-world graph (the Hyperlink Web graph, with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results on much smaller graphs, and the results reported for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly available as the Graph-Based Benchmark Suite (GBBS). Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2018
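
    To make the shared-memory approach concrete, the sketch below shows a frontier-based parallel breadth-first search over a compressed-sparse-row graph, the kind of primitive the benchmarked algorithms are built from. It is a minimal illustration in C++ with OpenMP (both choices are assumptions made here; GBBS itself uses its own parallel primitives and scheduler), not the paper's code.

        // Minimal sketch of a frontier-based parallel BFS (illustrative only).
        #include <atomic>
        #include <cstdint>
        #include <vector>
        #include <omp.h>

        struct CSRGraph {
          std::vector<std::int64_t> offsets;  // size n+1; offsets into 'edges'
          std::vector<std::int32_t> edges;    // size m; concatenated adjacency lists
        };

        // Returns the BFS parent of every vertex (-1 if unreachable).
        std::vector<std::int32_t> parallel_bfs(const CSRGraph& g, std::int32_t source) {
          const std::int32_t n = static_cast<std::int32_t>(g.offsets.size()) - 1;
          std::vector<std::atomic<std::int32_t>> parent(n);
          for (auto& p : parent) p.store(-1, std::memory_order_relaxed);
          parent[source].store(source, std::memory_order_relaxed);

          std::vector<std::int32_t> frontier = {source};
          while (!frontier.empty()) {
            // Each thread collects newly claimed vertices in a local buffer;
            // the buffers are concatenated to form the next frontier.
            std::vector<std::vector<std::int32_t>> local(omp_get_max_threads());
        #pragma omp parallel for schedule(dynamic, 64)
            for (std::int64_t i = 0; i < static_cast<std::int64_t>(frontier.size()); ++i) {
              std::int32_t u = frontier[i];
              auto& buf = local[omp_get_thread_num()];
              for (std::int64_t e = g.offsets[u]; e < g.offsets[u + 1]; ++e) {
                std::int32_t v = g.edges[e];
                std::int32_t expected = -1;
                // Compare-and-swap: exactly one parent claims each vertex.
                if (parent[v].compare_exchange_strong(expected, u)) buf.push_back(v);
              }
            }
            std::vector<std::int32_t> next;
            for (auto& buf : local) next.insert(next.end(), buf.begin(), buf.end());
            frontier.swap(next);
          }
          std::vector<std::int32_t> out(n);
          for (std::int32_t v = 0; v < n; ++v) out[v] = parent[v].load();
          return out;
        }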

    Optimal randomized incremental construction for guaranteed logarithmic planar point location

    Given a planar map of n segments in which we wish to efficiently locate points, we present the first randomized incremental construction of the well-known trapezoidal-map search structure that requires only expected O(n log n) preprocessing time while deterministically guaranteeing worst-case linear storage space and worst-case logarithmic query time. This settles a long-standing open problem; the best previously known construction time for such a structure, which is based on a directed acyclic graph (the so-called history DAG) and provides the above worst-case space and query-time guarantees, was expected O(n log^2 n). The result is based on a deeper understanding of the structure of the history DAG, of its depth in relation to the length of its longest search path, and of its correspondence to the trapezoidal search tree. Our results immediately extend to planar maps induced by finite collections of pairwise interior-disjoint well-behaved curves. Comment: The article significantly extends the theoretical aspects of the work presented in http://arxiv.org/abs/1205.543
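
    To make the terminology concrete, the schematic sketch below (an illustration written for this summary, not the paper's construction) shows how a query point descends the trapezoidal-map history DAG: point (x-) nodes split by x-coordinate, segment (y-) nodes split by above/below, and leaves are trapezoids. The number of loop iterations is exactly the length of the query's search path, the quantity whose worst-case logarithmic bound the construction guarantees.

        // Schematic query descent in a trapezoidal-map history DAG.
        struct Point { double x, y; };
        struct Segment { Point left, right; };   // assumed oriented left to right

        // True if p lies strictly above segment s (orientation/cross-product test).
        bool above(const Segment& s, const Point& p) {
          double cross = (s.right.x - s.left.x) * (p.y - s.left.y) -
                         (s.right.y - s.left.y) * (p.x - s.left.x);
          return cross > 0;
        }

        struct Node {
          enum Kind { XNode, YNode, Leaf } kind;
          Point point;                  // endpoint stored at an x-node
          Segment segment;              // segment stored at a y-node
          int trapezoid_id = -1;        // stored at a leaf
          const Node* left = nullptr;   // "left of point" / "below segment" child
          const Node* right = nullptr;  // "right of point" / "above segment" child
        };

        // Returns the trapezoid containing q; the loop follows q's search path.
        int locate(const Node* root, const Point& q) {
          const Node* v = root;
          while (v->kind != Node::Leaf) {
            if (v->kind == Node::XNode)
              v = (q.x < v->point.x) ? v->left : v->right;
            else
              v = above(v->segment, q) ? v->right : v->left;
          }
          return v->trapezoid_id;
        }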

    Improved Implementation of Point Location in General Two-Dimensional Subdivisions

    We present a major revamp of the point-location data structure for general two-dimensional subdivisions via randomized incremental construction, implemented in CGAL, the Computational Geometry Algorithms Library. We can now guarantee that the constructed directed acyclic graph G is of linear size and provides logarithmic query time. Via the construction of the Voronoi diagram for a given point set S of size n, this also enables nearest-neighbor queries in guaranteed O(log n) time. Another major innovation is the support of general unbounded subdivisions as well as subdivisions of two-dimensional parametric surfaces such as spheres, tori, and cylinders. The implementation is exact, complete, and general, i.e., it can also handle non-linear subdivisions. Like the previous version, the data structure supports modifications of the subdivision, such as insertions and deletions of edges, after the initial preprocessing. A major challenge is to retain the expected O(n log n) preprocessing time while providing the above (deterministic) space and query-time guarantees. We describe an efficient preprocessing algorithm, which explicitly verifies the length L of the longest query path in O(n log n) time. However, instead of using L, our implementation is based on the depth D of G. Although we prove that the worst-case ratio of D to L is Theta(n / log n), we conjecture, based on our experimental results, that this solution achieves expected O(n log n) preprocessing time. Comment: 21 pages
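
    For orientation, the sketch below shows roughly how this trapezoidal-map (RIC) point-location strategy is used through CGAL's 2D Arrangements interface. It is a usage sketch only: exact header names, the variant type returned by locate(), and defaults differ between CGAL versions, so treat the details as assumptions rather than the paper's code.

        // Usage sketch: RIC point location on a CGAL 2D arrangement.
        #include <CGAL/Exact_predicates_exact_constructions_kernel.h>
        #include <CGAL/Arr_segment_traits_2.h>
        #include <CGAL/Arrangement_2.h>
        #include <CGAL/Arr_trapezoid_ric_point_location.h>
        #include <vector>

        using Kernel      = CGAL::Exact_predicates_exact_constructions_kernel;
        using Traits      = CGAL::Arr_segment_traits_2<Kernel>;
        using Point       = Traits::Point_2;
        using Segment     = Traits::Curve_2;
        using Arrangement = CGAL::Arrangement_2<Traits>;
        using TrapezoidPL = CGAL::Arr_trapezoid_ric_point_location<Arrangement>;

        int main() {
          // A tiny subdivision: the four sides of a square.
          std::vector<Segment> segs = {
              Segment(Point(0, 0), Point(4, 0)), Segment(Point(4, 0), Point(4, 4)),
              Segment(Point(4, 4), Point(0, 4)), Segment(Point(0, 4), Point(0, 0))};
          Arrangement arr;
          CGAL::insert(arr, segs.begin(), segs.end());

          // Attaching the point-location object runs the randomized incremental
          // construction of the search structure (the DAG G discussed above).
          TrapezoidPL pl(arr);

          // Each query descends G; the data structure guarantees linear size and
          // logarithmic query time. The result is a discriminated union over
          // vertex / halfedge / face handles of the arrangement.
          auto location = pl.locate(Point(2, 2));
          (void)location;
          return 0;
        }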

    Shared Memory Parallel Subgraph Enumeration

    The subgraph enumeration problem asks us to find all subgraphs of a target graph that are isomorphic to a given pattern graph. Determining whether even one such isomorphic subgraph exists is NP-complete, and therefore finding all such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration has applications in many fields, including biochemistry and social networks, and, interestingly, the fastest algorithms for solving the problem on biochemical inputs are sequential. Since they depend on depth-first tree traversal, an efficient parallelization is far from trivial. Nevertheless, since important applications produce data sets of increasing difficulty, parallelism seems beneficial. We thus present a shared-memory parallelization of the state-of-the-art subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs) by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing, and our implementation demonstrates a significant speedup on real-world biochemical data, despite a highly irregular data access pattern. We also improve RI-DS by pruning the search space better; this further improves the empirical running times compared to the already highly tuned RI-DS. Comment: 18 pages, 12 figures, to appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017)
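
    The parallelization pattern the paper describes, depth-first matching in which subtrees of the search can be handed to idle threads, can be sketched as follows. This is a drastically simplified illustration in C++ with OpenMP tasks (names and structure are assumptions made here); it omits RI's careful pattern-vertex ordering and the RI-DS pruning, and it spawns one task per candidate image of the first pattern vertex rather than using a dedicated work-stealing scheduler.

        // Simplified parallel subgraph-embedding counter (illustrative only).
        #include <atomic>
        #include <vector>

        using Graph = std::vector<std::vector<int>>;  // adjacency lists

        static bool adjacent(const Graph& g, int u, int v) {
          for (int w : g[u]) if (w == v) return true;
          return false;
        }

        // Extend a partial injective mapping pattern[0..depth) -> target vertices
        // by depth-first backtracking, counting completed embeddings.
        static void extend(const Graph& pattern, const Graph& target,
                           std::vector<int>& map, int depth, std::atomic<long>& count) {
          if (depth == static_cast<int>(pattern.size())) { ++count; return; }
          for (int v = 0; v < static_cast<int>(target.size()); ++v) {
            bool ok = true;
            for (int u = 0; u < depth && ok; ++u) {
              if (map[u] == v) ok = false;                       // injectivity
              else if (adjacent(pattern, u, depth) && !adjacent(target, map[u], v))
                ok = false;                                      // edge consistency
            }
            if (!ok) continue;
            map[depth] = v;                                      // extend
            extend(pattern, target, map, depth + 1, count);
            map[depth] = -1;                                     // backtrack
          }
        }

        long count_embeddings(const Graph& pattern, const Graph& target) {
          std::atomic<long> count{0};
        #pragma omp parallel
        #pragma omp single
          {
            // One task per candidate image of pattern vertex 0; the OpenMP runtime
            // hands pending tasks to idle threads, approximating work stealing.
            for (int v = 0; v < static_cast<int>(target.size()); ++v) {
        #pragma omp task firstprivate(v)
              {
                std::vector<int> map(pattern.size(), -1);
                map[0] = v;
                extend(pattern, target, map, 1, count);
              }
            }
          }
          return count.load();
        }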

    Prospects and limitations of full-text index structures in genome analysis

    The combination of incessant advances in sequencing technology, producing large amounts of data, and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of the sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less well understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article provides a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, and also their practical limitations are explained and compared.
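
    As a reference point for what an index structure provides, the sketch below builds the simplest full-text index, a plain (uncompressed) suffix array, and answers pattern-counting queries by binary search. The compressed variants surveyed in the article (FM-index, compressed suffix arrays and trees) offer the same functionality within far smaller memory budgets; the naive O(n^2 log n) construction here is for illustration only and is written in C++ as an assumption of this summary.

        // Plain suffix array: naive construction plus binary-search pattern counting.
        #include <algorithm>
        #include <iostream>
        #include <string>
        #include <vector>

        std::vector<int> build_suffix_array(const std::string& text) {
          std::vector<int> sa(text.size());
          for (std::size_t i = 0; i < sa.size(); ++i) sa[i] = static_cast<int>(i);
          // Sort suffix start positions by lexicographic order of the suffixes.
          std::sort(sa.begin(), sa.end(), [&](int a, int b) {
            return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
          });
          return sa;
        }

        // Count occurrences of 'pattern' in 'text' via binary search on the suffix array.
        int count_occurrences(const std::string& text, const std::vector<int>& sa,
                              const std::string& pattern) {
          auto lo = std::lower_bound(sa.begin(), sa.end(), pattern,
              [&](int suffix, const std::string& p) {
                return text.compare(suffix, p.size(), p) < 0;   // suffix prefix < pattern
              });
          auto hi = std::upper_bound(sa.begin(), sa.end(), pattern,
              [&](const std::string& p, int suffix) {
                return text.compare(suffix, p.size(), p) > 0;   // pattern < suffix prefix
              });
          return static_cast<int>(hi - lo);
        }

        int main() {
          const std::string genome = "ACGTACGTGACG";
          const std::vector<int> sa = build_suffix_array(genome);
          std::cout << count_occurrences(genome, sa, "ACG") << "\n";  // prints 3
        }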

    Best of Two Local Models: Local Centralized and Local Distributed Algorithms

    We consider two models of computation: centralized local algorithms and local distributed algorithms. Algorithms in one model are adapted to the other model to obtain improved algorithms. Distributed vertex coloring is employed to design improved centralized local algorithms for maximal independent set, maximal matching, and an approximation scheme for maximum (weighted) matching over bounded-degree graphs. The improvement is threefold: the algorithms are deterministic, stateless, and the number of probes grows polynomially in log* n, where n is the number of vertices of the input graph. The recursive centralized local improvement technique of Nguyen and Onak [2008] is employed to obtain an improved distributed approximation scheme for maximum (weighted) matching. The improvement is twofold: we reduce the number of rounds from O(log n) to O(log* n) for a wide range of instances, and our algorithms are deterministic rather than randomized.
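
    The coloring-based construction rests on a standard reduction: given a proper vertex coloring, a maximal independent set is obtained by sweeping the color classes in order and adding every vertex none of whose neighbors has been added before. The sketch below is a global, sequential rendering of that rule (written in C++ as an assumption of this summary); the paper's centralized local algorithms instead answer a query about a single vertex by probing only vertices in its vicinity.

        // Maximal independent set from a proper coloring (global, sequential sketch).
        #include <vector>

        using Graph = std::vector<std::vector<int>>;  // adjacency lists

        std::vector<bool> mis_from_coloring(const Graph& g,
                                            const std::vector<int>& color,
                                            int num_colors) {
          std::vector<bool> in_mis(g.size(), false);
          for (int c = 0; c < num_colors; ++c) {             // sweep color classes in order
            for (int v = 0; v < static_cast<int>(g.size()); ++v) {
              if (color[v] != c) continue;
              bool blocked = false;                          // any neighbor already chosen?
              for (int u : g[v])
                if (in_mis[u]) { blocked = true; break; }
              if (!blocked) in_mis[v] = true;                // v joins the independent set
            }
          }
          return in_mis;
        }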