8,417 research outputs found

    Implicit Decomposition for Write-Efficient Connectivity Algorithms

    Full text link
    The future of main memory appears to lie in the direction of new technologies that provide strong capacity-to-performance ratios, but have write operations that are much more expensive than reads in terms of latency, bandwidth, and energy. Motivated by this trend, we propose sequential and parallel algorithms to solve graph connectivity problems using significantly fewer writes than conventional algorithms. Our primary algorithmic tool is the construction of an o(n)o(n)-sized "implicit decomposition" of a bounded-degree graph GG on nn nodes, which combined with read-only access to GG enables fast answers to connectivity and biconnectivity queries on GG. The construction breaks the linear-write "barrier", resulting in costs that are asymptotically lower than conventional algorithms while adding only a modest cost to querying time. For general non-sparse graphs on mm edges, we also provide the first o(m)o(m) writes and O(m)O(m) operations parallel algorithms for connectivity and biconnectivity. These algorithms provide insight into how applications can efficiently process computations on large graphs in systems with read-write asymmetry

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    Full text link
    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    Requirements for implementing real-time control functional modules on a hierarchical parallel pipelined system

    Get PDF
    Analysis of a robot control system leads to a broad range of processing requirements. One fundamental requirement of a robot control system is the necessity of a microcomputer system in order to provide sufficient processing capability.The use of multiple processors in a parallel architecture is beneficial for a number of reasons, including better cost performance, modular growth, increased reliability through replication, and flexibility for testing alternate control strategies via different partitioning. A survey of the progression from low level control synchronizing primitives to higher level communication tools is presented. The system communication and control mechanisms of existing robot control systems are compared to the hierarchical control model. The impact of this design methodology on the current robot control systems is explored

    Mapping constrained optimization problems to quantum annealing with application to fault diagnosis

    Get PDF
    Current quantum annealing (QA) hardware suffers from practical limitations such as finite temperature, sparse connectivity, small qubit numbers, and control error. We propose new algorithms for mapping boolean constraint satisfaction problems (CSPs) onto QA hardware mitigating these limitations. In particular we develop a new embedding algorithm for mapping a CSP onto a hardware Ising model with a fixed sparse set of interactions, and propose two new decomposition algorithms for solving problems too large to map directly into hardware. The mapping technique is locally-structured, as hardware compatible Ising models are generated for each problem constraint, and variables appearing in different constraints are chained together using ferromagnetic couplings. In contrast, global embedding techniques generate a hardware independent Ising model for all the constraints, and then use a minor-embedding algorithm to generate a hardware compatible Ising model. We give an example of a class of CSPs for which the scaling performance of D-Wave's QA hardware using the local mapping technique is significantly better than global embedding. We validate the approach by applying D-Wave's hardware to circuit-based fault-diagnosis. For circuits that embed directly, we find that the hardware is typically able to find all solutions from a min-fault diagnosis set of size N using 1000N samples, using an annealing rate that is 25 times faster than a leading SAT-based sampling method. Further, we apply decomposition algorithms to find min-cardinality faults for circuits that are up to 5 times larger than can be solved directly on current hardware.Comment: 22 pages, 4 figure

    Distance labeling schemes for trees

    Get PDF
    We consider distance labeling schemes for trees: given a tree with nn nodes, label the nodes with binary strings such that, given the labels of any two nodes, one can determine, by looking only at the labels, the distance in the tree between the two nodes. A lower bound by Gavoille et. al. (J. Alg. 2004) and an upper bound by Peleg (J. Graph Theory 2000) establish that labels must use Θ(log2n)\Theta(\log^2 n) bits\footnote{Throughout this paper we use log\log for log2\log_2.}. Gavoille et. al. (ESA 2001) show that for very small approximate stretch, labels use Θ(lognloglogn)\Theta(\log n \log \log n) bits. Several other papers investigate various variants such as, for example, small distances in trees (Alstrup et. al., SODA'03). We improve the known upper and lower bounds of exact distance labeling by showing that 14log2n\frac{1}{4} \log^2 n bits are needed and that 12log2n\frac{1}{2} \log^2 n bits are sufficient. We also give (1+ϵ1+\epsilon)-stretch labeling schemes using Θ(logn)\Theta(\log n) bits for constant ϵ>0\epsilon>0. (1+ϵ1+\epsilon)-stretch labeling schemes with polylogarithmic label size have previously been established for doubling dimension graphs by Talwar (STOC 2004). In addition, we present matching upper and lower bounds for distance labeling for caterpillars, showing that labels must have size 2lognΘ(loglogn)2\log n - \Theta(\log\log n). For simple paths with kk nodes and edge weights in [1,n][1,n], we show that labels must have size k1klogn+Θ(logk)\frac{k-1}{k}\log n+\Theta(\log k)

    Single Source - All Sinks Max Flows in Planar Digraphs

    Full text link
    Let G = (V,E) be a planar n-vertex digraph. Consider the problem of computing max st-flow values in G from a fixed source s to all sinks t in V\{s}. We show how to solve this problem in near-linear O(n log^3 n) time. Previously, no better solution was known than running a single-source single-sink max flow algorithm n-1 times, giving a total time bound of O(n^2 log n) with the algorithm of Borradaile and Klein. An important implication is that all-pairs max st-flow values in G can be computed in near-quadratic time. This is close to optimal as the output size is Theta(n^2). We give a quadratic lower bound on the number of distinct max flow values and an Omega(n^3) lower bound for the total size of all min cut-sets. This distinguishes the problem from the undirected case where the number of distinct max flow values is O(n). Previous to our result, no algorithm which could solve the all-pairs max flow values problem faster than the time of Theta(n^2) max-flow computations for every planar digraph was known. This result is accompanied with a data structure that reports min cut-sets. For fixed s and all t, after O(n^{3/2} log^{3/2} n) preprocessing time, it can report the set of arcs C crossing a min st-cut in time roughly proportional to the size of C.Comment: 25 pages, 4 figures; extended abstract appeared in FOCS 201

    Computing with Tangles

    Full text link
    Tangles of graphs have been introduced by Robertson and Seymour in the context of their graph minor theory. Tangles may be viewed as describing "k-connected components" of a graph (though in a twisted way). They play an important role in graph minor theory. An interesting aspect of tangles is that they cannot only be defined for graphs, but more generally for arbitrary connectivity functions (that is, integer-valued submodular and symmetric set functions). However, tangles are difficult to deal with algorithmically. To start with, it is unclear how to represent them, because they are families of separations and as such may be exponentially large. Our first contribution is a data structure for representing and accessing all tangles of a graph up to some fixed order. Using this data structure, we can prove an algorithmic version of a very general structure theorem due to Carmesin, Diestel, Harman and Hundertmark (for graphs) and Hundertmark (for arbitrary connectivity functions) that yields a canonical tree decomposition whose parts correspond to the maximal tangles. (This may be viewed as a generalisation of the decomposition of a graph into its 3-connected components.
    corecore