87 research outputs found

    Work-Efficient Query Evaluation with PRAMs

    The paper studies query evaluation in parallel constant time in the PRAM model. While it is well-known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW-PRAM, this paper is interested in the efficiency of evaluation algorithms, that is, in the number of processors or, asymptotically equivalently, in the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in presentations of the result sets that can be extremely scattered in memory. The paper first discusses some obstacles for constant time PRAM query evaluation. It presents algorithms for relational operators that are considerably more efficient than the naive approaches. Further, it explores three settings in which efficient sequential query evaluation algorithms exist: acyclic queries, semi-join algebra queries, and join queries, the latter in the worst-case optimal framework. Under natural assumptions on the representation of the database, the work of the given algorithms matches the best sequential algorithms in the case of semi-join queries, and it comes close in the other two settings. An important tool is the compaction technique from Hagerup (1992).

    Shared memory with hidden latency on a family of mesh-like networks

    Deterministic Computations on a PRAM with Static Processor and Memory Faults.

    We consider a Parallel Random Access Machine (PRAM) in which some processors and memory cells are faulty. The faults considered are static, i.e., once the machine starts to operate, the operational/faulty status of PRAM components does not change. We develop a deterministic simulation of a fully operational PRAM on a similar faulty machine which has constant fractions of faults among processors and memory cells. The simulating PRAM has $n$ processors and $m$ memory cells, and simulates a PRAM with $n$ processors and a constant fraction of $m$ memory cells. The simulation is in two phases: it starts with preprocessing, which is followed by the simulation proper performed in a step-by-step fashion. Preprocessing is performed in time $O((\frac{m}{n} + \log n)\log n)$. The slowdown of the step-by-step part of the simulation is $O(\log m)$.

    Efficient Circuit Simulation in MapReduce

    The MapReduce framework has firmly established itself as one of the most widely used parallel computing platforms for processing big data on tera- and peta-byte scale. Approaching it from a theoretical standpoint has proved to be notoriously difficult, however. In continuation of Goodrich et al.'s early efforts, explicitly espousing the goal of putting the MapReduce framework on footing equal to that of long-established models such as the PRAM, we investigate the obvious complexity question of how the computational power of MapReduce algorithms compares to that of combinational Boolean circuits commonly used for parallel computations. Relying on the standard MapReduce model introduced by Karloff et al. a decade ago, we develop an intricate simulation technique to show that any problem in NC (i.e., a problem solved by a logspace-uniform family of Boolean circuits of polynomial size and a depth polylogarithmic in the input size) can be solved by a MapReduce computation in O(T(n)/log n) rounds, where n is the input size and T(n) is the depth of the witnessing circuit family. Thus, we are able to closely relate the standard, uniform NC hierarchy modeling parallel computations to the deterministic MapReduce hierarchy DMRC by proving that NC^{i+1} ⊆ DMRC^i for all i ∈ ℕ. Besides the theoretical significance, this result has important applied aspects as well. In particular, we show for all problems in NC^1 (many practically relevant ones, such as integer multiplication and division and the parity function, being among these) how to solve them in a constant number of deterministic MapReduce rounds.

    Algorithmic Motion Planning and Related Geometric Problems on Parallel Machines (Dissertation Proposal)

    The problem of algorithmic motion planning is one that has received considerable attention in recent years. The automatic planning of motion for a mobile object moving amongst obstacles is a fundamentally important problem with numerous applications in computer graphics and robotics. Numerous approximate techniques (AI-based, heuristics-based, potential field methods, for example) for motion planning have long been in existence, and have resulted in the design of experimental systems that work reasonably well under various special conditions [7, 29, 30]. Our interest in this problem, however, is in the use of algorithmic techniques for motion planning, with provable worst case performance guarantees. The study of algorithmic motion planning has been spurred by recent research that has established the mathematical depth of motion planning. Classical geometry, algebra, algebraic geometry and combinatorics are some of the fields of mathematics that have been used to prove various results that have provided better insight into the issues involved in motion planning [49]. In particular, the design and analysis of geometric algorithms has proved to be very useful for numerous important special cases. In the remainder of this proposal we will substitute the more precise term of algorithmic motion planning by just motion planning

    Fast integer merging on the EREW PRAM

    We investigate the complexity of merging sequences of small integers on the EREW PRAM. Our most surprising result is that two sorted sequences of $n$ bits each can be merged in $O(\log\log n)$ time. More generally, we describe an algorithm to merge two sorted sequences of $n$ integers drawn from the set $\{0,\ldots,m-1\}$ in $O(\log\log n+\log m)$ time using an optimal number of processors. No sublogarithmic merging algorithm for this model of computation was previously known. The algorithm not only produces the merged sequence, but also computes the rank of each input element in the merged sequence. On the other hand, we show a lower bound of $\Omega(\log\min\{n,m\})$ on the time needed to merge two sorted sequences of length $n$ each with elements in the set $\{0,\ldots,m-1\}$, implying that our merging algorithm is as fast as possible for $m=(\log n)^{\Omega(1)}$. If we impose an additional stability condition requiring the ranks of each input sequence to form an increasing sequence, then the time complexity of the problem becomes $\Theta(\log n)$, even for $m=2$. Stable merging is thus harder than nonstable merging.
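
To make the notions of rank and stability in the abstract above concrete, here is a minimal sequential sketch of the textbook merge-by-ranking idea. It is not the paper's sublogarithmic EREW algorithm; the function names `stable_ranks` and `merge_by_ranks` are hypothetical. Each rank is computed independently of the others, which is what makes the formulation natural for a parallel machine.

```python
from bisect import bisect_left, bisect_right

def stable_ranks(a, b):
    """Rank (0-based position in the merged output) of every element of a and b.

    Ties are broken in favour of a, so the ranks within each input form a
    strictly increasing sequence: the stability condition in the abstract.
    """
    # a[i] is preceded by its i predecessors in a and by all b[j] < a[i].
    rank_a = [i + bisect_left(b, x) for i, x in enumerate(a)]
    # b[j] is preceded by its j predecessors in b and by all a[i] <= b[j].
    rank_b = [j + bisect_right(a, y) for j, y in enumerate(b)]
    return rank_a, rank_b

def merge_by_ranks(a, b):
    """Scatter each element to its rank; no comparisons between outputs needed."""
    rank_a, rank_b = stable_ranks(a, b)
    merged = [None] * (len(a) + len(b))
    for x, r in zip(a, rank_a):
        merged[r] = x
    for y, r in zip(b, rank_b):
        merged[r] = y
    return merged
```

For bit sequences such as `a = [0, 0, 1, 1]` and `b = [0, 1, 1]`, a nonstable merge only has to produce the right number of zeros and ones, whereas the stable version must also decide where each individual element lands.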

    Fast Computation of Small Cuts via Cycle Space Sampling

    We describe a new sampling-based method to determine cuts in an undirected graph. For a graph (V, E), its cycle space is the family of all subsets of E that have even degree at each vertex. We prove that with high probability, sampling the cycle space identifies the cuts of a graph. This leads to simple new linear-time sequential algorithms for finding all cut edges and cut pairs (a set of 2 edges that form a cut) of a graph. In the model of distributed computing in a graph G=(V, E) with O(log V)-bit messages, our approach yields faster algorithms for several problems. The diameter of G is denoted by Diam, and the maximum degree by Delta. We obtain simple O(Diam)-time distributed algorithms to find all cut edges, 2-edge-connected components, and cut pairs, matching or improving upon previous time bounds. Under natural conditions these new algorithms are universally optimal, i.e., an Omega(Diam)-time lower bound holds on every graph. We obtain an O(Diam+Delta/log V)-time distributed algorithm for finding cut vertices; this is faster than the best previous algorithm when Delta, Diam = O(sqrt(V)). A simple extension of our work yields the first distributed algorithm with sub-linear time for 3-edge-connected components. The basic distributed algorithms are Monte Carlo, but they can be made Las Vegas without increasing the asymptotic complexity. In the model of parallel computing on the EREW PRAM our approach yields a simple algorithm with optimal time complexity O(log V) for finding cut pairs and 3-edge-connected components. Comment: Previous version appeared in Proc. 35th ICALP, pages 145--160, 200
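
The sampling idea in the abstract above can be sketched sequentially as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is assumed connected with vertices `0..n-1`, and the names `sample_circulation`, `cut_edges`, and `cut_pairs` are hypothetical. Each non-tree edge of a spanning tree gets a random bit string, and each tree edge gets the XOR of the strings of all non-tree edges whose fundamental cycle passes through it, so every bit position is a uniformly random element of the cycle space.

```python
import random
from collections import defaultdict
from itertools import combinations

def sample_circulation(n, edges, bits, rng):
    """Label every edge with `bits` independent random cycle-space samples."""
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    # Spanning tree rooted at vertex 0 (assumes the graph is connected).
    parent, parent_edge, order, stack = {0: None}, {0: None}, [], [0]
    while stack:
        u = stack.pop()
        order.append(u)  # parents always appear before their children
        for v, idx in adj[u]:
            if v not in parent:
                parent[v], parent_edge[v] = u, idx
                stack.append(v)
    tree = {idx for idx in parent_edge.values() if idx is not None}
    label, acc = [0] * len(edges), [0] * n
    for idx, (u, v) in enumerate(edges):
        if idx not in tree:
            label[idx] = rng.getrandbits(bits)
            acc[u] ^= label[idx]
            acc[v] ^= label[idx]
    # Bottom-up: XOR over the subtree of v cancels non-tree edges lying
    # entirely inside it, leaving exactly those that cross the tree edge.
    for v in reversed(order):
        if parent[v] is not None:
            label[parent_edge[v]] = acc[v]
            acc[parent[v]] ^= acc[v]
    return label, tree

def cut_edges(n, edges, bits=64, seed=None):
    """Tree edges labelled 0; a non-bridge is misreported with prob. 2**-bits."""
    label, tree = sample_circulation(n, edges, bits, random.Random(seed))
    return sorted(i for i in tree if label[i] == 0)

def cut_pairs(n, edges, bits=64, seed=None):
    """Pairs of edges with equal nonzero labels (correct with high probability)."""
    label, _ = sample_circulation(n, edges, bits, random.Random(seed))
    groups = defaultdict(list)
    for idx, lab in enumerate(label):
        if lab != 0:
            groups[lab].append(idx)
    return sorted(p for g in groups.values() for p in combinations(g, 2))
```

Both functions return edge indices into `edges`. The error is one-sided for cut edges: a true cut edge lies on no cycle and therefore always receives the label 0, while raising `bits` shrinks the chance that some other edge does too.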