    Fast integer merging on the EREW PRAM

    We investigate the complexity of merging sequences of small integers on the EREW PRAM. Our most surprising result is that two sorted sequences of $n$ bits each can be merged in $O(\log\log n)$ time. More generally, we describe an algorithm to merge two sorted sequences of $n$ integers drawn from the set $\{0,\ldots,m-1\}$ in $O(\log\log n+\log m)$ time using an optimal number of processors. No sublogarithmic merging algorithm for this model of computation was previously known. The algorithm not only produces the merged sequence, but also computes the rank of each input element in the merged sequence. On the other hand, we show a lower bound of $\Omega(\log\min\{n,m\})$ on the time needed to merge two sorted sequences of length $n$ each with elements in the set $\{0,\ldots,m-1\}$, implying that our merging algorithm is as fast as possible for $m=(\log n)^{\Omega(1)}$. If we impose an additional stability condition requiring the ranks of each input sequence to form an increasing sequence, then the time complexity of the problem becomes $\Theta(\log n)$, even for $m=2$. Stable merging is thus harder than nonstable merging.
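
The notion of "rank in the merged sequence" used in this abstract can be illustrated with a minimal sequential sketch. This is not the paper's EREW PRAM algorithm; it is the textbook cross-ranking idea, in which each element's rank is found independently by a binary search in the other sequence. The function names `merge_ranks` and `merge_by_rank` are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

def merge_ranks(a, b):
    """Return the output position of every element of sorted lists a and b.

    Ties are broken in favour of a (a stable merge): an element of a counts
    only the strictly smaller elements of b, while an element of b counts
    the elements of a that are smaller than or equal to it.
    """
    rank_a = [i + bisect_left(b, x) for i, x in enumerate(a)]
    rank_b = [j + bisect_right(a, y) for j, y in enumerate(b)]
    return rank_a, rank_b

def merge_by_rank(a, b):
    """Build the merged list by writing each element to its rank."""
    rank_a, rank_b = merge_ranks(a, b)
    out = [None] * (len(a) + len(b))
    for x, r in zip(a, rank_a):
        out[r] = x
    for y, r in zip(b, rank_b):
        out[r] = y
    return out

if __name__ == "__main__":
    a, b = [0, 0, 1, 1], [0, 1, 1]
    print(merge_ranks(a, b))
    print(merge_by_rank(a, b))
```

Because every rank depends only on one element and the other sequence, the list comprehensions could in principle be evaluated by one processor per element; the binary searches are what make this simple approach logarithmic rather than doubly logarithmic.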

    Optimal parallel string algorithms: sorting, merging and computing the minimum

    We study fundamental comparison problems on strings of characters, equipped with the usual lexicographical ordering. For each problem studied, we give a parallel algorithm that is optimal with respect to at least one criterion for which no optimal algorithm was previously known. Specifically, our main results are: (1) Two sorted sequences of strings, containing altogether $n$ characters, can be merged in $O(\log n)$ time using $O(n)$ operations on an EREW PRAM. This is optimal as regards both the running time and the number of operations. (2) A sequence of strings, containing altogether $n$ characters represented by integers of size polynomial in $n$, can be sorted in $O(\log n/\log\log n)$ time using $O(n\log\log n)$ operations on a CRCW PRAM. The running time is optimal for any polynomial number of processors. (3) The minimum string in a sequence of strings containing altogether $n$ characters can be found using (expected) $O(n)$ operations in constant expected time on a randomized CRCW PRAM, in $O(\log\log n)$ time on a deterministic CRCW PRAM with a program depending on $n$, in $O((\log\log n)^3)$ time on a deterministic CRCW PRAM with a program not depending on $n$, in $O(\log n)$ expected time on a randomized EREW PRAM, and in $O(\log n\log\log n)$ time on a deterministic EREW PRAM. The number of operations is optimal, and the running time is optimal for the randomized algorithms and, if the number of processors is limited to $n$, for the nonuniform deterministic CRCW PRAM algorithm as we

    Improved parallel integer sorting without concurrent writing

    We show that $n$ integers in the range $1..n$ can be stably sorted on an EREW PRAM using $O(t)$ time and $O(n(\sqrt{\log n\log\log n}+(\log n)^2/t))$ operations, for arbitrary given $t\ge\log n\log\log n$, and on a CREW PRAM using $O(t)$ time and $O(n(\sqrt{\log n}+\log n/2^{t/\log n}))$ operations, for arbitrary given $t\ge\log n$. In addition, we are able to sort $n$ arbitrary integers on a randomized CREW PRAM within the same resource bounds with high probability. In each case our algorithm is a factor of almost $\Theta(\sqrt{\log n})$ closer to optimality than all previous algorithms for the stated problem in the stated model, and our third result matches the operation count of the best known sequential algorithm. We also show that $n$ integers in the range $1..m$ can be sorted in $O((\log n)^2)$ time with $O(n)$ operations on an EREW PRAM using a nonstandard word length of $O(\log n\log\log n\log m)$ bits, thereby greatly improving the upper bound on the word length necessary to sort integers with a linear time-processor product, even sequentially. Our algorithms were inspired by, and in one case directly use, the fusion trees of Fredman and Willard.

    Parallel Algorithms for Summing Floating-Point Numbers

    The problem of exactly summing $n$ floating-point numbers is a fundamental problem that has many applications in large-scale simulations and computational geometry. Unfortunately, due to the round-off error in standard floating-point operations, this problem becomes very challenging. Moreover, all existing solutions rely on sequential algorithms which cannot scale to the huge datasets that need to be processed. In this paper, we provide several efficient parallel algorithms for summing $n$ floating-point numbers, so as to produce a faithfully rounded floating-point representation of the sum. We present algorithms in PRAM, external-memory, and MapReduce models, and we also provide an experimental analysis of our MapReduce algorithms, due to their simplicity and practical efficiency. Comment: Conference version appears in SPAA 201
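
The round-off problem this abstract refers to can be seen in a small sketch. The code below is not the paper's PRAM or MapReduce algorithm; it only illustrates the general shape of "exact partial sums, combined exactly, rounded once", using Python's exact rational type as a stand-in for the specialised number representations such algorithms use. The function name `chunked_exact_sum` and the `chunks` parameter are hypothetical.

```python
import math
from fractions import Fraction

def chunked_exact_sum(values, chunks=4):
    """Sum floats with a single final rounding.

    Each chunk is summed exactly as a Fraction (every finite float converts
    to a Fraction without error), the partial sums are added exactly, and
    only the final total is rounded back to a float. The chunks are
    independent, so they could be handled by separate workers.
    """
    partials = [
        sum((Fraction(v) for v in values[k::chunks]), Fraction(0))
        for k in range(chunks)
    ]
    return float(sum(partials, Fraction(0)))

if __name__ == "__main__":
    data = [1e16, 1.0, -1e16, 1.0]
    print(sum(data))                     # naive left-to-right float addition
    print(chunked_exact_sum(data, 2))    # exact total, rounded once
    print(math.fsum(data))               # standard-library accurate sum
```

Rational arithmetic is far slower than the approaches such a paper would study; it is used here only because it makes the exactness of the intermediate sums easy to check. For finite inputs the result should agree with the standard-library `math.fsum`.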

    Some Optimally Adaptive Parallel Graph Algorithms on EREW PRAM Model

    The study of graph algorithms is an important area of research in computer science, since graphs offer useful tools to model many real-world situations. The commercial availability of parallel computers has led to the development of efficient parallel graph algorithms. Using an exclusive-read and exclusive-write (EREW) parallel random access machine (PRAM) as the computation model with a fixed number of processors, we design and analyze parallel algorithms for seven undirected graph problems: connected components, spanning forest, fundamental cycle set, bridges, bipartiteness, assignment problems, and approximate vertex coloring. For all but the last two problems, the input data structure is an unordered list of edges, and divide-and-conquer is the paradigm for designing algorithms. One of the algorithms to solve the assignment problem makes use of an appropriate variant of the dynamic programming strategy. An elegant data structure, called the adjacency list matrix, used in a vertex-coloring algorithm avoids the sequential nature of linked adjacency lists. Each of the proposed algorithms achieves optimal speedup, choosing an optimal granularity (thus exploiting maximum parallelism) which depends on the density or the number of vertices of the given graph. The processor-(time)$^2$ product has been identified as a useful parameter to measure the cost-effectiveness of a parallel algorithm. We derive a lower bound on this measure for each of our algorithms.

    Fast Parallel Operations on Search Trees

    Using (a,b)-trees as an example, we show how to perform a parallel split with logarithmic latency and parallel join, bulk updates, intersection, union (or merge), and (symmetric) set difference with logarithmic latency and with information-theoretically optimal work. We present both asymptotically optimal solutions and simplified versions that perform well in practice; they are several times faster than previous implementations.

    Fast Parallel Algorithms for Basic Problems

    Parallel processing is one of the most active research areas these days. We are interested in one aspect of parallel processing, i.e. the design and analysis of parallel algorithms. Here, we focus on non-numerical parallel algorithms for basic combinatorial problems, such as data structures, selection, searching, merging and sorting. The purposes of studying these types of problems are to obtain basic building blocks which will be useful in solving complex problems, and to develop fundamental algorithmic techniques. In this thesis, we study the following problems: priority queues, multiple search and multiple selection, and reconstruction of a binary tree from its traversals. The research on priority queues was motivated by their various applications. The purpose of studying multiple search and multiple selection is to explore the relationships between four of the most fundamental problems in algorithm design, that is, selection, searching, merging and sorting; our parallel solutions can also be used as subroutines in algorithms for other problems. The research on the last problem, reconstruction of a binary tree from its traversals, was stimulated by a challenge proposed in a recent paper by Berkman et al. ("Highly Parallelizable Problems", STOC 89) to design doubly logarithmic time optimal parallel algorithms, because a remarkably small number of such parallel algorithms exist.

    A Parallel Algorithm for Computing Minimum Spanning Trees

    We present a simple and implementable algorithm that computes a minimum spanning tree of an undirected weighted graph $G = (V, E)$ of $n = |V|$ vertices and $m = |E|$ edges on an EREW PRAM in $O(\log^{3/2} n)$ time using $n+m$ processors. This represents a substantial improvement in the running time over the previous results for this problem using at the same time the weakest of the PRAM models. It also implies the existence of algorithms having the same complexity bounds for the EREW PRAM, for connectivity, ear decomposition, biconnectivity, strong orientation, st-numbering and Euler tours problems.

    Parallel Transitive Closure and Point Location in Planar Structures

    AMS(MOS) subject classifications: 68E05, 68C05, 68C25. Parallel algorithms for several graph and geometric problems are presented, including transitive closure and topological sorting in planar st-graphs, preprocessing planar subdivisions for point location queries, and construction of visibility representations and drawings of planar graphs. Most of these algorithms achieve optimal $O(\log n)$ running time using $n/\log n$ processors in the EREW PRAM model, $n$ being the number of vertices.