1,009 research outputs found

    Algorithms in the Ultra-Wide Word Model

    Full text link
    The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel architectures. In this paper we consider an alternative view of parallelism in the form of an ultra-wide word processor. We introduce the Ultra-Wide Word architecture and model, an extension of the word-RAM model that allows for constant time operations on thousands of bits in parallel. Word parallelism as exploited by the word-RAM model does not suffer from the more difficult aspects of parallel programming, namely synchronization and concurrency. For the standard word-RAM algorithms, the speedups obtained are moderate, as they are limited by the word size. We argue that a large class of word-RAM algorithms can be implemented in the Ultra-Wide Word model, obtaining speedups comparable to multi-threaded computations while keeping the simplicity of programming of the sequential RAM model. We show that this is the case by describing implementations of Ultra-Wide Word algorithms for dynamic programming and string searching. In addition, we show that the Ultra-Wide Word model can be used to implement a nonstandard memory architecture, which enables the sidestepping of lower bounds of important data structure problems such as priority queues and dynamic prefix sums. While similar ideas about operating on large words have been mentioned before in the context of multimedia processors [Thorup 2003], it is only recently that an architecture like the one we propose has become feasible and that details can be worked out.Comment: 28 pages, 5 figures; minor change

    Analysis of Quickselect under Yaroslavskiy's Dual-Pivoting Algorithm

    Full text link
    There is excitement within the algorithms community about a new partitioning method introduced by Yaroslavskiy. This algorithm renders Quicksort slightly faster than the case when it runs under classic partitioning methods. We show that this improved performance in Quicksort is not sustained in Quickselect; a variant of Quicksort for finding order statistics. We investigate the number of comparisons made by Quickselect to find a key with a randomly selected rank under Yaroslavskiy's algorithm. This grand averaging is a smoothing operator over all individual distributions for specific fixed order statistics. We give the exact grand average. The grand distribution of the number of comparison (when suitably scaled) is given as the fixed-point solution of a distributional equation of a contraction in the Zolotarev metric space. Our investigation shows that Quickselect under older partitioning methods slightly outperforms Quickselect under Yaroslavskiy's algorithm, for an order statistic of a random rank. Similar results are obtained for extremal order statistics, where again we find the exact average, and the distribution for the number of comparisons (when suitably scaled). Both limiting distributions are of perpetuities (a sum of products of independent mixed continuous random variables).Comment: full version with appendices; otherwise identical to Algorithmica versio

    Selection from Heaps, Row-Sorted Matrices, and X+Y Using Soft Heaps

    Get PDF
    We use soft heaps to obtain simpler optimal algorithms for selecting the k-th smallest item, and the set of k smallest items, from a heap-ordered tree, from a collection of sorted lists, and from X+Y, where X and Y are two unsorted sets. Our results match, and in some ways extend and improve, classical results of Frederickson (1993) and Frederickson and Johnson (1982). In particular, for selecting the k-th smallest item, or the set of k smallest items, from a collection of m sorted lists we obtain a new optimal "output-sensitive" algorithm that performs only O(m + sum_{i=1}^m log(k_i+1)) comparisons, where k_i is the number of items of the i-th list that belong to the overall set of k smallest items

    Improved Bounds for 3SUM, kk-SUM, and Linear Degeneracy

    Get PDF
    Given a set of nn real numbers, the 3SUM problem is to decide whether there are three of them that sum to zero. Until a recent breakthrough by Gr{\o}nlund and Pettie [FOCS'14], a simple Θ(n2)\Theta(n^2)-time deterministic algorithm for this problem was conjectured to be optimal. Over the years many algorithmic problems have been shown to be reducible from the 3SUM problem or its variants, including the more generalized forms of the problem, such as kk-SUM and kk-variate linear degeneracy testing (kk-LDT). The conjectured hardness of these problems have become extremely popular for basing conditional lower bounds for numerous algorithmic problems in P. In this paper, we show that the randomized 44-linear decision tree complexity of 3SUM is O(n3/2)O(n^{3/2}), and that the randomized (2k2)(2k-2)-linear decision tree complexity of kk-SUM and kk-LDT is O(nk/2)O(n^{k/2}), for any odd k3k\ge 3. These bounds improve (albeit randomized) the corresponding O(n3/2logn)O(n^{3/2}\sqrt{\log n}) and O(nk/2logn)O(n^{k/2}\sqrt{\log n}) decision tree bounds obtained by Gr{\o}nlund and Pettie. Our technique includes a specialized randomized variant of fractional cascading data structure. Additionally, we give another deterministic algorithm for 3SUM that runs in O(n2loglogn/logn)O(n^2 \log\log n / \log n ) time. The latter bound matches a recent independent bound by Freund [Algorithmica 2017], but our algorithm is somewhat simpler, due to a better use of word-RAM model
    corecore