Algorithms in the Ultra-Wide Word Model
The effective use of parallel computing resources to speed up algorithms in
current multi-core parallel architectures remains a difficult challenge, with
ease of programming playing a key role in the eventual success of various
parallel architectures. In this paper we consider an alternative view of
parallelism in the form of an ultra-wide word processor. We introduce the
Ultra-Wide Word architecture and model, an extension of the word-RAM model that
allows for constant time operations on thousands of bits in parallel. Word
parallelism as exploited by the word-RAM model does not suffer from the more
difficult aspects of parallel programming, namely synchronization and
concurrency. For the standard word-RAM algorithms, the speedups obtained are
moderate, as they are limited by the word size. We argue that a large class of
word-RAM algorithms can be implemented in the Ultra-Wide Word model, obtaining
speedups comparable to multi-threaded computations while keeping the simplicity
of programming of the sequential RAM model. We show that this is the case by
describing implementations of Ultra-Wide Word algorithms for dynamic
programming and string searching. In addition, we show that the Ultra-Wide Word
model can be used to implement a nonstandard memory architecture, which enables
the sidestepping of lower bounds of important data structure problems such as
priority queues and dynamic prefix sums. While similar ideas about operating on
large words have been mentioned before in the context of multimedia processors
[Thorup 2003], it is only recently that an architecture like the one we propose
has become feasible and that details can be worked out.
Comment: 28 pages, 5 figures; minor change
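The word-parallelism idea in this abstract can be illustrated with a classic bit-parallel string search. The following is a minimal sketch of the well-known Shift-And technique, not the paper's Ultra-Wide Word algorithm; it assumes that a Python integer may stand in for one very wide machine word, and the function name `shift_and_search` is hypothetical.

```python
def shift_and_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Bit i of `state` is set when pattern[0..i] matches the text ending
    at the current position. A Python int plays the role of one wide
    word (an assumption made for illustration only).
    """
    m = len(pattern)
    if m == 0:
        return []
    # masks[c] has bit i set exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0
    hits = []
    for j, ch in enumerate(text):
        # One shift, one OR, and one AND update all m prefixes at once.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits
```

Each text character costs a constant number of word operations as long as the pattern fits in one word, which is why a wider word directly extends the pattern lengths that can be handled this way.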
Analysis of Quickselect under Yaroslavskiy's Dual-Pivoting Algorithm
There is excitement within the algorithms community about a new partitioning
method introduced by Yaroslavskiy. This algorithm renders Quicksort slightly
faster than it is under classic partitioning methods. We show
that this improved performance in Quicksort is not sustained in Quickselect, a
variant of Quicksort for finding order statistics. We investigate the number of
comparisons made by Quickselect to find a key with a randomly selected rank
under Yaroslavskiy's algorithm. This grand averaging is a smoothing operator
over all individual distributions for specific fixed order statistics. We give
the exact grand average. The grand distribution of the number of comparisons
(when suitably scaled) is given as the fixed-point solution of a distributional
equation of a contraction in the Zolotarev metric space. Our investigation
shows that Quickselect under older partitioning methods slightly outperforms
Quickselect under Yaroslavskiy's algorithm, for an order statistic of a random
rank. Similar results are obtained for extremal order statistics, where again
we find the exact average, and the distribution for the number of comparisons
(when suitably scaled). Both limiting distributions are of perpetuities (a sum
of products of independent mixed continuous random variables).
Comment: full version with appendices; otherwise identical to Algorithmica
versio
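To make the object of study concrete, here is a minimal sketch of Quickselect driven by a dual-pivot partition in the style attributed to Yaroslavskiy. It is an illustration under stated assumptions, not the exact procedure analysed in the paper: the pivots are simply the first and last elements of a shuffled copy, and the names `dual_pivot_partition` and `quickselect` are hypothetical.

```python
import random


def dual_pivot_partition(a: list, lo: int, hi: int) -> tuple[int, int]:
    """Partition a[lo..hi] around two pivots p <= q.

    Returns (l, g) with a[l] == p and a[g] == q, everything left of l
    smaller than p, and everything right of g at least q.
    """
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    p, q = a[lo], a[hi]
    l, g, k = lo + 1, hi - 1, lo + 1
    while k <= g:
        if a[k] < p:
            a[k], a[l] = a[l], a[k]
            l += 1
        elif a[k] >= q:
            while a[g] > q and k < g:
                g -= 1
            a[k], a[g] = a[g], a[k]
            g -= 1
            if a[k] < p:
                a[k], a[l] = a[l], a[k]
                l += 1
        k += 1
    l -= 1
    g += 1
    a[lo], a[l] = a[l], a[lo]  # move p to its final position
    a[hi], a[g] = a[g], a[hi]  # move q to its final position
    return l, g


def quickselect(items, rank: int):
    """Return the element of 0-based rank `rank` in sorted order."""
    a = list(items)
    if not 0 <= rank < len(a):
        raise IndexError("rank out of range")
    random.shuffle(a)  # assumption: randomise so the end pivots are random
    lo, hi = 0, len(a) - 1
    while lo < hi:
        l, g = dual_pivot_partition(a, lo, hi)
        if rank < l:
            hi = l - 1
        elif rank == l:
            return a[l]
        elif rank < g:
            lo, hi = l + 1, g - 1
        elif rank == g:
            return a[g]
        else:
            lo = g + 1
    return a[lo]
```

After each partition only one of the three segments can contain the requested rank, so the search continues in that segment alone.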
Selection from Heaps, Row-Sorted Matrices, and X+Y Using Soft Heaps
We use soft heaps to obtain simpler optimal algorithms for selecting the k-th smallest item, and the set of k smallest items, from a heap-ordered tree, from a collection of sorted lists, and from X+Y, where X and Y are two unsorted sets. Our results match, and in some ways extend and improve, classical results of Frederickson (1993) and Frederickson and Johnson (1982). In particular, for selecting the k-th smallest item, or the set of k smallest items, from a collection of m sorted lists we obtain a new optimal "output-sensitive" algorithm that performs only O(m + sum_{i=1}^m log(k_i+1)) comparisons, where k_i is the number of items of the i-th list that belong to the overall set of k smallest items
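For comparison, the textbook way to select the k smallest items from m sorted lists uses an ordinary binary heap. The sketch below is that baseline, not the soft-heap algorithm of the paper; it relies on Python's standard `heapq` module, takes O(m + k log m) time, and the function name `k_smallest_from_sorted_lists` is hypothetical.

```python
import heapq


def k_smallest_from_sorted_lists(lists, k: int) -> list:
    """Return the k smallest items over all lists, in ascending order.

    Each input list is assumed to be sorted ascending. The heap holds
    at most one candidate (value, list index, position) per list.
    """
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)  # O(m) to build the initial heap
    out = []
    while heap and len(out) < k:
        value, i, j = heapq.heappop(heap)
        out.append(value)
        if j + 1 < len(lists[i]):
            # Replace the popped item with its successor in the same list.
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return out
```

This baseline pays a log m factor for every extracted item, whereas the bound stated in the abstract depends on how many selected items each list contributes.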
Improved Bounds for 3SUM, k-SUM, and Linear Degeneracy
Given a set of n real numbers, the 3SUM problem is to decide whether there
are three of them that sum to zero. Until a recent breakthrough by Grønlund
and Pettie [FOCS'14], a simple Θ(n^2)-time deterministic algorithm for
this problem was conjectured to be optimal. Over the years many algorithmic
problems have been shown to be reducible from the 3SUM problem or its variants,
including the more generalized forms of the problem, such as k-SUM and
k-variate linear degeneracy testing (k-LDT). The conjectured hardness of
these problems has become extremely popular for basing conditional lower
bounds for numerous algorithmic problems in P.
In this paper, we show that the randomized 4-linear decision tree
complexity of 3SUM is O(n^{3/2}), and that the randomized (2k-2)-linear
decision tree complexity of k-SUM and k-LDT is O(n^{k/2}), for any odd
k >= 3. These bounds improve (albeit randomized) the corresponding
O(n^{3/2} sqrt(log n)) and O(n^{k/2} sqrt(log n)) decision tree bounds
obtained by Grønlund and Pettie. Our technique includes a specialized
randomized variant of the fractional cascading data structure. Additionally, we
give another deterministic algorithm for 3SUM that runs in
O(n^2 log log n / log n) time. The latter bound matches a recent independent bound by Freund
[Algorithmica 2017], but our algorithm is somewhat simpler, due to a better use
of the word-RAM model
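The simple quadratic-time algorithm that the abstract refers to as the long-conjectured optimum is usually presented as sorting followed by a two-pointer scan. The sketch below shows that textbook baseline only, not the improved algorithms of the paper; it assumes exactly comparable inputs such as integers, and the function name `has_three_sum` is hypothetical.

```python
def has_three_sum(nums) -> bool:
    """Decide whether three entries (at distinct positions) sum to zero.

    Sorting costs O(n log n); each outer iteration runs a linear
    two-pointer scan, for quadratic time overall.
    """
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1  # sum too small: move to a larger middle element
            else:
                hi -= 1  # sum too large: move to a smaller top element
    return False
```

With floating-point inputs the equality test `s == 0` is subject to rounding, so exact types (integers or fractions) are the safer choice for experiments.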