40,700 research outputs found

    Algorithms in the Ultra-Wide Word Model

    The effective use of parallel computing resources to speed up algorithms in current multi-core parallel architectures remains a difficult challenge, with ease of programming playing a key role in the eventual success of various parallel architectures. In this paper we consider an alternative view of parallelism in the form of an ultra-wide word processor. We introduce the Ultra-Wide Word architecture and model, an extension of the word-RAM model that allows for constant time operations on thousands of bits in parallel. Word parallelism as exploited by the word-RAM model does not suffer from the more difficult aspects of parallel programming, namely synchronization and concurrency. For the standard word-RAM algorithms, the speedups obtained are moderate, as they are limited by the word size. We argue that a large class of word-RAM algorithms can be implemented in the Ultra-Wide Word model, obtaining speedups comparable to multi-threaded computations while keeping the simplicity of programming of the sequential RAM model. We show that this is the case by describing implementations of Ultra-Wide Word algorithms for dynamic programming and string searching. In addition, we show that the Ultra-Wide Word model can be used to implement a nonstandard memory architecture, which enables the sidestepping of lower bounds of important data structure problems such as priority queues and dynamic prefix sums. While similar ideas about operating on large words have been mentioned before in the context of multimedia processors [Thorup 2003], it is only recently that an architecture like the one we propose has become feasible and that details can be worked out. Comment: 28 pages, 5 figures; minor changes
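    A minimal illustration of the word parallelism this model scales up (a sketch in standard C, not code from the paper): the classic population-count reduction treats one 64-bit word as many independent lanes, counting all $w$ bits in $O(\log w)$ word operations; the Ultra-Wide Word model applies the same style of computation to words of thousands of bits.

        #include <stdint.h>
        #include <stdio.h>

        /* Word-level parallelism on a 64-bit word-RAM: each step below adds
         * neighboring lanes pairwise (2-bit, then 4-bit, then 8-bit lanes),
         * so the full bit count takes O(log w) word operations, not O(w). */
        static unsigned popcount64(uint64_t x) {
            x = x - ((x >> 1) & 0x5555555555555555ULL);                           /* 2-bit lane sums */
            x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL); /* 4-bit lane sums */
            x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                           /* 8-bit lane sums */
            return (unsigned)((x * 0x0101010101010101ULL) >> 56);                 /* add all lanes   */
        }

        int main(void) {
            printf("%u\n", popcount64(0xF0F0F0F0F0F0F0FFULL)); /* prints 36 */
            return 0;
        }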

    Cell-Probe Bounds for Online Edit Distance and Other Pattern Matching Problems

    We give cell-probe bounds for the computation of edit distance, Hamming distance, convolution and longest common subsequence in a stream. In this model, a fixed string of $n$ symbols is given and one $\delta$-bit symbol arrives at a time in a stream. After each symbol arrives, the distance between the fixed string and a suffix of the most recent symbols of the stream is reported. The cell-probe model is perhaps the strongest model of computation for showing data structure lower bounds, subsuming in particular the popular word-RAM model.
    * We first give an $\Omega((\delta \log n)/(w+\log\log n))$ lower bound for the time to give each output for both online Hamming distance and convolution, where $w$ is the word size. This bound relies on a new encoding scheme and for the first time holds even when $w$ is as small as a single bit.
    * We then consider the online edit distance and longest common subsequence problems in the bit-probe model ($w=1$) with a constant sized input alphabet. We give a lower bound of $\Omega(\sqrt{\log n}/(\log\log n)^{3/2})$ which applies for both problems. This second set of results relies both on our new encoding scheme as well as a carefully constructed hard distribution.
    * Finally, for the online edit distance problem we show that there is an $O((\log n)^2/w)$ upper bound in the cell-probe model. This bound gives a contrast to our new lower bound and also establishes an exponential gap between the known cell-probe and RAM model complexities.
    Comment: 32 pages, 4 figures
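    To make the online model concrete, here is a naive reference implementation of online Hamming distance (an illustrative sketch, not the paper's data structure): it spends $O(n)$ comparisons per arriving symbol, the trivial per-output cost against which the cell-probe bounds above should be read.

        #include <stdio.h>
        #include <string.h>

        /* Online Hamming distance, naively: a fixed string f of length n is
         * given up front; stream symbols arrive one at a time; after each
         * arrival we report the Hamming distance between f and the window of
         * the n most recent stream symbols. Cost: O(n) per output. */
        int main(void) {
            const char *f = "abca";           /* fixed string, n = 4 */
            const char *stream = "abcaabcb";  /* symbols arriving one by one */
            size_t n = strlen(f), seen = 0;
            char win[8] = {0};                /* circular window of last n symbols */

            for (size_t t = 0; stream[t]; t++) {
                win[t % n] = stream[t];
                if (++seen < n) continue;     /* window not yet full */
                int d = 0;
                for (size_t i = 0; i < n; i++)          /* oldest symbol first */
                    d += (f[i] != win[(t + 1 + i) % n]);
                printf("t=%zu distance=%d\n", t, d);
            }
            return 0;
        }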

    Energy-Efficient Algorithms

    We initiate the systematic study of the energy complexity of algorithms (in addition to time and space complexity) based on Landauer's Principle in physics, which gives a lower bound on the amount of energy a system must dissipate if it destroys information. We propose energy-aware variations of three standard models of computation: circuit RAM, word RAM, and transdichotomous RAM. On top of these models, we build familiar high-level primitives such as control logic, memory allocation, and garbage collection with zero energy complexity and only constant-factor overheads in space and time complexity, enabling simple expression of energy-efficient algorithms. We analyze several classic algorithms in our models and develop low-energy variations: comparison sort, insertion sort, counting sort, breadth-first search, Bellman-Ford, Floyd-Warshall, matrix all-pairs shortest paths, AVL trees, binary heaps, and dynamic arrays. We explore the time/space/energy trade-off and develop several general techniques for analyzing algorithms and reducing their energy complexity. These results lay a theoretical foundation for a new field of semi-reversible computing and provide a new framework for the investigation of algorithms. Comment: 40 pages, 8 PDF figures, full version of work published in ITCS 2016
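    For scale (numbers not from the abstract, but standard): Landauer's Principle puts the minimum dissipation for erasing one bit at temperature $T$ at $k_B T \ln 2$; with $k_B \approx 1.38 \times 10^{-23}$ J/K and $T = 300$ K this floor is about $2.9 \times 10^{-21}$ J per bit destroyed, which is the cost the energy-aware models above charge for irreversible operations and avoid for reversible ones.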

    Faster Approximate String Matching for Short Patterns

    We study the classical approximate string matching problem, that is, given strings $P$ and $Q$ and an error threshold $k$, find all ending positions of substrings of $Q$ whose edit distance to $P$ is at most $k$. Let $P$ and $Q$ have lengths $m$ and $n$, respectively. On a standard unit-cost word RAM with word size $w \geq \log n$ we present an algorithm using time $O(nk \cdot \min(\frac{\log^2 m}{\log n},\frac{\log^2 m\log w}{w}) + n)$. When $P$ is short, namely, $m = 2^{o(\sqrt{\log n})}$ or $m = 2^{o(\sqrt{w/\log w})}$, this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism. Comment: To appear in Theory of Computing Systems
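    The paper's algorithm rests on tabulation plus word-level parallelism; as a hedged illustration of the latter ingredient only, the sketch below is the classic Wu-Manber bit-parallel matcher (not the paper's Landau-Vishkin implementation). It keeps $k+1$ machine words $R_0,\dots,R_k$, where bit $p$ of $R_i$ means "the first $p+1$ pattern characters match a suffix of the text read so far with at most $i$ edits", and updates them in $O(k)$ word operations per text character.

        #include <stdint.h>
        #include <stdio.h>
        #include <string.h>

        /* Bit-parallel approximate matching (Wu-Manber shift-and with k
         * errors). Assumes pattern length m <= 64 and k < 8. Reports every
         * text position where a substring ends whose edit distance to the
         * pattern is at most k. */
        static void approx_search(const char *pat, const char *text, int k) {
            int m = (int)strlen(pat);
            uint64_t B[256] = {0};          /* B[c]: bitmask of c's positions in pat */
            for (int p = 0; p < m; p++) B[(unsigned char)pat[p]] |= 1ULL << p;

            uint64_t R[8];
            for (int i = 0; i <= k; i++) R[i] = (1ULL << i) - 1;

            for (int j = 0; text[j]; j++) {
                uint64_t Bc = B[(unsigned char)text[j]], prev = R[0];
                R[0] = ((R[0] << 1) | 1) & Bc;      /* exact row: extend matches  */
                for (int i = 1; i <= k; i++) {
                    uint64_t old = R[i];
                    R[i] = (((old << 1) | 1) & Bc)  /* match                      */
                         | prev                     /* insertion                  */
                         | ((prev | R[i-1]) << 1)   /* substitution, deletion     */
                         | 1;
                    prev = old;                     /* old row i feeds row i+1    */
                }
                if (R[k] & (1ULL << (m - 1)))
                    printf("match with <=%d errors ends at position %d\n", k, j);
            }
        }

        int main(void) {
            approx_search("survey", "surgery surveys", 2);
            return 0;
        }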

    Deterministic Time-Space Tradeoffs for k-SUM

    Given a set of numbers, the $k$-SUM problem asks for a subset of $k$ numbers that sums to zero. When the numbers are integers, the time and space complexity of $k$-SUM is generally studied in the word-RAM model; when the numbers are reals, the complexity is studied in the real-RAM model, and space is measured by the number of reals held in memory at any point. We present a time and space efficient deterministic self-reduction for the $k$-SUM problem which holds for both models, and has many interesting consequences. To illustrate:
    * 3-SUM is in deterministic time $O(n^2 \lg\lg(n)/\lg(n))$ and space $O\left(\sqrt{\frac{n \lg(n)}{\lg\lg(n)}}\right)$. In general, any polylogarithmic-time improvement over quadratic time for 3-SUM can be converted into an algorithm with an identical time improvement but low space complexity as well.
    * 3-SUM is in deterministic time $O(n^2)$ and space $O(\sqrt{n})$, derandomizing an algorithm of Wang.
    * A popular conjecture states that 3-SUM requires $n^{2-o(1)}$ time on the word-RAM. We show that the 3-SUM Conjecture is in fact equivalent to the (seemingly weaker) conjecture that every $O(n^{.51})$-space algorithm for 3-SUM requires at least $n^{2-o(1)}$ time on the word-RAM.
    * For $k \ge 4$, $k$-SUM is in deterministic $O(n^{k - 2 + 2/k})$ time and $O(\sqrt{n})$ space.
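    For reference (the baseline, not the paper's reduction): the standard deterministic $O(n^2)$-time 3-SUM algorithm that the first two bullets improve on sorts the input and then, for each pivot, sweeps two pointers inward, using only a constant number of extra words beyond the input array.

        #include <stdio.h>
        #include <stdlib.h>

        /* Classic 3-SUM baseline: sort, then for each pivot a[i] scan for a
         * pair a[lo] + a[hi] == -a[i] with two inward-moving pointers.
         * Time O(n^2) after the sort; O(1) extra words of space. */
        static int cmp(const void *a, const void *b) {
            long x = *(const long *)a, y = *(const long *)b;
            return (x > y) - (x < y);
        }

        static int three_sum(long *a, size_t n) {
            qsort(a, n, sizeof *a, cmp);
            for (size_t i = 0; i + 2 < n; i++) {
                size_t lo = i + 1, hi = n - 1;
                while (lo < hi) {
                    long s = a[i] + a[lo] + a[hi];
                    if (s == 0) {
                        printf("%ld + %ld + %ld = 0\n", a[i], a[lo], a[hi]);
                        return 1;
                    }
                    if (s < 0) lo++; else hi--;   /* sorted order steers the scan */
                }
            }
            return 0;                             /* no zero-sum triple */
        }

        int main(void) {
            long a[] = {5, -7, 12, 3, 9, 4, -2};  /* contains -7 + (-2) + 9 = 0 */
            three_sum(a, sizeof a / sizeof *a);
            return 0;
        }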

    A mixed-signal integrated circuit for FM-DCSK modulation

    This paper presents a mixed-signal application-specific integrated circuit (ASIC) for a frequency-modulated differential chaos shift keying (FM-DCSK) communication system. The chip is conceived to serve as an experimental platform for the evaluation of the FM-DCSK modulation scheme, and includes several programming features toward this goal. The operation of the ASIC is herein illustrated for a data rate of 500 kb/s and a transmission bandwidth in the range of 17 MHz. Using signals acquired from the test platform, bit error rate (BER) estimations of the overall FM-DCSK communication link have been obtained assuming wireless transmission at the 2.4-GHz ISM band. Under all tested propagation conditions, including multipath effects, the system obtains a BER $= 10^{-3}$ for $E_b/N_0$ lower than 28 dB. Ministerio de Ciencia y Tecnología TIC2003-0235
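    As a rough sketch of the modulation principle only (generic DCSK under toy assumptions: a Chebyshev chaos map, uniform noise, no FM stage or RF front end; this is not the ASIC's implementation): each bit is sent as a chaotic reference segment followed by a copy of it, inverted to encode a 0; the receiver needs no synchronized chaos generator, just the correlation of the two halves, whose sign recovers the bit.

        #include <stdio.h>
        #include <stdlib.h>

        #define SEG 64  /* samples per half-bit (spreading factor), a toy value */

        /* Chebyshev map x' = 1 - 2x^2 on [-1, 1] as the chaos source. */
        static double chaos(double *x) { *x = 1.0 - 2.0 * (*x) * (*x); return *x; }

        /* Crude uniform noise in [-0.4, 0.4], a stand-in for channel noise. */
        static double noise(void) { return 0.4 * (2.0 * rand() / RAND_MAX - 1.0); }

        int main(void) {
            int bits[] = {1, 0, 1, 1, 0};
            double x = 0.3;                        /* chaos seed */
            for (size_t b = 0; b < sizeof bits / sizeof *bits; b++) {
                double ref[SEG], corr = 0.0;
                for (int i = 0; i < SEG; i++) ref[i] = chaos(&x);
                for (int i = 0; i < SEG; i++) {
                    double rx_ref  = ref[i] + noise();
                    double rx_data = (bits[b] ? ref[i] : -ref[i]) + noise();
                    corr += rx_ref * rx_data;      /* correlate the two halves */
                }
                printf("sent %d  decoded %d  (corr %+.2f)\n", bits[b], corr > 0.0, corr);
            }
            return 0;
        }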