117 research outputs found

    A Randomized Sublinear Time Parallel GCD Algorithm for the EREW PRAM

    We present a randomized parallel algorithm that computes the greatest common divisor of two n-bit integers with probability 1-o(1) in O(n log log n / log n) expected time, using n^{6+\epsilon} processors on the EREW PRAM model of parallel computation. We believe this to be the first randomized sublinear-time algorithm for this problem on the EREW PRAM.

    Polylog Depth Circuits for Integer Factoring and Discrete Logarithms

    In this paper, we develop parallel algorithms for integer factoring and for computing discrete logarithms. In particular, we give polylog depth probabilistic boolean circuits of subexponential size for both of these problems, thereby solving an open problem of Adleman and Kompella. Existing sequential algorithms for integer factoring and discrete logarithms use a prime base, which is the set of all primes up to a bound B. We use a much smaller value of B for our parallel algorithms than is typical for sequential algorithms. In particular, for inputs of length n, by setting B = nlogdn with d a positive constant, we construct • probabilistic boolean circuits of depth (log) and size exp[(/log)] for completely factoring a positive integer with probability 1−(1), and • probabilistic boolean circuits of depth (log + log) and size exp[(/log)] for computing discrete logarithms in the finite field () for a prime with probability 1−(1). These are the first results of this type for both problems.

    Streaming Periodicity with Mismatches

    We study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n-p differs from its suffix of length n-p in at most k locations. We give a one-pass streaming algorithm that computes the k-periods of a string S using poly(k, log n) bits of space, for k-periods of length at most n/2. We also present a two-pass streaming algorithm that computes the k-periods of S using poly(k, log n) bits of space, regardless of period length. We complement these results with comparable lower bounds.
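    The k-period definition above can be checked directly on a string held in memory. The following is a minimal sketch of the definition only, not the streaming algorithm from the abstract: it uses O(n) space and O(n^2) time, and the function names `mismatches` and `k_periods` are illustrative assumptions.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def k_periods(s: str, k: int) -> list[int]:
    """Return every p in [1, n-1] such that s has k-period p.

    s has k-period p if its prefix of length n-p and its suffix of
    length n-p differ in at most k locations.
    Brute force (assumption: the whole string fits in memory).
    """
    n = len(s)
    return [p for p in range(1, n) if mismatches(s[: n - p], s[p:]) <= k]


if __name__ == "__main__":
    print(k_periods("abcabcabc", 0))
    print(k_periods("abcabcabd", 1))
```

    With k = 0 this reduces to the usual exact notion of a period; allowing one mismatch lets "abcabcabd" keep the period 3 that an exact test would reject.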

    Testing non-uniform k-wise independent distributions over product spaces (extended abstract)

    A distribution $D$ over $\Sigma_1 \times \cdots \times \Sigma_n$ is called (non-uniform) k-wise independent if for any set of $k$ indices $\{i_1, \ldots, i_k\}$ and for any $z_1 \cdots z_k \in \Sigma_{i_1} \times \cdots \times \Sigma_{i_k}$, $\Pr_{X \sim D}[X_{i_1} \cdots X_{i_k} = z_1 \cdots z_k] = \Pr_{X \sim D}[X_{i_1} = z_1] \cdots \Pr_{X \sim D}[X_{i_k} = z_k]$. We study the problem of testing (non-uniform) k-wise independent distributions over product spaces. For the uniform case we show an upper bound on the distance between a distribution $D$ and the set of k-wise independent distributions in terms of the sum of Fourier coefficients of $D$ at vectors of weight at most $k$. Such a bound was previously known only for the binary field. For the non-uniform case, we give a new characterization of distributions being k-wise independent and further show that such a characterization is robust. These greatly generalize the results of Alon et al. [1] on uniform k-wise independence over the binary field to non-uniform k-wise independence over product spaces. Our results yield natural testing algorithms for k-wise independence with time and sample complexity sublinear in terms of the support size when $k$ is a constant. The main technical tools employed include discrete Fourier transforms and the theory of linear systems of congruences. Funding: National Science Foundation (U.S.) (NSF grant 0514771); National Science Foundation (U.S.) (grant 0728645); National Science Foundation (U.S.) (Grant 0732334); Marie Curie International Reintegration Grants (Grant PIRG03-GA-2008-231077); Israel Science Foundation (Grant 1147/09); Israel Science Foundation (Grant 1675/09); Massachusetts Institute of Technology (Akamai Presidential Fellowship).
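    To make the definition of (non-uniform) k-wise independence concrete, here is a small sketch that checks it exactly for a distribution given explicitly as a table of probabilities. This is not the sublinear tester described in the abstract: it reads the whole distribution, and the representation (a dict from outcome tuples to probabilities) and the names `marginal` and `is_k_wise_independent` are assumptions made for illustration.

```python
from itertools import combinations, product
from math import isclose, prod


def marginal(dist: dict, idx: tuple) -> dict:
    """Marginal distribution of the coordinates listed in idx."""
    out: dict = {}
    for x, p in dist.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def is_k_wise_independent(dist: dict, k: int, tol: float = 1e-9) -> bool:
    """Check that every k coordinates have a joint law equal to the
    product of their single-coordinate marginals."""
    n = len(next(iter(dist)))
    single = [marginal(dist, (i,)) for i in range(n)]
    for idx in combinations(range(n), k):
        joint = marginal(dist, idx)
        # Enumerate every z_1...z_k over the observed symbols of each coordinate.
        symbol_sets = [[v for (v,) in single[i]] for i in idx]
        for z in product(*symbol_sets):
            expected = prod(single[i][(zi,)] for i, zi in zip(idx, z))
            if not isclose(joint.get(z, 0.0), expected, abs_tol=tol):
                return False
    return True


if __name__ == "__main__":
    # Uniform over (a, b, a XOR b): pairwise independent, not 3-wise.
    xor_dist = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
    print(is_k_wise_independent(xor_dist, 2), is_k_wise_independent(xor_dist, 3))
```

    The XOR example is the standard illustration that pairwise independence does not imply full independence.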

    Optimal Substring-Equality Queries with Applications to Sparse Text Indexing

    We consider the problem of encoding a string of length $n$ from an integer alphabet of size $\sigma$ so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any uniquely-decodable encoding supporting access must take $n\log\sigma + \Theta(\log (n\log\sigma))$ bits. We describe a new data structure matching this lower bound when $\sigma\leq n^{O(1)}$ while supporting both queries in optimal $O(1)$ time. Furthermore, we show that the string can be overwritten in-place with this structure. The redundancy of $\Theta(\log n)$ bits and the constant query time break exponentially a lower bound that is known to hold in the read-only model. Using our new string representation, we obtain the first in-place subquadratic (indeed, even sublinear in some cases) algorithms for several string-processing problems in the restore model: the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to derandomize our algorithms using small space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in $O(n\log n)$ time and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in $O(n^{1.5}\sqrt{\log \sigma})$ time. Running times of these Las Vegas algorithms hold in the worst case with high probability. Comment: Refactored according to TALG's reviews. New w.h.p. bounds and Las Vegas algorithm
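    For readers unfamiliar with substring-equality queries, the sketch below shows the textbook way to answer them in O(1) time after linear preprocessing, using prefix hashes of a polynomial (Karp-Rabin style) fingerprint. It is a swapped-in illustration, not the in-place structure from the abstract: it stores O(n) extra words rather than Θ(log n) redundant bits, and the class name, modulus, and seed parameter are assumptions. Answers are Monte Carlo: equal substrings always compare equal, while unequal ones may collide with very small probability.

```python
import random

MOD = (1 << 61) - 1  # a Mersenne prime, chosen here as an assumption


class SubstringEquality:
    """Monte Carlo substring-equality queries via prefix fingerprints."""

    def __init__(self, s: str, seed=None):
        rng = random.Random(seed)
        self.base = rng.randrange(256, MOD - 1)  # random evaluation point
        n = len(s)
        self.pref = [0] * (n + 1)  # pref[i] = fingerprint of s[:i]
        self.pw = [1] * (n + 1)    # pw[i] = base**i mod MOD
        for i, ch in enumerate(s):
            self.pref[i + 1] = (self.pref[i] * self.base + ord(ch)) % MOD
            self.pw[i + 1] = (self.pw[i] * self.base) % MOD

    def _fingerprint(self, i: int, length: int) -> int:
        """Fingerprint of s[i : i + length]."""
        return (self.pref[i + length] - self.pref[i] * self.pw[length]) % MOD

    def equal(self, i: int, j: int, length: int) -> bool:
        """True if s[i:i+length] and s[j:j+length] have equal fingerprints."""
        return self._fingerprint(i, length) == self._fingerprint(j, length)


if __name__ == "__main__":
    q = SubstringEquality("abracadabra", seed=1)
    print(q.equal(0, 7, 4))  # compares "abra" with "abra"
    print(q.equal(0, 1, 3))  # compares "abr" with "bra"
```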

    Approximating Properties of Data Streams

    In this dissertation, we present algorithms that approximate properties in the data stream model, where elements of an underlying data set arrive sequentially, but algorithms must use space sublinear in the size of the underlying data set. We first study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n − p differs from its suffix of length n − p in at most k locations. We give algorithms to compute the k-periods of a string S using poly(k, log n) bits of space and we complement these results with comparable lower bounds. We then study the problem of identifying a longest substring of strings S and T of length n that forms a d-near-alignment under the edit distance, in the simultaneous streaming model. In this model, symbols of strings S and T are streamed at the same time and form a d-near-alignment if the distance between them in some given metric is at most d. We give several algorithms, including an exact one-pass algorithm that uses O(d^2 + d log n) bits of space. We then consider the distinct elements and ℓ_p-heavy hitters problems in the sliding window model, where only the most recent n elements in the data stream form the underlying set. We first introduce the composable histogram, a simple twist on the exponential (Datar et al., SODA 2002) and smooth histograms (Braverman and Ostrovsky, FOCS 2007) that may be of independent interest. We then show that the composable histogram, along with a careful combination of existing techniques to track either the identity or frequency of a few specific items, suffices to obtain algorithms for both distinct elements and ℓ_p-heavy hitters that are nearly optimal in both n and c. Finally, we consider the problem of estimating the maximum weighted matching of a graph whose edges are revealed in a streaming fashion. We develop a reduction from the maximum weighted matching problem to the maximum cardinality matching problem that only doubles the approximation factor of a streaming algorithm developed for the maximum cardinality matching problem. As an application, we obtain an estimator for the weight of a maximum weighted matching in bounded-arboricity graphs and, in particular, a (48 + ε)-approximation estimator for the weight of a maximum weighted matching in planar graphs.

    A distributed wheel sieve algorithm using Scheduling by Multiple Edge Reversal

    Number of pages: 12. This paper presents a new distributed approach for generating all prime numbers in a given interval of integers. From Eratosthenes, who devised the first prime sieve more than 2000 years ago, to the current generation of parallel computers, which have made it possible to reach larger bounds on the interval or to obtain previous results in a shorter time, prime number generation remains an attractive domain of research and plays a central role in cryptography. We propose a fully distributed algorithm for finding all primes in the interval $[2, \ldots, n]$, based on the wheel sieve and the SMER (Scheduling by Multiple Edge Reversal) multigraph dynamics. Given a multigraph $\mathcal{M}$ of arbitrary topology, having $N$ nodes, an SMER-driven system is defined by the number of directed edges (arcs) between any two nodes of $\mathcal{M}$, and by the global period length of all "arc reversals" in $\mathcal{M}$. The new prime number generation method inherits the distributed and parallel nature of SMER and requires at most $n + \lfloor \sqrt{n}\rfloor$ time steps.

    A distributed prime sieving algorithm based on SMER

    LIPN internal report. In this paper, we propose a fully distributed algorithm for finding all primes in a given interval $[2..n]$ (or $(L,R)$, more generally), based on the SMER (Scheduling by Multiple Edge Reversal) multigraph dynamics. Given a multigraph $\mathcal{M}$ of arbitrary topology, having $N$ nodes, the SMER-driven system is defined by the number of directed edges (arcs) between any two nodes of $\mathcal{M}$, and by the global period length of all "arc reversals" in $\mathcal{M}$. In the domain of prime number generation, such a graph method proves quite elegant, and it also yields a totally new kind of distributed prime sieving algorithm of an entirely original design. The maximum number of steps required by the algorithm is at most $n + \sqrt{n}$. Although far beyond the $O(n/\log\log n)$ steps required by the improved sequential "wheel sieve" algorithms, our SMER-based algorithm is fully distributed and of linear (step) complexity. The message complexity of the algorithm is at most $n\Delta_N + \sqrt{n}\,\Delta_N$, where $\Delta_N$ denotes the maximum "multidegree" of the arbitrary multigraph $\mathcal{M}$, and the space required per process is linear.
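    As a point of reference for the problem being solved (finding all primes in $[2..n]$), the sketch below is the classical sequential sieve of Eratosthenes. It is neither the SMER-based distributed algorithm nor a wheel sieve; it is only a baseline under the assumption that the whole interval fits in one process's memory, and the function name `primes_up_to` is illustrative.

```python
def primes_up_to(n: int) -> list[int]:
    """Return all primes in [2, n] using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)  # is_prime[i] == 1 means "not yet crossed out"
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out multiples of p, starting at p*p (smaller ones
            # were already removed by smaller primes).
            for m in range(p * p, n + 1, p):
                is_prime[m] = 0
        p += 1
    return [i for i in range(2, n + 1) if is_prime[i]]


if __name__ == "__main__":
    print(primes_up_to(30))
```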