147 research outputs found

    Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming

    We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan's generator over larger alphabets. Our generator allows a trade-off between seed length and the time needed to compute a given block of the generator's output. HashPRG can be used to obtain derandomizations with much better update time and \emph{without sacrificing space} for a large number of data stream algorithms, such as $F_p$ estimation in the parameter regimes $p > 2$ and $0 < p < 2$ and CountSketch with tight estimation guarantees as analyzed by Minton and Price (SODA 2014), which assumed access to a random oracle. We also show that a recent analysis of Private CountSketch can be derandomized using our techniques. For a $d$-dimensional vector $x$ being updated in a turnstile stream, we show that $\|x\|_{\infty}$ can be estimated up to an additive error of $\varepsilon\|x\|_{2}$ using $O(\varepsilon^{-2}\log(1/\varepsilon)\log d)$ bits of space. Additionally, the update time of this algorithm is $O(\log 1/\varepsilon)$ in the Word RAM model. We show that the space complexity of this algorithm is optimal up to constant factors. However, for vectors $x$ with $\|x\|_{\infty} = \Theta(\|x\|_{2})$, we show that the lower bound can be broken by giving an algorithm that uses $O(\varepsilon^{-2}\log d)$ bits of space which approximates $\|x\|_{\infty}$ up to an additive error of $\varepsilon\|x\|_{2}$. We use our aforementioned derandomization of the CountSketch data structure to obtain this algorithm, and using the time-space trade-off of HashPRG, we show that the update time of this algorithm is also $O(\log 1/\varepsilon)$ in the Word RAM model. Comment: Minor writing improvement
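For readers unfamiliar with the data structure this abstract derandomizes, the following is a minimal Python sketch of a plain CountSketch under turnstile updates, with a brute-force scan to estimate the largest coordinate in absolute value. It is not the HashPRG construction: the class name, the salted BLAKE2 hashing (standing in for the hash functions) and the helper `linf_estimate` are all assumptions made for illustration.

```python
import hashlib
import random
import statistics


class CountSketch:
    """Toy CountSketch: `rows` rows of `width` signed counters each."""

    def __init__(self, rows, width, seed=0):
        self.rows = rows
        self.width = width
        rng = random.Random(seed)
        # Assumption: one random salt per row; salted BLAKE2 plays the role
        # of the bucket and sign hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(rows)]
        self.table = [[0] * width for _ in range(rows)]

    def _bucket_and_sign(self, row, i):
        key = f"{self.salts[row]}:{i}".encode()
        v = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")
        bucket = (v >> 1) % self.width   # which counter coordinate i maps to
        sign = 1 if v & 1 else -1        # random sign for coordinate i
        return bucket, sign

    def update(self, i, delta):
        """Turnstile update x[i] += delta (delta may be negative)."""
        for r in range(self.rows):
            b, s = self._bucket_and_sign(r, i)
            self.table[r][b] += s * delta

    def estimate(self, i):
        """Median over rows of the signed counter that coordinate i maps to."""
        vals = []
        for r in range(self.rows):
            b, s = self._bucket_and_sign(r, i)
            vals.append(s * self.table[r][b])
        return statistics.median(vals)


def linf_estimate(sketch, d):
    """Brute-force estimate of max_i |x[i]| by querying every coordinate."""
    return max(abs(sketch.estimate(i)) for i in range(d))


cs = CountSketch(rows=5, width=16)
cs.update(7, 10)
cs.update(7, -4)   # deletions are allowed in a turnstile stream
```

The median across rows is what makes a single estimate robust to collisions in a minority of rows; an odd number of rows keeps the median an actual counter value.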

    Efficient Dynamic Approximate Distance Oracles for Vertex-Labeled Planar Graphs

    Let $G$ be a graph where each vertex is associated with a label. A Vertex-Labeled Approximate Distance Oracle is a data structure that, given a vertex $v$ and a label $\lambda$, returns a $(1+\varepsilon)$-approximation of the distance from $v$ to the closest vertex with label $\lambda$ in $G$. Such an oracle is dynamic if it also supports label changes. In this paper we present three different dynamic approximate vertex-labeled distance oracles for planar graphs, all with polylogarithmic query and update times, and nearly linear space requirements.

    On the k-Independence Required by Linear Probing and Minwise Independence


    Efficiently Correcting Matrix Products

    We study the problem of efficiently correcting an erroneous product of two $n\times n$ matrices over a ring. Among other things, we provide a randomized algorithm for correcting a matrix product with at most $k$ erroneous entries running in $\tilde{O}(n^2+kn)$ time and a deterministic $\tilde{O}(kn^2)$-time algorithm for this problem (where the notation $\tilde{O}$ suppresses polylogarithmic terms in $n$ and $k$). Comment: Fixed invalid reference to figure in v
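As background for the correction problem, the sketch below shows the classical Freivalds-style check, which only locates rows of a claimed product $C$ that disagree with $AB$; it is not the correction algorithm from the abstract. The function names and the choice of integer matrices are assumptions made for illustration.

```python
import random


def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def suspicious_rows(A, B, C, trials=3, seed=0):
    """Return indices of rows of C that provably differ from the rows of A*B.

    Each trial costs O(n^2): it compares A(Bv) with Cv for a random vector v
    instead of forming the full product A*B.
    """
    rng = random.Random(seed)
    n_cols = len(C[0])
    bad = set()
    for _ in range(trials):
        # Nonzero random entries: a row with a single wrong entry is always
        # flagged; several errors in one row could cancel in a given trial.
        v = [rng.randint(1, 2**31) for _ in range(n_cols)]
        lhs = mat_vec(A, mat_vec(B, v))
        rhs = mat_vec(C, v)
        bad.update(i for i, (x, y) in enumerate(zip(lhs, rhs)) if x != y)
    return sorted(bad)


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]   # the true product A*B
C_bad = [[19, 22], [43, 51]]    # one wrong entry in row 1
```

A flagged row is certainly wrong, while an unflagged row is only probably correct, which is why the check is repeated over several independent trials.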

    Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket

    We study wear-leveling techniques for cuckoo hashing, showing that it is possible to achieve a memory wear bound of $\log\log n+O(1)$ after the insertion of $n$ items into a table of size $Cn$ for a suitable constant $C$ using cuckoo hashing. Moreover, we study our cuckoo hashing method empirically, showing that it significantly improves on the memory wear performance for classic cuckoo hashing and linear probing in practice. Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on Experimental Algorithms (SEA 2014)

    Dynamic Compressed Strings with Random Access

    We consider the problem of storing a string S in dynamic compressed form, while permitting operations directly on the compressed representation of S: access a substring of S; replace, insert or delete a symbol in S; count how many occurrences of a given symbol appear in any given prefix of S (called the rank operation); and locate the position of the ith occurrence of a symbol inside S (called the select operation). We discuss the time complexity of several combinations of these operations along with the entropy space bounds of the corresponding compressed indexes. In this way, we extend or improve the bounds of previous work by Ferragina and Venturini [TCS, 2007], Jansson et al. [ICALP, 2012], and Nekrich and Navarro [SODA, 2013].
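To make the rank and select operations concrete, here is a naive reference implementation on an uncompressed Python string. It only pins down the semantics; it takes linear time per query and has nothing to do with the compressed structures discussed in the abstract. The function names and the 0-based position convention are assumptions.

```python
def rank(s, c, i):
    """Number of occurrences of symbol c in the prefix s[:i]."""
    return s[:i].count(c)


def select(s, c, k):
    """0-based position of the k-th occurrence of c in s (k >= 1), or -1."""
    pos = -1
    for _ in range(k):
        pos = s.find(c, pos + 1)   # jump to the next occurrence after pos
        if pos == -1:
            return -1              # fewer than k occurrences exist
    return pos


s = "abracadabra"
```

The two operations are inverse in the sense that `rank(s, c, select(s, c, k) + 1) == k` whenever the k-th occurrence exists.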