39,985 research outputs found

    Optimal Substring-Equality Queries with Applications to Sparse Text Indexing

    We consider the problem of encoding a string of length $n$ from an integer alphabet of size $\sigma$ so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any uniquely-decodable encoding supporting access must take $n\log\sigma + \Theta(\log(n\log\sigma))$ bits. We describe a new data structure matching this lower bound when $\sigma \leq n^{O(1)}$ while supporting both queries in optimal $O(1)$ time. Furthermore, we show that the string can be overwritten in-place with this structure. The redundancy of $\Theta(\log n)$ bits and the constant query time break exponentially a lower bound that is known to hold in the read-only model. Using our new string representation, we obtain the first in-place subquadratic (indeed, even sublinear in some cases) algorithms for several string-processing problems in the restore model: the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to derandomize our algorithms using small space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in $O(n\log n)$ time and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in $O(n^{1.5}\sqrt{\log\sigma})$ time. Running times of these Las Vegas algorithms hold in the worst case with high probability. Comment: Refactored according to TALG's reviews. New w.h.p. bounds and Las Vegas algorithm
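    To make the notion of a substring-equality query concrete, here is a minimal Python sketch based on textbook Karp-Rabin prefix fingerprints. It is not the in-place, $n\log\sigma + \Theta(\log n)$-bit structure described in the abstract: it keeps $O(n)$ extra machine words, and the class name `SubstringEquality`, the modulus, and the `base` parameter are all assumptions made for illustration. Equal substrings always compare equal; unequal substrings are reported equal only with small probability when the base is drawn at random (Monte Carlo).

```python
import random


class SubstringEquality:
    """Illustrative Karp-Rabin fingerprints (hypothetical helper, O(n) extra space)."""

    MOD = (1 << 61) - 1  # Mersenne prime used as the fingerprint modulus

    def __init__(self, text, base=None):
        # A random base gives the Monte Carlo guarantee; pass one for reproducibility.
        self.base = base if base is not None else random.randrange(256, self.MOD - 1)
        self.n = len(text)
        self.pref = [0] * (self.n + 1)   # pref[i] = fingerprint of text[:i]
        self.power = [1] * (self.n + 1)  # power[i] = base**i mod MOD
        for i, ch in enumerate(text):
            self.pref[i + 1] = (self.pref[i] * self.base + ord(ch) + 1) % self.MOD
            self.power[i + 1] = (self.power[i] * self.base) % self.MOD

    def fingerprint(self, i, length):
        """Fingerprint of text[i:i+length], computed in O(1) time."""
        return (self.pref[i + length] - self.pref[i] * self.power[length]) % self.MOD

    def equal(self, i, j, length):
        """Substring-equality query: is text[i:i+length] == text[j:j+length]?"""
        return self.fingerprint(i, length) == self.fingerprint(j, length)

    def lcp(self, i, j):
        """Longest common prefix of suffixes i and j via binary search on equal()."""
        lo, hi = 0, self.n - max(i, j)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self.equal(i, j, mid):
                lo = mid
            else:
                hi = mid - 1
        return lo
```

    The `lcp` method shows why such queries matter for the suffix sorting and LCP problems named in the abstract: each longest-common-prefix computation reduces to $O(\log n)$ equality queries.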

    Universally Decodable Matrices for Distributed Matrix-Vector Multiplication

    Coded computation is an emerging research area that leverages concepts from erasure coding to mitigate the effect of stragglers (slow nodes) in distributed computation clusters, especially for matrix computation problems. In this work, we present a class of distributed matrix-vector multiplication schemes that are based on codes in the Rosenbloom-Tsfasman metric and universally decodable matrices. Our schemes take into account the inherent computation order within a worker node. In particular, they allow us to effectively leverage partial computations performed by stragglers (a feature that many prior works lack). An additional main contribution of our work is a companion matrix-based embedding of these codes that allows us to obtain sparse and numerically stable schemes for the problem at hand. Experimental results confirm the effectiveness of our techniques. Comment: 6 pages, 1 figure
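    The general idea of coded matrix-vector multiplication can be sketched with the simplest possible erasure code: a single parity block over three workers. This is a toy illustration only, not the Rosenbloom-Tsfasman or universally-decodable-matrix construction from the abstract, and it does not exploit partial straggler work. The function names `encode`, `matvec`, and `decode` are hypothetical, and the matrix is assumed to have an even number of rows.

```python
def matvec(rows, x):
    """Plain matrix-vector product on a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(A):
    """Split A into halves A1, A2 and add a parity block A1 + A2 (one block per worker)."""
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A @ x from the outputs of any two of the three workers.

    results maps worker id (0, 1, or 2) to that worker's output vector.
    """
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:  # worker 1 straggled: A2 x = (A1 + A2) x - A1 x
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]
    else:               # worker 0 straggled: A1 x = (A1 + A2) x - A2 x
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]
    return y1 + y2
```

    For example, with workers 0 and 2 finishing first, `decode({0: matvec(blocks[0], x), 2: matvec(blocks[2], x)})` returns the full product without waiting for worker 1.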

    Reconstructing Rational Functions with FireFly

    We present the open-source C++ library FireFly for the reconstruction of multivariate rational functions over finite fields. We discuss the involved algorithms and their implementation. As an application, we use FireFly in the context of integration-by-parts reductions and compare runtime and memory consumption to a fully algebraic approach with the program Kira. Comment: 46 pages, 3 figures, 6 tables; v2: matches published version
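    The abstract does not list the algorithms involved. As general background, one standard building block of finite-field reconstruction methods is rational number reconstruction: recovering a fraction $n/d$ from its image modulo $m$ with the extended Euclidean algorithm. The Python sketch below illustrates that textbook technique only; it is not FireFly's API, and the function name `rational_reconstruct` and the bound $\sqrt{m/2}$ on numerator and denominator are assumptions of this example.

```python
from math import gcd, isqrt


def rational_reconstruct(a, m):
    """Find (n, d) with n ≡ d * a (mod m), d > 0, and |n|, d <= sqrt(m / 2).

    Returns None when no such fraction is found.
    """
    bound = isqrt(m // 2)
    r0, r1 = m, a % m
    t0, t1 = 0, 1
    # Extended Euclid, maintaining the invariant r_i ≡ t_i * a (mod m).
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound or gcd(r1, abs(t1)) != 1:
        return None
    if t1 < 0:  # normalize so the denominator is positive
        r1, t1 = -r1, -t1
    return r1, t1
```

    With the small prime modulus 101, the residue 14 is the image of $-3/7$ (since $7 \cdot 29 \equiv 1$ and $-3 \cdot 29 \equiv 14 \pmod{101}$), so `rational_reconstruct(14, 101)` recovers the pair `(-3, 7)`.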

    What grid cells convey about rat location

    We characterize the relationship between the simultaneously recorded quantities of rodent grid cell firing and the position of the rat. The formalization reveals various properties of grid cell activity when considered as a neural code for representing and updating estimates of the rat's location. We show that, although the spatially periodic response of grid cells appears wasteful, the code is fully combinatorial in capacity. The resulting range for unambiguous position representation is vastly greater than the ≈1–10 m periods of individual lattices, allowing for unique high-resolution position specification over the behavioral foraging ranges of rats, with excess capacity that could be used for error correction. Next, we show that the merits of the grid cell code for position representation extend well beyond capacity and include arithmetic properties that facilitate position updating. We conclude by considering the numerous implications, for downstream readouts and experimental tests, of the properties of the grid cell code.
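    One simplified way to picture how several short-period codes can jointly cover a much longer range is a one-dimensional modular-arithmetic toy model. The Python sketch below is an idealization introduced here for illustration, not the formalization in the paper: the integer periods in `PERIODS`, the noise-free phases, and the brute-force `decode` are all assumptions.

```python
from math import prod

# Hypothetical pairwise-coprime lattice periods, in arbitrary integer units.
PERIODS = (3, 4, 5)


def encode(x):
    """Represent position x by its phase within each periodic lattice."""
    return tuple(x % p for p in PERIODS)


def update(phases, dx):
    """Shift every phase by displacement dx, independently per lattice."""
    return tuple((ph + dx) % p for ph, p in zip(phases, PERIODS))


def decode(phases):
    """Brute-force readout: the unique position in [0, prod(PERIODS)) with these phases."""
    for x in range(prod(PERIODS)):
        if encode(x) == phases:
            return x
    return None
```

    In this toy model the unambiguous range is the product of the periods (60 units) even though no single period exceeds 5, and a displacement is applied to each phase separately, without decoding the position first.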