14 research outputs found

    Multi-Agent Programming Contest 2011 - The Python-DTU Team

    We provide a brief description of the Python-DTU system, including the overall design, the tools and the algorithms that we plan to use in the agent contest. Comment: 4 pages

    Time-Space Trade-Offs for Lempel-Ziv Compressed Indexing

    Given a string S, the compressed indexing problem is to preprocess S into a compressed representation that supports fast substring queries. The goal is to use little space relative to the compressed size of S while supporting fast queries. We present a compressed index based on the Lempel-Ziv 1977 compression scheme. Let n and z denote the sizes of the input string and the compressed LZ77 string, respectively. We obtain the following time-space trade-offs. Given a pattern string P of length m, where occ is the number of occurrences of P in S, we can solve the problem in (i) O(m + occ lglg n) time using O(z lg(n/z) lglg z) space, or (ii) O(m(1 + lg^e z / lg(n/z)) + occ(lglg n + lg^e z)) time using O(z lg(n/z)) space, for any 0 < e < 1. In particular, (i) improves the leading term in the query time of the previous best solution from O(m lg m) to O(m) at the cost of increasing the space by a factor lglg z. Alternatively, (ii) matches the previous best space bound, but has a leading term in the query time of O(m(1 + lg^e z / lg(n/z))). However, for any polynomial compression ratio, i.e., z = O(n^{1-d}) for constant d > 0, this becomes O(m). Our index also supports extraction of any substring of length l in O(l + lg(n/z)) time. Technically, our results are obtained by novel extensions and combinations of existing data structures of independent interest, including a new batched variant of weak prefix search.
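
To make the parameter z concrete, the following is a minimal sketch of a greedy LZ77 parse. It is a naive quadratic-time illustration, not the index described in the abstract. The function name `lz77_parse` and the phrase format `(source, length, next_char)` are assumptions chosen for this example; several LZ77 variants exist, and the paper's exact definition may differ.

```python
def lz77_parse(s: str) -> list[tuple[int, int, str]]:
    """Greedy LZ77 parse (illustrative variant, naive O(n^2) or worse).

    Each phrase is (source, length, next_char): copy `length` characters
    starting at text position `source` (the copy may overlap the phrase
    itself), then append the explicit character `next_char`.
    """
    phrases = []
    n = len(s)
    i = 0
    while i < n:
        best_len, best_src = 0, 0
        # Try every earlier starting position as a copy source.
        for j in range(i):
            length = 0
            # Stop one character early so an explicit next_char always exists.
            while i + length < n - 1 and s[j + length] == s[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, j
        phrases.append((best_src, best_len, s[i + best_len]))
        i += best_len + 1
    return phrases


phrases = lz77_parse("abababab")
z = len(phrases)  # number of phrases in this variant of the parse
```

On a highly repetitive input such as `"abababab"`, the number of phrases is much smaller than the string length, which is the regime where space bounds stated in terms of z pay off.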

    Implementing a Multi-Agent System in Python


    Fast Dynamic Arrays

    We present a highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of $n$ elements supporting access in time $O(1)$ and insertion and deletion in time $O(n^\epsilon)$ for $\epsilon > 0$ while using $o(n)$ extra space. We consider several different implementation optimizations in C++ and compare their performance to that of vector and multiset from the standard library on sequences with up to $10^8$ elements. Our fastest implementation uses much less space than multiset while providing speedups of $40\times$ for access operations compared to multiset and speedups of $10{,}000\times$ compared to vector for insertion and deletion operations while being competitive with both data structures for all other operations.

    Fast Dynamic Arrays

    We present a highly optimized implementation of tiered vectors, a data structure for maintaining a sequence of n elements supporting access in time O(1) and insertion and deletion in time O(n^e) for e > 0 while using o(n) extra space. We consider several different implementation optimizations in C++ and compare their performance to that of vector and set from the standard library on sequences with up to 10^8 elements. Our fastest implementation uses much less space than set while providing speedups of 40x for access operations compared to set and speedups of 10,000x compared to vector for insertion and deletion operations while being competitive with both data structures for all other operations.
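
The idea behind a tiered vector can be sketched with two levels. This is an illustrative Python sketch, not the optimized C++ implementation the abstract describes. The class names `CircularBlock` and `TieredVector` and the fixed `block_capacity` are assumptions made for this example. With block capacity B, access is O(1) and insertion costs O(B + n/B), which is O(sqrt(n)) when B is chosen near sqrt(n). Deletion and resizing of B are omitted.

```python
class CircularBlock:
    """Fixed-capacity circular buffer with O(1) push_front and pop_back."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0  # physical index of logical element 0
        self.size = 0

    def get(self, i):
        return self.buf[(self.head + i) % len(self.buf)]

    def push_front(self, x):
        self.head = (self.head - 1) % len(self.buf)
        self.buf[self.head] = x
        self.size += 1

    def pop_back(self):
        self.size -= 1
        return self.buf[(self.head + self.size) % len(self.buf)]

    def insert(self, i, x):
        # Shift logical elements i..size-1 one slot right: O(capacity).
        cap = len(self.buf)
        for k in range(self.size, i, -1):
            self.buf[(self.head + k) % cap] = self.buf[(self.head + k - 1) % cap]
        self.buf[(self.head + i) % cap] = x
        self.size += 1


class TieredVector:
    """Two-level sketch: every block except the last is kept full."""

    def __init__(self, block_capacity=4):
        self.cap = block_capacity
        self.blocks = []
        self.n = 0

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        # O(1): one division to find the block, one modular offset inside it.
        return self.blocks[i // self.cap].get(i % self.cap)

    def insert(self, i, x):
        if not 0 <= i <= self.n:
            raise IndexError(i)
        b, p = divmod(i, self.cap)
        if b == len(self.blocks):
            self.blocks.append(CircularBlock(self.cap))
        block = self.blocks[b]
        overflow = block.size == self.cap
        if overflow:
            carry = block.pop_back()  # make room before shifting
        block.insert(p, x)
        # Cascade the overflowing element through later blocks, O(1) each.
        while overflow:
            b += 1
            if b == len(self.blocks):
                self.blocks.append(CircularBlock(self.cap))
            nxt = self.blocks[b]
            overflow = nxt.size == self.cap
            if overflow:
                next_carry = nxt.pop_back()
            nxt.push_front(carry)
            if overflow:
                carry = next_carry
        self.n += 1
```

Only the block that receives the new element is shifted. Every later block performs one `pop_back` and one `push_front`, which is why the circular layout matters: moving an element across a block boundary does not require shifting the block's contents.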

    Compressed Indexing with Signature Grammars

    The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$, report all occurrences of $P$ in $S$. We present a data structure that supports pattern matching queries in $O(m + occ(\lg\lg n + \lg^\epsilon z))$ time using $O(z \lg(n/z))$ space, where $z$ is the size of the LZ77 parse of $S$ and $\epsilon > 0$ is an arbitrarily small constant, when the alphabet is small or $z = O(n^{1 - \delta})$ for any constant $\delta > 0$. We also present two data structures for the general case: one where the space is increased by $O(z\lg\lg z)$, and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if $P$ occurs in $S$ in $O(m)$ time using $O(z\lg(n/z))$ space. Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well-known techniques relating pattern matching to 2D-range reporting.

    Time-space trade-offs for Lempel-Ziv compressed indexing

    Given a string $S$, the compressed indexing problem is to preprocess $S$ into a compressed representation that supports fast substring queries. The goal is to use little space relative to the compressed size of $S$ while supporting fast queries. We present a compressed index based on the Lempel-Ziv 1977 compression scheme. We obtain the following time-space trade-offs. For constant-sized alphabets: (i) $O(m + occ \lg\lg n)$ time using $O(z\lg(n/z)\lg\lg z)$ space, or (ii) $O(m(1 + \frac{\lg^\epsilon z}{\lg(n/z)}) + occ(\lg\lg n + \lg^\epsilon z))$ time using $O(z\lg(n/z))$ space. For integer alphabets polynomially bounded by $n$: (iii) $O(m(1 + \frac{\lg^\epsilon z}{\lg(n/z)}) + occ(\lg\lg n + \lg^\epsilon z))$ time using $O(z(\lg(n/z) + \lg\lg z))$ space, or (iv) $O(m + occ(\lg\lg n + \lg^\epsilon z))$ time using $O(z(\lg(n/z) + \lg^\epsilon z))$ space. Here $n$ and $m$ are the lengths of the input string and query string respectively, $z$ is the number of phrases in the LZ77 parse of the input string, $occ$ is the number of occurrences of the query in the input, and $\epsilon > 0$ is an arbitrarily small constant. In particular, (i) improves the leading term in the query time of the previous best solution from $O(m\lg m)$ to $O(m)$ at the cost of increasing the space by a factor $\lg\lg z$. Alternatively, (ii) matches the previous best space bound, but has a leading term in the query time of $O(m(1 + \frac{\lg^\epsilon z}{\lg(n/z)}))$. However, for any polynomial compression ratio, i.e., $z = O(n^{1-\delta})$ for constant $\delta > 0$, this becomes $O(m)$. Our index also supports extraction of any substring of length $\ell$ in $O(\ell + \lg(n/z))$ time. Technically, our results are obtained by novel extensions and combinations of existing data structures of independent interest, including a new batched variant of weak prefix search.

    Optimal-Time Dictionary-Compressed Indexes

    We describe the first self-indexes able to count and locate pattern occurrences in optimal time within a space bounded by the size of the most popular dictionary compressors. To achieve this result we combine several recent findings, including string attractors (new combinatorial objects encompassing most known compressibility measures for highly repetitive texts) and grammars based on locally-consistent parsing. In more detail, let $\gamma$ be the size of the smallest attractor for a text $T$ of length $n$. The measure $\gamma$ is an (asymptotic) lower bound to the size of dictionary compressors based on Lempel-Ziv, context-free grammars, and many others. The smallest known text representations in terms of attractors use space $O(\gamma\log(n/\gamma))$, and our lightest indexes work within the same asymptotic space. Let $\epsilon > 0$ be a suitably small constant fixed at construction time, $m$ be the pattern length, and $occ$ be the number of its text occurrences. Our index counts pattern occurrences in $O(m + \log^{2+\epsilon} n)$ time, and locates them in $O(m + (occ+1)\log^\epsilon n)$ time. These times already outperform those of most dictionary-compressed indexes, while obtaining the least asymptotic space for any index searching within $O((m+occ)\,\mathrm{polylog}\,n)$ time. Further, by increasing the space to $O(\gamma\log(n/\gamma)\log^\epsilon n)$, we reduce the locating time to the optimal $O(m+occ)$, and within $O(\gamma\log(n/\gamma)\log n)$ space we can also count in optimal $O(m)$ time. No dictionary-compressed index had obtained this time before. All our indexes can be constructed in $O(n)$ space and $O(n\log n)$ expected time. As a byproduct of independent interest..

    Decompressing Lempel-Ziv Compressed Text

    We consider the problem of decompressing the Lempel-Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but requires random access to the whole decompressed text. Another folklore solution is to convert LZ77 into a grammar of size $O(z\log(n/z))$ and then stream $S$ in linear time. In this paper, we show that $O(n)$ time and $O(z)$ working space can be achieved for constant-size alphabets. On general alphabets of size $\sigma$, we describe (i) a trade-off achieving $O(n\log^\delta \sigma)$ time and $O(z\log^{1-\delta}\sigma)$ space for any $0 \leq \delta \leq 1$, and (ii) a solution achieving $O(n)$ time and $O(z\log\log(n/z))$ space. The latter solution, in particular, dominates both folklore algorithms for the problem. Our solutions can, more generally, extract any specified subsequence of $S$ with little overhead on top of the linear running time and working space. As an immediate corollary, we show that our techniques yield improved results for pattern matching problems on LZ77-compressed text.
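
For contrast with the space-efficient results above, the first folklore solution mentioned in the abstract can be sketched in a few lines. This is only the baseline, not the paper's algorithm. The function name `lz77_decompress` and the phrase format `(source, length, next_char)` are assumptions made for this example. It runs in O(n) time but keeps the entire decompressed text in memory so that copy sources can be read by random access.

```python
def lz77_decompress(phrases):
    """Folklore LZ77 decoding with random access to the output so far.

    Each phrase (source, length, next_char) copies `length` characters
    starting at output position `source`, then appends `next_char`.
    Working space is O(n): the whole decoded text is held in `out`.
    """
    out = []
    for source, length, next_char in phrases:
        # Copy one character at a time so that a source range overlapping
        # the phrase being written (a self-referential copy) still works.
        for k in range(length):
            out.append(out[source + k])
        out.append(next_char)
    return "".join(out)


text = lz77_decompress([(0, 0, "a"), (0, 0, "b"), (0, 5, "b")])
```

The third phrase copies five characters starting at position 0 even though only two characters exist when the copy begins. Character-by-character copying is what makes such overlapping references decode correctly.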