35 research outputs found

    Sorting a Low-Entropy Sequence

    We give the first sorting algorithm with bounds in terms of higher-order entropies: let $S$ be a sequence of length $m$ containing $n$ distinct elements and let $H_\ell(S)$ be the $\ell$th-order empirical entropy of $S$, with $n^{\ell + 1} \log n \in O(m)$; our algorithm sorts $S$ using $(H_\ell(S) + O(1)) m$ comparisons.
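The quantity in the bound above can be made concrete with a small sketch. It assumes the standard definition of empirical entropy, which the abstract does not spell out: $H_0$ is the Shannon entropy of the symbol frequencies, and $H_\ell(S) = \frac{1}{m} \sum_w |S_w| H_0(S_w)$, where $S_w$ collects the symbols that follow each occurrence of the length-$\ell$ context $w$. The function names `h0` and `h_ell` are illustrative only, and this computes the entropy, not the sorting algorithm itself.

```python
from collections import Counter, defaultdict
from math import log2


def h0(seq):
    """Zeroth-order empirical entropy of seq, in bits per symbol."""
    m = len(seq)
    if m == 0:
        return 0.0
    counts = Counter(seq)
    return sum(c / m * log2(m / c) for c in counts.values())


def h_ell(seq, ell):
    """ell-th order empirical entropy (assumed standard definition)."""
    if not seq:
        return 0.0
    if ell == 0:
        return h0(seq)
    # Group each symbol by the ell symbols that immediately precede it.
    followers = defaultdict(list)
    for i in range(ell, len(seq)):
        followers[tuple(seq[i - ell:i])].append(seq[i])
    return sum(len(f) * h0(f) for f in followers.values()) / len(seq)
```

For a strictly alternating sequence such as `"abababab"`, `h0` is 1 bit per symbol while `h_ell(..., 1)` is 0, which is the kind of gap a higher-order bound can exploit.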

    Upper and lower bounds for dynamic data structures on strings

    We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give both conditional and unconditional lower bounds for variants of exact matching with wildcards, inner product, and Hamming distance computation via a sequence of reductions. As an example, we show that there does not exist an $O(m^{1/2-\varepsilon})$ time algorithm for a large range of these problems unless the online Boolean matrix-vector multiplication conjecture is false. We also provide nearly matching upper bounds for most of the problems we consider. Comment: Accepted at STACS'1

    Dynamic Relative Compression, Dynamic Partial Sums, and Substring Concatenation

    Given a static reference string $R$ and a source string $S$, a relative compression of $S$ with respect to $R$ is an encoding of $S$ as a sequence of references to substrings of $R$. Relative compression schemes are a classic model of compression and have recently proved very successful for compressing highly repetitive massive data sets such as genomes and web data. We initiate the study of relative compression in a dynamic setting where the compressed source string $S$ is subject to edit operations. The goal is to maintain the compressed representation compactly, while supporting edits and allowing efficient random access to the (uncompressed) source string. We present new data structures that achieve optimal time for updates and queries while using space linear in the size of the optimal relative compression, for nearly all combinations of parameters. We also present solutions for restricted and extended sets of updates. To achieve these results, we revisit the dynamic partial sums problem and the substring concatenation problem. We present new optimal or near-optimal bounds for these problems. Plugging in our new results, we also immediately obtain new bounds for the string indexing for patterns with wildcards problem and the dynamic text and static pattern matching problem.
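As a rough illustration of the static encoding described above, the following sketch parses the source string greedily into `(position, length)` references to substrings of the reference string. This is a naive quadratic-time baseline under the assumption that every symbol of the source occurs in the reference; it is not the dynamic data structure from the paper, and the function names are hypothetical.

```python
def relative_compress(reference, source):
    """Greedily encode source as (start, length) references into reference."""
    phrases = []
    i = 0
    while i < len(source):
        best_pos, best_len = -1, 0
        # Try every start in the reference and keep the longest match.
        for start in range(len(reference)):
            length = 0
            while (start + length < len(reference)
                   and i + length < len(source)
                   and reference[start + length] == source[i + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = start, length
        if best_len == 0:
            # Assumption of this sketch: no escape mechanism for new symbols.
            raise ValueError(f"symbol {source[i]!r} does not occur in reference")
        phrases.append((best_pos, best_len))
        i += best_len
    return phrases


def relative_decompress(reference, phrases):
    """Rebuild the source string from its list of references."""
    return "".join(reference[pos:pos + length] for pos, length in phrases)
```

With `reference = "abracadabra"` and `source = "cadabraabra"`, the parse consists of two references, and decompressing them reproduces the source.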

    Cell-probe Lower Bounds for Dynamic Problems via a New Communication Model

    In this paper, we develop a new communication model to prove a data structure lower bound for the dynamic interval union problem. The problem is to maintain a multiset of intervals $\mathcal{I}$ over $[0, n]$ with integer coordinates, supporting the following operations: insert(a, b): add an interval $[a, b]$ to $\mathcal{I}$, provided that $a$ and $b$ are integers in $[0, n]$; delete(a, b): delete a (previously inserted) interval $[a, b]$ from $\mathcal{I}$; query(): return the total length of the union of all intervals in $\mathcal{I}$. It is related to the two-dimensional case of Klee's measure problem. We prove that there is a distribution over sequences of operations with $O(n)$ insertions and deletions, and $O(n^{0.01})$ queries, for which any data structure with any constant error probability requires $\Omega(n \log n)$ time in expectation. Interestingly, we use the sparse set disjointness protocol of Håstad and Wigderson [ToC'07] to speed up a reduction from a new kind of nondeterministic communication games, for which we prove lower bounds. For applications, we prove lower bounds for several dynamic graph problems by reducing them from dynamic interval union.
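The three operations of the dynamic interval union problem can be pinned down with a deliberately naive reference implementation. The class name `NaiveIntervalUnion` and its layout are assumptions made for illustration; the sketch recomputes the union from scratch on every query and says nothing about the data structures or bounds studied in the paper.

```python
from collections import Counter


class NaiveIntervalUnion:
    """Multiset of integer intervals over [0, n] with a union-length query."""

    def __init__(self, n):
        self.n = n
        self.intervals = Counter()  # (a, b) -> multiplicity

    def insert(self, a, b):
        if not (0 <= a <= b <= self.n):
            raise ValueError("endpoints must satisfy 0 <= a <= b <= n")
        self.intervals[(a, b)] += 1

    def delete(self, a, b):
        if self.intervals[(a, b)] == 0:
            raise KeyError("interval was not previously inserted")
        self.intervals[(a, b)] -= 1
        if self.intervals[(a, b)] == 0:
            del self.intervals[(a, b)]

    def query(self):
        """Total length of the union, via sort and sweep over distinct intervals."""
        total = 0
        cur_start = cur_end = None
        for a, b in sorted(self.intervals):
            if cur_end is None or a > cur_end:
                # Current run is finished; start a new one.
                if cur_end is not None:
                    total += cur_end - cur_start
                cur_start, cur_end = a, b
            else:
                cur_end = max(cur_end, b)
        if cur_end is not None:
            total += cur_end - cur_start
        return total
```

Because the structure is a multiset, deleting one of two copies of the same interval leaves the union unchanged, which the sketch handles through the multiplicity counter.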

    Lower Bounds for Oblivious Data Structures

    An oblivious data structure is a data structure where the memory access patterns reveal no information about the operations performed on it. Such data structures were introduced by Wang et al. [ACM SIGSAC'14] and are intended for situations where one wishes to store the data structure at an untrusted server. One way to obtain an oblivious data structure is simply to run a classic data structure on an oblivious RAM (ORAM). Until very recently, this resulted in an overhead of $\omega(\lg n)$ for the most natural setting of parameters. Moreover, a recent lower bound for ORAMs by Larsen and Nielsen [CRYPTO'18] shows that they always incur an overhead of at least $\Omega(\lg n)$ if used in a black-box manner. To circumvent the $\omega(\lg n)$ overhead, researchers have instead studied classic data structure problems more directly and have obtained efficient solutions for many such problems, such as stacks, queues, deques, priority queues and search trees. However, none of these data structures process operations faster than $\Theta(\lg n)$, leaving open the question of whether even faster solutions exist. In this paper, we rule out this possibility by proving $\Omega(\lg n)$ lower bounds for oblivious stacks, queues, deques, priority queues and search trees. Comment: To appear at SODA'1

    Lower Bounds for Semi-adaptive Data Structures via Corruption
