17 research outputs found

    Dictionary Matching with One Gap

    The dictionary matching with gaps problem is to preprocess a dictionary $D$ of $d$ gapped patterns $P_1,\ldots,P_d$ over alphabet $\Sigma$, where each gapped pattern $P_i$ is a sequence of subpatterns separated by bounded sequences of don't cares. Then, given a query text $T$ of length $n$ over alphabet $\Sigma$, the goal is to output all locations in $T$ in which a pattern $P_i\in D$, $1\leq i\leq d$, ends. There is a renewed current interest in the gapped matching problem stemming from cyber security. In this paper we solve the problem where all patterns in the dictionary have one gap with at least $\alpha$ and at most $\beta$ don't cares, where $\alpha$ and $\beta$ are given parameters. Specifically, we show that the dictionary matching with a single gap problem can be solved in either $O(d\log d + |D|)$ time and $O(d\log^{\varepsilon} d + |D|)$ space, and query time $O(n(\beta-\alpha)\log\log d \log^2 \min\{d, \log |D|\} + occ)$, where $occ$ is the number of patterns found, or preprocessing time and space: $O(d^2 + |D|)$, and query time $O(n(\beta-\alpha) + occ)$, where $occ$ is the number of patterns found. As far as we know, this is the best solution for this setting of the problem, where many overlaps may exist in the dictionary. Comment: A preliminary version was published at CPM 201
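
    To make the problem statement above concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper and has none of its time bounds; it only illustrates what is being reported. The function name `match_one_gap` and the representation of each pattern as a `(left, right)` pair of non-empty strings are assumptions made for this illustration.

```python
def match_one_gap(text, dictionary, alpha, beta):
    """Naive one-gap dictionary matching (illustrative only).

    dictionary: list of (left, right) pairs of non-empty strings (assumed
    representation). A pattern matches if `left`, then between alpha and
    beta arbitrary characters, then `right` occur consecutively in `text`.
    Returns a sorted list of (end_index, pattern_id), where end_index is
    the 0-based index of the last character of the match.
    """
    hits = []
    for pid, (left, right) in enumerate(dictionary):
        for end in range(len(text)):
            start_right = end - len(right) + 1
            if start_right < 0 or text[start_right:end + 1] != right:
                continue
            # Try every allowed gap length; one witness per end is enough.
            for gap in range(alpha, beta + 1):
                start_left = start_right - gap - len(left)
                if start_left < 0:
                    break  # larger gaps only move further left
                if text[start_left:start_left + len(left)] == left:
                    hits.append((end, pid))
                    break
    return sorted(hits)
```

    For example, `match_one_gap("abxxcd", [("ab", "cd")], 1, 3)` reports a single match ending at index 5, using a gap of two characters.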

    Mind the Gap: Essentially Optimal Algorithms for Online Dictionary Matching with One Gap

    We examine the complexity of the online Dictionary Matching with One Gap Problem (DMOG) which is the following. Preprocess a dictionary D of d patterns, where each pattern contains a special gap symbol that can match any string, so that given a text that arrives online, a character at a time, we can report all of the patterns from D that are suffixes of the text that has arrived so far, before the next character arrives. In more general versions the gap symbols are associated with bounds determining the possible lengths of matching strings. Online DMOG captures the difficulty in a bottleneck procedure for cyber-security, as many digital signatures of viruses manifest themselves as patterns with a single gap. In this paper, we demonstrate that the difficulty in obtaining efficient solutions for the DMOG problem, even in the offline setting, can be traced back to the infamous 3SUM conjecture. We show a conditional lower bound of Omega(delta(G_D)+op) time per text character, where G_D is a bipartite graph that captures the structure of D, delta(G_D) is the degeneracy of this graph, and op is the output size. Moreover, we show a conditional lower bound in terms of the magnitude of gaps for the bounded case, thereby showing that some known offline upper bounds are essentially optimal. We also provide matching upper bounds (up to sub-polynomial factors), in terms of the degeneracy, for the online DMOG problem. In particular, we introduce algorithms whose time cost depends linearly on delta(G_D). Our algorithms make use of graph orientations, together with some additional techniques. These algorithms are of practical interest since although delta(G_D) can be as large as sqrt(d), and even larger if G_D is a multi-graph, it is typically a very small constant in practice. Finally, when delta(G_D) is large we are able to obtain even more efficient solutions.
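
    The degeneracy delta(G_D) that the bounds above depend on can be computed by the standard peeling procedure: repeatedly delete a vertex of minimum degree and record the largest degree seen at deletion time. The Python sketch below assumes the graph is given as a list of edges; how G_D is built from the dictionary is not spelled out in the abstract, so the edge list is simply taken as input. The function name `degeneracy` is hypothetical, parallel edges are collapsed (simple graph), and self-loops are assumed absent.

```python
def degeneracy(edges):
    """Degeneracy of a simple undirected graph given as (u, v) edge pairs.

    Quadratic-time peeling for clarity; bucket queues would give linear
    time. Assumes no self-loops; parallel edges are collapsed.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    best = 0
    while adj:
        # Pick a vertex of minimum remaining degree.
        v = min(adj, key=lambda x: len(adj[x]))
        best = max(best, len(adj[v]))
        # Remove v and detach it from its remaining neighbours.
        for w in adj.pop(v):
            adj[w].discard(v)
    return best
```

    The deletion order, reversed, also yields an orientation in which every vertex has out-degree at most the degeneracy, which is the kind of graph orientation the abstract alludes to.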

    Data Structure Lower Bounds for Document Indexing Problems

    We study data structure problems related to document indexing and pattern matching queries and our main contribution is to show that the pointer machine model of computation can be extremely useful in proving high and unconditional lower bounds that cannot be obtained in any other known model of computation with the current techniques. Often our lower bounds match the known space-query time trade-off curve and in fact for all the problems considered, there is a very good and reasonable match between our lower bounds and the known upper bounds, at least for some choice of input parameters. The problems that we consider are set intersection queries (both the reporting variant and the semi-group counting variant), indexing a set of documents for two-pattern queries, or forbidden-pattern queries, or queries with wild-cards, and indexing an input set of gapped-patterns (or two-patterns) to find those matching a document given at the query time. Comment: Full version of the conference version that appeared at ICALP 2016, 25 pages

    Upper and lower bounds for dynamic data structures on strings

    We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give both conditional and unconditional lower bounds for variants of exact matching with wildcards, inner product, and Hamming distance computation via a sequence of reductions. As an example, we show that there does not exist an $O(m^{1/2-\varepsilon})$ time algorithm for a large range of these problems unless the online Boolean matrix-vector multiplication conjecture is false. We also provide nearly matching upper bounds for most of the problems we consider. Comment: Accepted at STACS'1

    Improved Bounds for 3SUM, $k$-SUM, and Linear Degeneracy

    Given a set of $n$ real numbers, the 3SUM problem is to decide whether there are three of them that sum to zero. Until a recent breakthrough by Grønlund and Pettie [FOCS'14], a simple $\Theta(n^2)$-time deterministic algorithm for this problem was conjectured to be optimal. Over the years many algorithmic problems have been shown to be reducible from the 3SUM problem or its variants, including the more generalized forms of the problem, such as $k$-SUM and $k$-variate linear degeneracy testing ($k$-LDT). The conjectured hardness of these problems has become extremely popular for basing conditional lower bounds for numerous algorithmic problems in P. In this paper, we show that the randomized $4$-linear decision tree complexity of 3SUM is $O(n^{3/2})$, and that the randomized $(2k-2)$-linear decision tree complexity of $k$-SUM and $k$-LDT is $O(n^{k/2})$, for any odd $k\ge 3$. These bounds improve (albeit randomized) the corresponding $O(n^{3/2}\sqrt{\log n})$ and $O(n^{k/2}\sqrt{\log n})$ decision tree bounds obtained by Grønlund and Pettie. Our technique includes a specialized randomized variant of fractional cascading data structure. Additionally, we give another deterministic algorithm for 3SUM that runs in $O(n^2 \log\log n / \log n)$ time. The latter bound matches a recent independent bound by Freund [Algorithmica 2017], but our algorithm is somewhat simpler, due to a better use of word-RAM model.
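
    For reference, a simple quadratic-time 3SUM algorithm of the kind mentioned above can be sketched as sort plus two pointers. The abstract does not say which quadratic algorithm it has in mind, so this is the common textbook version, not the improved algorithms of the paper. The name `has_3sum` is hypothetical, and the sketch assumes integer (or otherwise exactly comparable) inputs, since floating-point sums would make the equality test unreliable.

```python
def has_3sum(nums):
    """Return True if three entries at distinct positions sum to zero.

    Sorting costs O(n log n); the two-pointer scan for each fixed first
    element costs O(n), for O(n^2) overall.
    """
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1   # sum too small: move to a larger middle element
            else:
                hi -= 1   # sum too large: move to a smaller last element
    return False
```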

    Deterministic 3SUM-Hardness

    As one of the three main pillars of fine-grained complexity theory, the 3SUM problem explains the hardness of many diverse polynomial-time problems via fine-grained reductions. Many of these reductions are either directly based on or heavily inspired by Pătraşcu's framework involving additive hashing and are thus randomized. Some selected reductions were derandomized in previous work [Chan, He; SOSA'20], but the current techniques are limited and a major fraction of the reductions remains randomized. In this work we gather a toolkit aimed to derandomize reductions based on additive hashing. Using this toolkit, we manage to derandomize almost all known 3SUM-hardness reductions. As technical highlights we derandomize the hardness reductions to (offline) Set Disjointness, (offline) Set Intersection and Triangle Listing -- these questions were explicitly left open in previous work [Kopelowitz, Pettie, Porat; SODA'16]. The few exceptions to our work fall into a special category of recent reductions based on structure-versus-randomness dichotomies. We expect that our toolkit can be readily applied to derandomize future reductions as well. As a conceptual innovation, our work thereby promotes the theory of deterministic 3SUM-hardness. As our second contribution, we prove that there is a deterministic universe reduction for 3SUM. Specifically, using additive hashing it is a standard trick to assume that the numbers in 3SUM have size at most $n^3$. We prove that this assumption is similarly valid for deterministic algorithms. Comment: To appear at ITCS 202

    Conditional Lower Bounds for Dynamic Geometric Measure Problems

    Fully Dynamic Spanners with Worst-Case Update Time

    An alpha-spanner of a graph G is a subgraph H such that H preserves all distances of G within a factor of alpha. In this paper, we give fully dynamic algorithms for maintaining a spanner H of a graph G undergoing edge insertions and deletions with worst-case guarantees on the running time after each update. In particular, our algorithms maintain:
    - a 3-spanner with ~O(n^{1+1/2}) edges with worst-case update time ~O(n^{3/4}), or
    - a 5-spanner with ~O(n^{1+1/3}) edges with worst-case update time ~O(n^{5/9}).
    These size/stretch tradeoffs are best possible (up to logarithmic factors). They can be extended to the weighted setting at very minor cost. Our algorithms are randomized and correct with high probability against an oblivious adversary. We also further extend our techniques to construct a 5-spanner with suboptimal size/stretch tradeoff, but improved worst-case update time. To the best of our knowledge, these are the first dynamic spanner algorithms with sublinear worst-case update time guarantees. Since it is known how to maintain a spanner using small amortized but large worst-case update time [Baswana et al. SODA'08], obtaining algorithms with strong worst-case bounds, as presented in this paper, seems to be the next natural step for this problem.
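
    The spanner definition above can be checked directly on small unweighted graphs: H is an alpha-spanner of G exactly when every edge (u, v) of G has its endpoints at distance at most alpha in H. The Python sketch below is a static verifier only, not the dynamic algorithms of the paper. The names `is_spanner` and `bfs_within` are hypothetical, alpha is assumed to be a positive integer, and the code does not check that H is a subgraph of G.

```python
from collections import deque

def bfs_within(adj, src, limit):
    """Distances from src in adj, exploring at most `limit` hops."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == limit:
            continue  # do not expand beyond the hop limit
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_spanner(g_edges, h_edges, alpha):
    """True if every edge of G is stretched by at most alpha in H."""
    adj = {}
    for u, v in h_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return all(v in bfs_within(adj, u, alpha) for u, v in g_edges)
```

    For instance, dropping one edge from a 4-cycle leaves a path that is a 3-spanner of the cycle but not a 2-spanner.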