
    If the Current Clique Algorithms are Optimal, so is Valiant's Parser

    The CFG recognition problem is: given a context-free grammar $\mathcal{G}$ and a string $w$ of length $n$, decide if $w$ can be obtained from $\mathcal{G}$. This is the most basic parsing question and is a core computer science problem. Valiant's parser from 1975 solves the problem in $O(n^{\omega})$ time, where $\omega<2.373$ is the matrix multiplication exponent. Dozens of parsing algorithms have been proposed over the years, yet Valiant's upper bound remains unbeaten. The best combinatorial algorithms have mildly subcubic $O(n^3/\log^3{n})$ complexity. Lee (JACM'01) provided evidence that fast matrix multiplication is needed for CFG parsing, and that very efficient and practical algorithms might be hard or even impossible to obtain. Lee showed that any algorithm for a more general parsing problem with running time $O(|\mathcal{G}|\cdot n^{3-\varepsilon})$ can be converted into a surprising subcubic algorithm for Boolean Matrix Multiplication. Unfortunately, Lee's hardness result required that the grammar size be $|\mathcal{G}|=\Omega(n^6)$. Nothing was known for the more relevant case of constant-size grammars. In this work, we prove that any improvement on Valiant's algorithm, even for constant-size grammars, either in terms of runtime or by avoiding the inefficiencies of fast matrix multiplication, would imply a breakthrough algorithm for the $k$-Clique problem: given a graph on $n$ nodes, decide if there are $k$ that form a clique. Besides classifying the complexity of a fundamental problem, our reduction has led us to similar lower bounds for more modern and well-studied cubic-time problems for which faster algorithms are highly desirable in practice: RNA Folding, a central problem in computational biology, and Dyck Language Edit Distance, answering an open question of Saha (FOCS'14).
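
    For orientation, the cubic-time baseline that Valiant's parser improves on can be sketched as the classical CYK-style dynamic program over a grammar in Chomsky normal form. The Python sketch below only illustrates the recognition problem under that assumption; the rule encoding, the function name cyk_recognize, and the toy parenthesis grammar are our own illustrative choices, not the paper's construction.

        # Minimal CYK recognizer for a grammar in Chomsky normal form (CNF).
        # This is the O(|G| * n^3) combinatorial baseline; Valiant's parser
        # replaces the inner Boolean work with fast matrix multiplication.

        def cyk_recognize(word, terminal_rules, binary_rules, start="S"):
            """terminal_rules: terminal -> set of nonterminals A with A -> terminal
               binary_rules:   (B, C)   -> set of nonterminals A with A -> B C"""
            n = len(word)
            if n == 0:
                return False  # the empty word would need a special S -> eps rule, omitted here
            # table[i][j] = set of nonterminals deriving word[i..j] (inclusive)
            table = [[set() for _ in range(n)] for _ in range(n)]
            for i, ch in enumerate(word):
                table[i][i] = set(terminal_rules.get(ch, ()))
            for length in range(2, n + 1):           # substring length
                for i in range(n - length + 1):      # start position
                    j = i + length - 1               # end position
                    for k in range(i, j):            # split point
                        for B in table[i][k]:
                            for C in table[k + 1][j]:
                                table[i][j] |= binary_rules.get((B, C), set())
            return start in table[0][n - 1]

        # Toy CNF grammar for nonempty balanced parentheses:
        # S -> S S | L A | L R,  A -> S R,  L -> '(',  R -> ')'
        terminal_rules = {"(": {"L"}, ")": {"R"}}
        binary_rules = {("S", "S"): {"S"}, ("L", "A"): {"S"},
                        ("L", "R"): {"S"}, ("S", "R"): {"A"}}
        print(cyk_recognize("(())", terminal_rules, binary_rules))  # True
        print(cyk_recognize("(()", terminal_rules, binary_rules))   # False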

    Faster Min-Plus Product for Monotone Instances

    In this paper, we show that the time complexity of monotone min-plus product of two $n\times n$ matrices is $\tilde{O}(n^{(3+\omega)/2})=\tilde{O}(n^{2.687})$, where $\omega < 2.373$ is the fast matrix multiplication exponent [Alman and Vassilevska Williams 2021]. That is, when $A$ is an arbitrary integer matrix and $B$ is either row-monotone or column-monotone with integer elements bounded by $O(n)$, computing the min-plus product $C$, where $C_{i,j}=\min_k\{A_{i,k}+B_{k,j}\}$, takes $\tilde{O}(n^{(3+\omega)/2})$ time, which greatly improves the previous time bound of $\tilde{O}(n^{(12+\omega)/5})=\tilde{O}(n^{2.875})$ [Gu, Polak, Vassilevska Williams and Xu 2021]. By simple reductions, the following problems also admit $\tilde{O}(n^{(3+\omega)/2})$-time algorithms: (1) $A$ and $B$ are both bounded-difference, that is, the difference between any two adjacent entries is a constant; the previous results give time complexities of $\tilde{O}(n^{2.824})$ [Bringmann, Grandoni, Saha and Vassilevska Williams 2016] and $\tilde{O}(n^{2.779})$ [Chi, Duan and Xie 2022]. (2) $A$ is arbitrary and the columns or rows of $B$ are bounded-difference; the previous result gives a time complexity of $\tilde{O}(n^{2.922})$ [Bringmann, Grandoni, Saha and Vassilevska Williams 2016]. (3) Problems reducible to these, such as language edit distance, RNA folding, and the scored parsing problem on BD grammars [Bringmann, Grandoni, Saha and Vassilevska Williams 2016]. Finally, we also consider the problem of min-plus convolution between two integral sequences which are monotone and bounded by $O(n)$, and achieve a running time upper bound of $\tilde{O}(n^{1.5})$. Previously, this task required running time $\tilde{O}(n^{(9+\sqrt{177})/12}) = O(n^{1.859})$ [Chan and Lewenstein 2015]. Comment: 26 pages.
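
    For reference, the sketch below spells out the definition $C_{i,j}=\min_k\{A_{i,k}+B_{k,j}\}$ as the naive cubic-time loop that the paper's algorithm improves on for monotone instances; the Python code and the toy matrices are illustrative assumptions, not the paper's method.

        # Naive O(n^3) min-plus (tropical) product: C[i][j] = min_k (A[i][k] + B[k][j]).
        # This is the definition only; the paper exploits row-/column-monotonicity
        # of B (entries bounded by O(n)) to go below cubic time.

        import math

        def min_plus_product(A, B):
            n, m, p = len(A), len(B), len(B[0])
            assert all(len(row) == m for row in A), "inner dimensions must match"
            C = [[math.inf] * p for _ in range(n)]
            for i in range(n):
                for k in range(m):
                    a = A[i][k]
                    for j in range(p):
                        if a + B[k][j] < C[i][j]:
                            C[i][j] = a + B[k][j]
            return C

        # Toy instance: B is column-monotone (entries non-decreasing down each column).
        A = [[3, 1], [0, 5]]
        B = [[2, 4], [2, 7]]
        print(min_plus_product(A, B))  # [[3, 7], [2, 4]]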

    Fine-Grained Complexity of Analyzing Compressed Data: Quantifying Improvements over Decompress-And-Solve

    Can we analyze data without decompressing it? As our data keeps growing, understanding the time complexity of problems on compressed inputs, rather than in convenient uncompressed forms, becomes more and more relevant. Suppose we are given a compression of size $n$ of data that originally has size $N$, and we want to solve a problem with time complexity $T(\cdot)$. The naive strategy of "decompress-and-solve" gives time $T(N)$, whereas "the gold standard" is time $T(n)$: to analyze the compression as efficiently as if the original data was small. We restrict our attention to data in the form of a string (text, files, genomes, etc.) and study the most ubiquitous tasks. While the challenge might seem to depend heavily on the specific compression scheme, most methods of practical relevance (the Lempel-Ziv family, dictionary methods, and others) can be unified under the elegant notion of Grammar Compressions. A vast literature, across many disciplines, has established this as an influential notion for algorithm design. We introduce a framework for proving (conditional) lower bounds in this field, allowing us to assess whether decompress-and-solve can be improved, and by how much. Our main results are: - The $O(nN\sqrt{\log{N/n}})$ bound for LCS and the $O(\min\{N \log N, nM\})$ bound for Pattern Matching with Wildcards are optimal up to $N^{o(1)}$ factors, under the Strong Exponential Time Hypothesis. (Here, $M$ denotes the uncompressed length of the compressed pattern.) - Decompress-and-solve is essentially optimal for Context-Free Grammar Parsing and RNA Folding, under the $k$-Clique conjecture. - We give an algorithm showing that decompress-and-solve is not optimal for Disjointness. Comment: Presented at FOCS'17. Full version. 63 pages.
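
    To make the decompress-and-solve baseline concrete, the sketch below builds a tiny grammar compression (a straight-line program deriving one string of length $N$ from $n$ rules), expands it, and then runs a textbook quadratic LCS dynamic program on the decompressed text; the rule encoding and the example are illustrative assumptions, not the paper's framework.

        # A grammar compression (straight-line program, SLP): every nonterminal has
        # exactly one rule, either a single terminal or the concatenation of two
        # other nonterminals, and the grammar derives one string of length N using
        # only n rules (possibly N >> n). "Decompress-and-solve" expands the SLP to
        # the full string and then runs an off-the-shelf T(N)-time algorithm,
        # here a textbook O(N^2) LCS dynamic program.

        # Illustrative SLP: X1 -> a, X2 -> b, X3 -> X1 X2, X4 -> X3 X3, X5 -> X4 X4
        rules = {"X1": "a", "X2": "b", "X3": ("X1", "X2"),
                 "X4": ("X3", "X3"), "X5": ("X4", "X4")}

        def expand(symbol, rules):
            """Decompress: derive the full string generated by an SLP nonterminal."""
            rhs = rules[symbol]
            if isinstance(rhs, str):      # terminal rule
                return rhs
            left, right = rhs             # binary rule
            return expand(left, rules) + expand(right, rules)

        def lcs_length(s, t):
            """Textbook O(|s| * |t|) longest-common-subsequence DP."""
            dp = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
            for i, a in enumerate(s, 1):
                for j, b in enumerate(t, 1):
                    dp[i][j] = dp[i-1][j-1] + 1 if a == b else max(dp[i-1][j], dp[i][j-1])
            return dp[len(s)][len(t)]

        s = expand("X5", rules)           # decompress: "abababab" (N = 8, n = 5 rules)
        print(s, lcs_length(s, "baba"))   # abababab 4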