If the Current Clique Algorithms are Optimal, so is Valiant's Parser

Author: Abboud Amir
Backurs Arturs
Williams Virginia Vassilevska
Publication date: 05/11/2015

The CFG recognition problem is: given a context-free grammar $\mathcal{G}$ and a string $w$ of length $n$, decide if $w$ can be obtained from $\mathcal{G}$. This is the most basic parsing question and is a core computer science problem. Valiant's parser from 1975 solves the problem in $O(n^{\omega})$ time, where $\omega<2.373$ is the matrix multiplication exponent. Dozens of parsing algorithms have been proposed over the years, yet Valiant's upper bound remains unbeaten. The best combinatorial algorithms have mildly subcubic $O(n^3/\log^3{n})$ complexity. Lee (JACM'01) provided evidence that fast matrix multiplication is needed for CFG parsing, and that very efficient and practical algorithms might be hard or even impossible to obtain. Lee showed that any algorithm for a more general parsing problem with running time $O(|\mathcal{G}|\cdot n^{3-\varepsilon})$ can be converted into a surprising subcubic algorithm for Boolean Matrix Multiplication. Unfortunately, Lee's hardness result required that the grammar size be $|\mathcal{G}|=\Omega(n^6)$. Nothing was known for the more relevant case of constant size grammars. In this work, we prove that any improvement on Valiant's algorithm, even for constant size grammars, either in terms of runtime or by avoiding the inefficiencies of fast matrix multiplication, would imply a breakthrough algorithm for the $k$-Clique problem: given a graph on $n$ nodes, decide if there are $k$ that form a clique. Besides classifying the complexity of a fundamental problem, our reduction has led us to similar lower bounds for more modern and well-studied cubic time problems for which faster algorithms are highly desirable in practice: RNA Folding, a central problem in computational biology, and Dyck Language Edit Distance, answering an open question of Saha (FOCS'14).
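
To make the CFG recognition problem in the abstract above concrete, here is a minimal sketch of a cubic-time combinatorial recognizer. It is the classical CYK dynamic program, not Valiant's matrix-multiplication parser, and it assumes the grammar is already in Chomsky normal form. The function name `cyk_recognize`, the dictionary encoding of the rules, and the toy parenthesis grammar are all assumptions made for illustration.

```python
from itertools import product


def cyk_recognize(word, unary_rules, binary_rules, start="S"):
    """Return True if `word` is derivable from `start` in a CNF grammar.

    unary_rules:  dict terminal -> set of nonterminals A with A -> terminal
    binary_rules: dict (B, C)   -> set of nonterminals A with A -> B C
    (This encoding is an assumption of the sketch, not taken from the paper.)
    """
    n = len(word)
    if n == 0:
        return False  # the sketch ignores a possible S -> epsilon rule
    # table[i][j] holds the nonterminals that derive word[i : j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][i] = set(unary_rules.get(ch, ()))
    # O(n^2) substrings, O(n) split points each: cubic in n overall
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for split in range(i, j):
                for left, right in product(table[i][split], table[split + 1][j]):
                    table[i][j] |= binary_rules.get((left, right), set())
    return start in table[0][n - 1]


# Hypothetical toy grammar for nonempty balanced parentheses:
#   S -> S S | L R | L X ;  X -> S R ;  L -> "(" ;  R -> ")"
UNARY = {"(": {"L"}, ")": {"R"}}
BINARY = {
    ("S", "S"): {"S"},
    ("L", "R"): {"S"},
    ("L", "X"): {"S"},
    ("S", "R"): {"X"},
}

print(cyk_recognize("(()())", UNARY, BINARY))
print(cyk_recognize("(()", UNARY, BINARY))
```

The triple loop over substring length, start position, and split point is where the cubic dependence on $n$ comes from; the abstract's result concerns whether that dependence can be improved beyond $O(n^{\omega})$.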

arXiv.org e-Print Archive

Crossref

Improved bounds for testing Dyck languages

Author: Fischer Eldar
Magniez Frédéric
Starikovskaya Tatiana
Publication date: 20/07/2017

In this paper we consider the problem of deciding membership in Dyck languages, a fundamental family of context-free languages, comprised of well-balanced strings of parentheses. In this problem we are given a string of length $n$ in the alphabet of parentheses of $m$ types and must decide if it is well-balanced. We consider this problem in the property testing setting, where one would like to make the decision while querying as few characters of the input as possible. Property testing of strings for Dyck language membership for $m=1$, with a number of queries independent of the input size $n$, was provided in [Alon, Krivelevich, Newman and Szegedy, SICOMP 2001]. Property testing of strings for Dyck language membership for $m \ge 2$ was first investigated in [Parnas, Ron and Rubinfeld, RSA 2003]. They showed an upper bound and a lower bound for distinguishing strings belonging to the language from strings that are far (in terms of the Hamming distance) from the language, which are respectively (up to polylogarithmic factors) the $2/3$ power and the $1/11$ power of the input size $n$. Here we improve the power of $n$ in both bounds. For the upper bound, we introduce a recursion technique, that together with a refinement of the methods in the original work provides a test for any power of $n$ larger than $2/5$. For the lower bound, we introduce a new problem called Truestring Equivalence, which is easily reducible to the $2$-type Dyck language property testing problem. For this new problem, we show a lower bound of $n$ to the power of $1/5$.
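
For orientation, the exact (non-testing) version of Dyck membership described above can be decided with a single stack pass that reads every character. The sketch below shows that baseline only; it is not the sublinear-query property tester from the paper. The function name `is_dyck` and the default two-type alphabet are assumptions made for illustration.

```python
def is_dyck(s, pairs=None):
    """Exact Dyck membership check that reads all n characters.

    `pairs` maps each opening symbol to its closing symbol; the default
    alphabet with m = 2 types is a hypothetical choice for this sketch.
    """
    if pairs is None:
        pairs = {"(": ")", "[": "]"}
    closers = set(pairs.values())
    stack = []  # closing symbols we still expect, innermost last
    for ch in s:
        if ch in pairs:
            stack.append(pairs[ch])
        elif ch in closers:
            # a closer must match the most recently opened parenthesis
            if not stack or stack.pop() != ch:
                return False
        else:
            raise ValueError(f"symbol {ch!r} is not in the parenthesis alphabet")
    return not stack  # every opened parenthesis must have been closed


print(is_dyck("([])[]"))
print(is_dyck("([)]"))
```

A property tester differs in that it may inspect only a small sample of positions and only has to separate members of the language from strings that are far from it in Hamming distance.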

arXiv.org e-Print Archive

Hal-Diderot

Edit Distance for Pushdown Automata

Author: Chatterjee Krishnendu
Henzinger Thomas A.
Ibsen-Jensen Rasmus
Otop Jan
Publication date: 01/01/2017

The edit distance between two words $w_1, w_2$ is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform $w_1$ to $w_2$. The edit distance generalizes to languages $\mathcal{L}_1, \mathcal{L}_2$, where the edit distance from $\mathcal{L}_1$ to $\mathcal{L}_2$ is the minimal number $k$ such that for every word from $\mathcal{L}_1$ there exists a word in $\mathcal{L}_2$ with edit distance at most $k$. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for the following problems: (1) deciding whether, for a given threshold $k$, the edit distance from a pushdown automaton to a finite automaton is at most $k$, and (2) deciding whether the edit distance from a pushdown automaton to a finite automaton is finite.

Comment: An extended version of a paper accepted to ICALP 2015 with the same title. The paper has been accepted to the LMCS journal.
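
The word-level definition that opens this abstract (insertions, deletions, and substitutions between two words) is the standard Levenshtein distance. The sketch below computes it with the usual quadratic dynamic program; it covers only the two-word case, not the language-to-language problem studied in the paper. The function name `edit_distance` is an assumption made for illustration.

```python
def edit_distance(w1, w2):
    """Minimal number of letter insertions, deletions, and substitutions
    needed to transform w1 into w2 (row-by-row dynamic program)."""
    # prev[j] = distance between the current prefix of w1 and w2[:j];
    # initially the prefix of w1 is empty, so j insertions are needed
    prev = list(range(len(w2) + 1))
    for i, a in enumerate(w1, start=1):
        curr = [i]  # deleting all i letters of w1[:i] yields the empty word
        for j, b in enumerate(w2, start=1):
            curr.append(
                min(
                    prev[j] + 1,             # delete a
                    curr[j - 1] + 1,         # insert b
                    prev[j - 1] + (a != b),  # substitute a by b, or match
                )
            )
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Keeping only two rows of the table gives memory linear in the length of the second word while the running time stays proportional to the product of the two lengths.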

arXiv.org e-Print Archive

Episciences.org

Directory of Open Access Journals

IST PubRep

IST Austria: PubRep (Institute of Science and Technology)

Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!

Author: Jayaram Rajesh
Saha Barna
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)
Publication date: 01/01/2017

In 1975, a breakthrough result of L. Valiant showed that parsing context free grammars can be reduced to Boolean matrix multiplication, resulting in a running time of $O(n^{\omega})$ for parsing where $\omega \le 2.373$ is the exponent of fast matrix multiplication, and $n$ is the string length. Recently, Abboud, Backurs and V. Williams (FOCS 2015) demonstrated that this is likely optimal; moreover, a combinatorial $o(n^3)$ algorithm is unlikely to exist for the general parsing problem. The language edit distance problem is a significant generalization of the parsing problem, which computes the minimum edit distance of a given string (using insertions, deletions, and substitutions) to any valid string in the language, and has received significant attention both in theory and practice since the seminal work of Aho and Peterson in 1972. Clearly, the lower bound for parsing rules out any algorithm running in $o(n^{\omega})$ time that can return a nontrivial multiplicative approximation of the language edit distance problem. Furthermore, combinatorial algorithms with cubic running time or algorithms that use fast matrix multiplication are often not desirable in practice. To break this $n^{\omega}$ hardness barrier, in this paper we study additive approximation algorithms for language edit distance. We provide two explicit combinatorial algorithms to obtain a string with minimum edit distance with performance dependencies on either the number of non-linear productions, $k^*$, or the number of nested non-linear productions, $k$, used in the optimal derivation. Explicitly, we give an additive $O(k^* \gamma)$ approximation in time $O(|G|(n^2 + (n/\gamma)^3))$ and an additive $O(k \gamma)$ approximation in time $O(|G|(n^2 + (n^3/\gamma^2)))$, where $|G|$ is the grammar size and $n$ is the string length. In particular, we obtain tight approximations for an important subclass of context free grammars known as ultralinear grammars, for which $k$ and $k^*$ are naturally bounded.
Interestingly, we show that the same conditional lower bound for parsing context free grammars holds for the class of ultralinear grammars as well, clearly marking the boundary where parsing becomes hard.

Dagstuhl Research Online Publication Server

IST Austria Technical Report

Author: Chatterjee Krishnendu
Henzinger Thomas A
Ibsen-Jensen Rasmus
Otop Jan
Publication venue: IST Austria
Publication date: 01/01/2015

The edit distance between two words $w_1, w_2$ is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform $w_1$ to $w_2$. The edit distance generalizes to languages $L_1, L_2$, where the edit distance is the minimal number $k$ such that for every word from $L_1$ there exists a word in $L_2$ with edit distance at most $k$. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for deciding whether, for a given threshold $k$, the edit distance from a pushdown automaton to a finite automaton is at most $k$.

IST Austria: PubRep (Institute of Science and Technology)

A Framework of Quantum Strong Exponential-Time Hypotheses

Author: Buhrman Harry
Patro Subhasree
Speelman Florian
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

International Migration, Integration and Social Cohesion online publications

UvA-DARE