    Tabular Parsing

    This is a tutorial on tabular parsing, based on the tabulation of nondeterministic push-down automata. It discusses Earley's algorithm, the Cocke-Kasami-Younger algorithm, tabular LR parsing, the construction of parse trees, and further issues. Comment: 21 pages, 14 figures.
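
    As a rough illustration of the tabular idea, the following is a minimal sketch of a Cocke-Kasami-Younger recognizer. It is not taken from the tutorial: the function name `cyk_recognize`, the dictionary encoding of the rules, and the toy grammar are assumptions made for this example, and the grammar is assumed to be in Chomsky normal form.

```python
from itertools import product


def cyk_recognize(words, unary, binary, start="S"):
    """Return True if `words` is derivable from `start`.

    unary:  dict mapping a terminal to the set of nonterminals A with A -> terminal
    binary: dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    (Hypothetical encoding chosen for this sketch.)
    """
    n = len(words)
    if n == 0:
        return False  # a CNF grammar without an epsilon rule derives no empty string
    # table[i][j] holds every nonterminal that derives words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(unary.get(word, ()))
    # Fill the table for increasingly wide spans, reusing the narrower entries.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= binary.get((left, right), set())
    return start in table[0][n]


# Toy grammar for a^n b^n (n >= 1): S -> A B | A X, X -> S B, A -> a, B -> b
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}

print(cyk_recognize(list("aabb"), UNARY, BINARY))
print(cyk_recognize(list("aab"), UNARY, BINARY))
```

    Each table entry is computed once and then reused by all wider spans, which is what keeps the running time cubic in the input length for a fixed grammar.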

    Lexical Disambiguation in LTAG using Left Context

    In this paper, we present an automaton-based lexical disambiguation process for Lexicalized Tree-Adjoining Grammar (LTAG). This process builds on previous work of Bonfante et al. (2004), and extends it by computing a polarity-based abstraction, which contains information about left context. This extension allows for a faster lexical disambiguation by reducing the filtering automaton.

    In this paper, we present an optimization for parsing with Lexicalized Tree-Adjoining Grammar which takes inspiration from previous work on polarity-based grammar abstraction (Bonfante et al., 2004). We illustrate the impact of this optimization on two benchmarks and relate our approach to the more general optimization framework proposed for Interaction Grammars by Bonfante et al. (2009) and Morey et al. (2011).

    Exact Recursive Probabilistic Programming

    Recursive calls over recursive data are widely useful for generating probability distributions, and probabilistic programming allows computations over these distributions to be expressed in a modular and intuitive way. Exact inference is also useful, but unfortunately, existing probabilistic programming languages do not perform exact inference on recursive calls over recursive data, forcing programmers to code many applications manually. We introduce a probabilistic language in which a wide variety of recursion can be expressed naturally, and inference carried out exactly. For instance, probabilistic pushdown automata and their generalizations are easy to express, and polynomial-time parsing algorithms for them are derived automatically. We eliminate recursive data types using program transformations related to defunctionalization and refunctionalization. These transformations are assured correct by a linear type system, and a successful choice of transformations, if there is one, is guaranteed to be found by a greedy algorithm.

    Parsing Directed Acyclic Graphs with Range Concatenation Grammars

    Range Concatenation Grammars (RCGs) are a syntactic formalism which possesses many attractive properties. It is more powerful than Linear Context-Free Rewriting Systems, though this power does not come at the expense of efficiency, since its sentences can always be parsed in polynomial time. If the input, instead of a string, is a Directed Acyclic Graph (DAG), only simple RCGs can still be parsed in polynomial time. For non-linear RCGs, this polynomial parsing time can no longer be guaranteed. In this paper, we show how the standard parsing algorithm can be adapted for parsing DAGs with RCGs, both in the linear (simple) and in the non-linear case.

    A finite state intersection approach to propositional satisfiability

    We use a finite state automaton (FSA) construction approach to address the problem of propositional satisfiability (SAT). We present a very simple translation from formulas in conjunctive normal form (CNF) to regular expressions and use regular expressions to construct an FSA. As a consequence of the FSA construction, we obtain an ALL-SAT solver and model counter. This automata construction can be considered essentially a finite state intersection grammar (FSIG). We also show how an FSIG approach can be encoded. Several variable ordering (state ordering) heuristics are compared in terms of the running time of the FSA and FSIG construction. We also present a strategy for clause ordering (automata composition). Running times of state-of-the-art model counters and BDD-based SAT solvers are compared, and we show that both the FSA and FSIG approaches obtain state-of-the-art performance on some hard unsatisfiable benchmarks. It is also shown that clause learning techniques can help improve performance. This work raises many questions on the possible use of automata and grammar models to address SAT.
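
    The link between automata intersection and model counting can be sketched as follows. This is only an illustration under stated assumptions, not the regular-expression translation described in the abstract: each clause is viewed as a small automaton that reads one truth value per variable (in a fixed order) and remembers whether the clause is already satisfied, and the product of these automata is explored layer by layer while counting paths. The name `count_models` and the DIMACS-style integer literals are choices made for this example.

```python
def count_models(num_vars, clauses):
    """Count satisfying assignments of a CNF formula.

    clauses: list of lists of nonzero ints; literal v means variable v is true,
    -v means variable v is false (DIMACS-style, assumed for this sketch).
    """
    all_satisfied = frozenset(range(len(clauses)))
    # A product state is the set of clause indices already satisfied.
    # `layer` maps each reachable product state to the number of
    # partial assignments (input prefixes) that lead to it.
    layer = {frozenset(): 1}
    for var in range(1, num_vars + 1):
        next_layer = {}
        for state, count in layer.items():
            for value in (False, True):
                literal = var if value else -var
                newly_satisfied = {i for i, c in enumerate(clauses) if literal in c}
                new_state = state | newly_satisfied
                next_layer[new_state] = next_layer.get(new_state, 0) + count
        layer = next_layer
    # Accepting states of the intersection: every clause automaton accepts.
    return layer.get(all_satisfied, 0)


# (x1 or x2) and (not x1 or x3) over three variables
print(count_models(3, [[1, 2], [-1, 3]]))
```

    The number of product states can grow exponentially with the number of clauses, which is one reason variable and clause ordering matter in this kind of construction.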

    Subcubic certificates for CFL reachability

    Many problems in interprocedural program analysis can be modeled as the context-free language (CFL) reachability problem on graphs and can be solved in cubic time. Despite years of effort, there are no known truly subcubic algorithms for this problem. We study the related certification task: given an instance of CFL reachability, are there small and efficiently checkable certificates for the existence and for the non-existence of a path? We show that, in both scenarios, there exist succinct certificates (O(n^2) in the size of the problem) and these certificates can be checked in subcubic (matrix multiplication) time. The certificates are based on grammar-based compression of paths (for reachability) and on invariants represented as matrix inequalities (for non-reachability). Thus, CFL reachability lies in nondeterministic and co-nondeterministic subcubic time. A natural question is whether faster algorithms for CFL reachability will lead to faster algorithms for combinatorial problems such as Boolean satisfiability (SAT). As a consequence of our certification results, we show that there cannot be a fine-grained reduction from SAT to CFL reachability for a conditional lower bound stronger than n^ω, unless the nondeterministic strong exponential time hypothesis (NSETH) fails. In a nutshell, reductions from SAT are unlikely to explain the cubic bottleneck for CFL reachability. Our results extend to related subcubic-equivalent problems: pushdown reachability and 2NPDA recognition, as well as all-pairs CFL reachability. For example, we describe succinct certificates for pushdown non-reachability (inductive invariants) and observe that they can be checked in matrix multiplication time. We also extract a new hardest 2NPDA language, capturing the “hard core” of all these problems.
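
    For orientation, the cubic-time baseline mentioned above can be sketched with a standard worklist algorithm. This is the textbook approach, not the certificate construction of the paper; the function name `cfl_reach`, the rule encoding, and the balanced-parentheses example are assumptions made for this sketch, and the grammar is assumed to have only rules of the form A -> X or A -> X Y.

```python
from collections import deque


def cfl_reach(edges, unary, binary):
    """Return all facts (u, X, v): some path from u to v spells a word derived from X.

    edges:  iterable of (u, label, v) graph edges
    unary:  dict mapping X to the set of A with rule A -> X
    binary: dict mapping (X, Y) to the set of A with rule A -> X Y
    (Hypothetical encoding; terminals and nonterminals are treated uniformly.)
    """
    facts = set()
    outgoing = {}  # u -> set of (X, v)
    incoming = {}  # v -> set of (X, u)
    worklist = deque()

    def add(u, symbol, v):
        if (u, symbol, v) not in facts:
            facts.add((u, symbol, v))
            outgoing.setdefault(u, set()).add((symbol, v))
            incoming.setdefault(v, set()).add((symbol, u))
            worklist.append((u, symbol, v))

    for u, label, v in edges:
        add(u, label, v)

    while worklist:
        u, x, v = worklist.popleft()
        for a in unary.get(x, ()):
            add(u, a, v)
        # Combine with a fact that starts where this one ends: A -> x y.
        for y, w in list(outgoing.get(v, ())):
            for a in binary.get((x, y), ()):
                add(u, a, w)
        # Combine with a fact that ends where this one starts: A -> y x.
        for y, t in list(incoming.get(u, ())):
            for a in binary.get((y, x), ()):
                add(t, a, v)
    return facts


# Balanced parentheses: S -> ( ) | ( T | S S, T -> S )
RULES = {("(", ")"): {"S"}, ("(", "T"): {"S"}, ("S", ")"): {"T"}, ("S", "S"): {"S"}}
GRAPH = [(0, "(", 1), (1, "(", 2), (2, ")", 3), (3, ")", 4)]
FACTS = cfl_reach(GRAPH, {}, RULES)
print((0, "S", 4) in FACTS)
```

    The sets are copied with `list(...)` before iteration because `add` may extend the very adjacency set being scanned when a fact forms a self-loop.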