Model Checking Parse Trees
Parse trees are fundamental syntactic structures in both computational
linguistics and compiler construction. We argue in this paper that, in both
fields, there are good incentives for model-checking sets of parse trees for
some word according to a context-free grammar. We put forward the adequacy of
propositional dynamic logic (PDL) on trees in these applications, and study as
a sanity check the complexity of the corresponding model-checking problem:
although complete for exponential time in the general case, we find natural
restrictions on grammars for our applications and establish complexities
ranging from nondeterministic polynomial time to polynomial space in the
relevant cases. Comment: 21 + x page
Grasp: Randomised Semiring Parsing
We present a suite of algorithms for inference tasks over (finite and infinite) context-free sets. For generality and clarity, we have chosen the framework of semiring parsing, with support for the most common semirings (e.g. Forest, Viterbi, k-best and Inside). We see parsing from the more general viewpoint of weighted deduction, allowing for arbitrary weighted finite-state input, and provide implementations of both bottom-up (CKY-inspired) and top-down (Earley-inspired) algorithms. We focus on approximate inference by Monte Carlo methods and provide implementations of ancestral sampling and slice sampling. In principle, sampling methods can deal with models whose independence assumptions are weaker than what is feasible by standard dynamic programming. We envision applications such as monolingual constituency parsing, synchronous parsing, context-free models of reordering for machine translation, and machine translation decoding.
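To make the idea of semiring parsing concrete, here is a minimal sketch of a bottom-up (CKY-style) recognizer parameterised by a semiring. It is not Grasp's actual API: the names `Semiring`, `INSIDE`, `VITERBI` and `cky`, the toy grammar, and the restriction to Chomsky normal form with plain string input (rather than weighted finite-state input) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the operations a semiring parser needs."""
    plus: Callable[[float, float], float]   # combines alternative derivations
    times: Callable[[float, float], float]  # combines parts of one derivation
    zero: float                             # identity for plus
    one: float                              # identity for times


# Inside semiring: sums the weights of all derivations.
INSIDE = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
# Viterbi semiring: keeps the weight of the best derivation.
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def cky(words, lexical, binary, root, sr):
    """Semiring value of `root` spanning all of `words`.

    lexical: iterable of (lhs, word, weight) rules, lhs -> word
    binary:  iterable of (lhs, left, right, weight) rules, lhs -> left right
    The grammar is assumed to be in Chomsky normal form.
    """
    n = len(words)
    chart = {}  # (start, end, symbol) -> semiring value

    def get(i, k, sym):
        return chart.get((i, k, sym), sr.zero)

    # Width-1 spans come from lexical rules.
    for i, w in enumerate(words):
        for lhs, word, weight in lexical:
            if word == w:
                chart[i, i + 1, lhs] = sr.plus(get(i, i + 1, lhs), weight)

    # Wider spans combine two adjacent, already completed sub-spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for lhs, left, right, weight in binary:
                    value = sr.times(
                        weight, sr.times(get(i, j, left), get(j, k, right))
                    )
                    chart[i, k, lhs] = sr.plus(get(i, k, lhs), value)

    return get(0, n, root)


# Toy ambiguous grammar (an assumption): S -> S S [0.5], S -> 'a' [0.5].
LEXICAL = [("S", "a", 0.5)]
BINARY = [("S", "S", "S", 0.5)]
SENTENCE = ["a", "a", "a"]

inside_value = cky(SENTENCE, LEXICAL, BINARY, "S", INSIDE)
viterbi_value = cky(SENTENCE, LEXICAL, BINARY, "S", VITERBI)
```

The string "a a a" has two parse trees under this toy grammar, each of weight 0.5^5, so `inside_value` is 0.0625 and `viterbi_value` is 0.03125. Only the semiring argument changes between the two calls, which is the point of the framework: the deduction procedure stays fixed while the semiring decides what is computed.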
Promoting multiword expressions in A* TAG parsing
Multiword expressions (MWEs) are pervasive in natural languages and often have both idiomatic and compositional readings, which leads to high syntactic ambiguity. We show that for some MWE types idiomatic readings are usually the correct ones. We propose a heuristic for an A* parser for Tree Adjoining Grammars which benefits from this knowledge by promoting MWE-oriented analyses. This strategy leads to a substantial reduction in the parsing search space in case of true positive MWE occurrences, while avoiding parsing failures in case of false positives.