Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization
We consider the search for a maximum likelihood assignment of hidden derivations and grammar weights for a probabilistic context-free grammar, the problem approximately solved by "Viterbi training." We show that solving, and even approximating, Viterbi training for PCFGs is NP-hard. We motivate the use of uniform-at-random initialization for Viterbi EM as an optimal initializer in the absence of further information about the correct model parameters, providing an approximate bound on the log-likelihood.
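As a toy illustration of the procedure the abstract refers to (not the paper's construction), Viterbi training can be sketched as hard EM: parse each sentence with CKY under the current weights, keep only the single best derivation, and re-estimate rule probabilities from its rule counts. The one-nonterminal CNF grammar and the corpus below are invented for the example; initialization is uniform over the rules of each left-hand side.

```python
import math
from collections import defaultdict

# Illustrative toy CNF grammar with a single nonterminal S.
RULES = [("S", ("S", "S")), ("S", ("a",)), ("S", ("b",))]

def viterbi_parse(words, logp):
    """CKY search for the single best derivation; returns its rule counts."""
    n = len(words)
    best, back = {}, {}
    for i, w in enumerate(words):
        for lhs, rhs in RULES:
            if rhs == (w,):
                best[(i, i + 1, lhs)] = logp[(lhs, rhs)]
                back[(i, i + 1, lhs)] = (lhs, rhs, None)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for lhs, rhs in RULES:
                if len(rhs) != 2:
                    continue
                for k in range(i + 1, j):
                    left, right = (i, k, rhs[0]), (k, j, rhs[1])
                    if left in best and right in best:
                        score = logp[(lhs, rhs)] + best[left] + best[right]
                        if score > best.get((i, j, lhs), -math.inf):
                            best[(i, j, lhs)] = score
                            back[(i, j, lhs)] = (lhs, rhs, k)
    counts = defaultdict(int)
    def walk(i, j, a):
        lhs, rhs, k = back[(i, j, a)]
        counts[(lhs, rhs)] += 1
        if k is not None:
            walk(i, k, rhs[0])
            walk(k, j, rhs[1])
    if (0, n, "S") in best:
        walk(0, n, "S")
    return counts

def viterbi_train(corpus, iterations=5):
    # Uniform initialization: each of the 3 rules of S gets probability 1/3.
    logp = {rule: math.log(1.0 / len(RULES)) for rule in RULES}
    for _ in range(iterations):
        totals = defaultdict(int)
        for sent in corpus:
            for rule, c in viterbi_parse(sent, logp).items():
                totals[rule] += c
        # With a single nonterminal, normalizing over all rules is per-LHS
        # normalization; tiny smoothing avoids log(0) for unused rules.
        z = sum(totals.values())
        if z:
            for rule in RULES:
                logp[rule] = math.log((totals[rule] + 1e-9) / (z + 1e-9 * len(RULES)))
    return logp

weights = viterbi_train([["a", "b"], ["a", "a"], ["b", "a", "b"]])
```

The hardness result above applies to this search problem in general; the sketch only shows the local-improvement loop that Viterbi EM performs.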
Principles and Implementation of Deductive Parsing
We present a system for generating parsers based directly on the metaphor of
parsing as deduction. Parsing algorithms can be represented directly as
deduction systems, and a single deduction engine can interpret such deduction
systems so as to implement the corresponding parser. The method generalizes
easily to parsers for augmented phrase structure formalisms, such as
definite-clause grammars and other logic grammar formalisms, and has been used
for rapid prototyping of parsing algorithms for a variety of formalisms
including variants of tree-adjoining grammars, categorial grammars, and
lexicalized context-free grammars.
Comment: 69 pages, includes full Prolog code.
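The paper's engine is written in Prolog; a minimal agenda-based sketch of the same idea in Python follows. The generic `deduce` engine knows nothing about parsing: it closes a chart of items under inference rules. The CKY deduction system (items `(A, i, j)`, a completion rule, goal item `(S, 0, n)`) is one instantiation; the grammar and words are illustrative inventions, not from the paper.

```python
# A generic forward-chaining deduction engine: items are hashable facts,
# and each inference rule maps (newly derived item, current chart) to a
# list of consequent items.
def deduce(axioms, rules, goal):
    chart = set(axioms)
    agenda = list(axioms)
    while agenda:
        item = agenda.pop()
        for rule in rules:
            for new in rule(item, chart):
                if new not in chart:
                    chart.add(new)
                    agenda.append(new)
    return goal in chart

# One deduction system the engine can interpret: CKY for a toy CNF grammar.
GRAMMAR = {("S", ("NP", "VP")), ("NP", ("fish",)), ("VP", ("swim",))}

def cky_system(words):
    # Axioms: lexical items (A, i, i+1) for each word.
    axioms = [(lhs, i, i + 1)
              for i, w in enumerate(words)
              for lhs, rhs in GRAMMAR if rhs == (w,)]
    def complete(item, chart):
        a, i, j = item
        out = []
        for lhs, rhs in GRAMMAR:
            if len(rhs) != 2:
                continue
            b, c = rhs
            if a == b:  # item is the left child of a completed constituent
                out += [(lhs, i, k) for (x, m, k) in chart if x == c and m == j]
            if a == c:  # item is the right child
                out += [(lhs, h, j) for (x, h, m) in chart if x == b and m == i]
        return out
    return axioms, [complete], ("S", 0, len(words))

axioms, rules, goal = cky_system(["fish", "swim"])
recognized = deduce(axioms, rules, goal)
```

Swapping in a different item shape and rule set (e.g. dotted items for Earley's algorithm) changes the parser without touching the engine, which is the separation the abstract describes.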
The automated inference of tree systems
Tree systems are used in syntactic pattern recognition for
describing two-dimensional patterns. We extend results on tree
automata with the introduction of the subtree-invariant equivalence
relation R. R relates two trees when the appearance of one implies the
appearance of the other in similar trees. A new state minimizing
algorithm for tree automata is formed using R. We also determine a
bound for Brainerd's minimization method.
We introduce the Group Unordered Tree Automaton (GUTA) which
accepts all orientations of open-line patterns described using directed
arc primitives. The specification of a GUTA includes an unordered
tree automaton M, which only accepts a standard orientation of a given
class of open-line pictures, and a transformation group, which describes
how the primitives transform under rotational shifts. The GUTA
performs all orientational parses in parallel, reports all successful
transformations and operates in the same time complexity as M. The
GUTA is much easier to specify than the equivalent non-decomposed
unordered tree automaton.
The problem of automating the design of unordered and ordered
tree automata (grammars) is studied both on a system directed and on a
highly interactive level. The system directed method uses Pao's lattice
technique to infer tree automata (grammars) from structurally
complete samples. It is shown that the method can infer any context-free
grammar when provided with skeletal structure descriptions. This
extends the results of Pao which only deal with proper subclasses of
context-free grammars.
The highly interactive inference system is based on the use of
tree derivatives, also introduced in this thesis, for determining
automaton states and possible state merging. Tree derivatives are
sets of tree forms derived by replacing selected subtrees with marked
nodes. The derivative sets are used to determine subtree-invariant
equivalence relations which characterize tree automata. A minimization
algorithm based on tree derivatives is given. We use tree derivatives
to prove that a tree automaton with n states can be fully
characterized by the set of trees of depth at most 2n that it accepts.
The inference method compares tree derivative sets and infers
subtree-invariant equivalence relations. A relation is inferred if
there is sufficient overlap between the derivative sets. Our method
was compared to other tree automata inference schemes, including
Crespi-Reghizzi's algorithm. We have shown that our method is applicable
to the entire class of context-free grammars and requires a
smaller sample than Crespi-Reghizzi's algorithm which can only infer a
proper subclass of operator precedence grammars. Furthermore, it
appears more general than the other inference systems for tree automata
or grammars.
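The thesis's tree derivatives admit richer forms (selected subtree sets, depth parameters) than can be shown briefly; the following sketch illustrates only the simplest variant, in which a derivative set collects every tree form obtained by replacing one subtree with a marked node. Trees are written as nested tuples `(label, child, ...)`; the marker symbol is an arbitrary choice.

```python
MARK = "$"  # marked node standing in for a deleted subtree

def positions(tree, path=()):
    """Yield every subtree position (as a path of child indices) in a tree
    written as nested tuples (label, child, ...)."""
    yield path
    for i, child in enumerate(tree[1:], start=1):
        yield from positions(child, path + (i,))

def replace(tree, path):
    """Return the tree with the subtree at `path` replaced by the marker."""
    if not path:
        return MARK
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:]),) + tree[i + 1:]

def derivative_set(tree):
    """All tree forms obtained by replacing exactly one subtree with MARK."""
    return {replace(tree, p) for p in positions(tree)}

t = ("f", ("a",), ("b",))
d = derivative_set(t)
```

Comparing such derivative sets for overlap is the flavor of test the interactive inference method applies when deciding whether two states (subtrees) should be merged under a subtree-invariant equivalence relation.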
An Alternative Conception of Tree-Adjoining Derivation
The precise formulation of derivation for tree-adjoining grammars has
important ramifications for a wide variety of uses of the formalism, from
syntactic analysis to semantic interpretation and statistical language
modeling. We argue that the definition of tree-adjoining derivation must be
reformulated in order to manifest the proper linguistic dependencies in
derivations. The particular proposal is both precisely characterizable through
a definition of TAG derivations as equivalence classes of ordered derivation
trees, and computationally operational, by virtue of a compilation to linear
indexed grammars together with an efficient algorithm for recognition and
parsing according to the compiled grammar.
Comment: 33 pages.
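The proposal's formal definition is richer than can be reproduced here, but the core idea of treating derivations as equivalence classes of ordered derivation trees can be illustrated with a hypothetical encoding: a derivation-tree node as `(elementary_tree, [(address, child), ...])`, with two ordered trees equated whenever they differ only in the order in which operations at distinct addresses of the same node are listed. The canonicalization below (sorting operations per node by address) is an invented stand-in for that quotient, not the paper's construction.

```python
def canonical(node):
    """A derivation-tree node is (elementary_tree_name, [(address, child), ...]),
    each child recording an operation (adjunction or substitution) at that
    address. Canonical form sorts the operations at every node by address, so
    two ordered derivation trees compare equal iff they are in the same
    equivalence class under reordering of sibling operations."""
    name, ops = node
    return (name, tuple(sorted((addr, canonical(child)) for addr, child in ops)))

# Two orderings of the same operations on (hypothetical) tree "alpha":
d1 = ("alpha", [("0", ("beta", [])), ("1", ("gamma", []))])
d2 = ("alpha", [("1", ("gamma", [])), ("0", ("beta", []))])
same = canonical(d1) == canonical(d2)
```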
Grammar induction for mildly context sensitive languages using variational Bayesian inference
The following technical report presents a formal approach to probabilistic
minimalist grammar induction. We describe a formalization of a minimalist
grammar. Based on this grammar, we define a generative model for minimalist
derivations. We then present a generalized algorithm for the application of
variational Bayesian inference to lexicalized mildly context sensitive language
grammars which in this paper is applied to the previously defined minimalist
grammar.
Weakly Restricted Stochastic Grammars
A new type of stochastic grammar is introduced for investigation: weakly restricted stochastic grammars. In this paper we concentrate on the consistency problem. To find conditions under which stochastic grammars are consistent, the theory of multitype Galton-Watson branching processes and generating functions is of central importance.
The unrestricted stochastic grammar formalism generates the same class of languages as the weakly restricted formalism. The inside-outside algorithm is adapted for use with weakly restricted grammars.
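The branching-process connection mentioned in the abstract can be made concrete with a standard single-type example (not from the paper): for the toy stochastic grammar S → S S with probability p and S → 'a' with probability 1-p, a derivation is a Galton-Watson process whose offspring generating function is f(q) = (1-p) + p·q², and the probability that a derivation terminates is the smallest fixed point of f. The grammar is consistent (total probability of finite derivations equals 1) iff the mean offspring m = 2p is at most 1.

```python
def finite_derivation_prob(p, iters=5000):
    """Smallest fixed point of q = (1-p) + p*q**2, computed by iteration from
    q = 0: the extinction probability of the Galton-Watson process, i.e. the
    probability that a derivation from S is finite."""
    q = 0.0
    for _ in range(iters):
        q = (1 - p) + p * q * q
    return q

def consistent(p):
    """Consistency criterion via the mean offspring count: each S rewrites
    to 2 copies of S with probability p, so m = 2p, and the grammar is
    consistent iff m <= 1."""
    return 2 * p <= 1

# Subcritical case (p = 0.4): every derivation terminates almost surely.
# Supercritical case (p = 0.6): finite-derivation probability is (1-p)/p = 2/3.
q_sub = finite_derivation_prob(0.4)
q_super = finite_derivation_prob(0.6)
```

The multitype case the paper relies on replaces the scalar m by the first-moment matrix of the grammar, with consistency governed by its spectral radius; the scalar example above is the one-nonterminal specialization.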