39 research outputs found
Parsing Inside-Out
The inside-outside probabilities are typically used for reestimating
Probabilistic Context Free Grammars (PCFGs), just as the forward-backward
probabilities are typically used for reestimating HMMs. I show several novel
uses, including improving parser accuracy by matching parsing algorithms to
evaluation criteria; speeding up DOP parsing by 500 times; and 30 times faster
PCFG thresholding at a given accuracy level. I also give an elegant,
state-of-the-art grammar formalism, which can be used to compute inside-outside
probabilities; and a parser description formalism, which makes it easy to
derive inside-outside formulas and many others.Comment: Ph.D. Thesis, 257 pages, 40 postscript figure
An improved parser for data-oriented lexical-functional analysis
We present an LFG-DOP parser which uses fragments from LFG-annotated
sentences to parse new sentences. Experiments with the Verbmobil and Homecentre
corpora show that (1) Viterbi n best search performs about 100 times faster
than Monte Carlo search while both achieve the same accuracy; (2) the DOP
hypothesis which states that parse accuracy increases with increasing fragment
size is confirmed for LFG-DOP; (3) LFG-DOP's relative frequency estimator
performs worse than a discounted frequency estimator; and (4) LFG-DOP
significantly outperforms Tree-DOP is evaluated on tree structures only.Comment: 8 page
A Re-ranking Model for Dependency Parser with Recursive Convolutional Neural Network
In this work, we address the problem to model all the nodes (words or
phrases) in a dependency tree with the dense representations. We propose a
recursive convolutional neural network (RCNN) architecture to capture syntactic
and compositional-semantic representations of phrases and words in a dependency
tree. Different with the original recursive neural network, we introduce the
convolution and pooling layers, which can model a variety of compositions by
the feature maps and choose the most informative compositions by the pooling
layers. Based on RCNN, we use a discriminative model to re-rank a -best list
of candidate dependency parsing trees. The experiments show that RCNN is very
effective to improve the state-of-the-art dependency parsing on both English
and Chinese datasets
GF-DOP: grammatical feature data-oriented parsing
This paper proposes an extension of Tree-DOP which approximates the LFG-DOP model. GF-DOP combines the robustness of the DOP model with some of the linguistic competence of LFG. LFG c-structure trees are augmented with LFG functional information, with the aim of (i) generating
more informative parses than Tree-DOP; (ii) improving overall parse ranking by modelling grammatical features; and (iii) avoiding the inconsistent probability models of LFG-DOP. In a number of experiments on the HomeCentre corpus, we report on which (groups of) features most heavily influence parse quality, both positively and negatively