Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set
We first present a minimal feature set for transition-based dependency
parsing, continuing a recent trend started by Kiperwasser and Goldberg (2016a)
and Cross and Huang (2016a) of using bi-directional LSTM features. We plug our
minimal feature set into the dynamic-programming framework of Huang and Sagae
(2010) and Kuhlmann et al. (2011) to produce the first implementation of
worst-case O(n^3) exact decoders for arc-hybrid and arc-eager transition
systems. With our minimal features, we also present O(n^3) global training
methods. Finally, using ensembles including our new parsers, we achieve the
best unlabeled attachment score reported (to our knowledge) on the Chinese
Treebank and the "second-best-in-class" result on the English Penn Treebank.
Comment: Proceedings of EMNLP 2017. 12 pages.
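To make the setup concrete, below is a minimal Python sketch of the arc-hybrid transition system together with a feature function that reads only a few stack/buffer positions, in the spirit of the minimal feature sets the abstract describes. The choice of positions (s1, s0, b0), the random stand-in vectors, and all names are illustrative assumptions, not the authors' implementation; the exact O(n^3) decoder and global training are not reproduced here.

```python
# Sketch only: arc-hybrid transitions plus a "minimal" feature function.
from collections import namedtuple
import random

Config = namedtuple("Config", "stack buffer arcs")   # arcs maps dependent -> head

def initial(n):
    return Config(stack=[], buffer=list(range(n)), arcs={})

def shift(c):
    return Config(c.stack + [c.buffer[0]], c.buffer[1:], dict(c.arcs))

def left_arc(c):    # arc-hybrid Left-Arc: s0 becomes a dependent of b0
    arcs = dict(c.arcs); arcs[c.stack[-1]] = c.buffer[0]
    return Config(c.stack[:-1], c.buffer, arcs)

def right_arc(c):   # arc-hybrid Right-Arc: s0 becomes a dependent of s1
    arcs = dict(c.arcs); arcs[c.stack[-1]] = c.stack[-2]
    return Config(c.stack[:-1], c.buffer, arcs)

def features(c, vectors):
    """Minimal feature set: concatenated vectors for s1, s0 and b0 only."""
    pad = [0.0] * len(vectors[0])
    s1 = vectors[c.stack[-2]] if len(c.stack) > 1 else pad
    s0 = vectors[c.stack[-1]] if c.stack else pad
    b0 = vectors[c.buffer[0]] if c.buffer else pad
    return s1 + s0 + b0   # would be fed to a scorer over the three transitions

# Toy run on a 3-word sentence with random stand-ins for BiLSTM vectors.
vecs = [[random.random() for _ in range(4)] for _ in range(3)]
c = initial(3)
for step in (shift, shift, left_arc, shift, right_arc):
    _ = features(c, vecs)   # where a classifier would pick the transition
    c = step(c)
print(c.arcs)               # {1: 2, 2: 0}
```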
The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers
Classical non-neural dependency parsers put considerable effort into the design
of feature functions. In particular, they benefit from information coming from
structural features, such as features drawn from neighboring tokens in the
dependency tree. In contrast, their BiLSTM-based successors achieve
state-of-the-art performance without explicit information about the structural
context. In this paper we aim to answer the question: How much structural
context are the BiLSTM representations able to capture implicitly? We show that
features drawn from partial subtrees become redundant when the BiLSTMs are
used. We provide a deep insight into information flow in transition- and
graph-based neural architectures to demonstrate where the implicit information
comes from when the parsers make their decisions. Finally, with model ablations
we demonstrate that the structural context is not only present in the models
but also significantly influences their performance.
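As a concrete illustration of the "structural" information at stake in the abstract above, the toy sketch below contrasts a classical template that looks into the partially built tree (here, the leftmost child of the stack top) with a template that only reads contextual vectors for a couple of positions. The helper names and the particular features are illustrative assumptions, not taken from the paper.

```python
# Sketch only: structural vs. purely positional feature templates.
def structural_features(stack, buffer, arcs, words):
    """Classical template: draws on neighbours in the partial dependency tree."""
    s0 = stack[-1] if stack else None
    children_of_s0 = sorted(d for d, h in arcs.items() if h == s0)
    leftmost_child = children_of_s0[0] if children_of_s0 else None
    return {
        "s0.word": words[s0] if s0 is not None else "<pad>",
        "s0.leftmost_child.word": words[leftmost_child] if leftmost_child is not None else "<pad>",
        "b0.word": words[buffer[0]] if buffer else "<pad>",
    }

def positional_features(stack, buffer, bilstm_vectors):
    """BiLSTM-style template: just the contextual vectors of a few positions."""
    pad = [0.0] * len(bilstm_vectors[0])
    s0 = bilstm_vectors[stack[-1]] if stack else pad
    b0 = bilstm_vectors[buffer[0]] if buffer else pad
    return s0 + b0

words = ["the", "cat", "sleeps"]
print(structural_features([1], [2], {0: 1}, words))
# {'s0.word': 'cat', 's0.leftmost_child.word': 'the', 'b0.word': 'sleeps'}
vecs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
print(positional_features([1], [2], vecs))   # [0.3, 0.4, 0.5, 0.6]
```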
Two Local Models for Neural Constituent Parsing
Non-local features have been exploited by syntactic parsers for capturing
dependencies between output sub-structures. Such features have been a key to
the success of state-of-the-art statistical parsers. With the rise of deep
learning, however, it has been shown that local output decisions can give
highly competitive accuracies, thanks to the power of dense neural input
representations that embody global syntactic information. We investigate two
conceptually simple local neural models for constituent parsing, which make
local decisions on constituent spans and CFG rules, respectively. Consistent
with previous findings along this line, our best model gives highly competitive
results, achieving labeled bracketing F1 scores of 92.4% on PTB and 87.3%
on CTB 5.1.
Comment: COLING 2018.
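A small numpy sketch of the "local decision per constituent span" idea from the abstract above: every span is represented from the endpoints of a contextual encoder and labelled independently, including a "no constituent" label. The representation choice, the label set and the random weights are assumptions for illustration; neither of the paper's two models nor their training is reproduced.

```python
# Sketch only: independent (local) label decisions for every span.
import numpy as np

rng = np.random.default_rng(0)
LABELS = ["<none>", "NP", "VP", "S"]          # toy label set

def span_representation(h, i, j):
    """Represent span (i, j] as the difference of encoder states (a common choice)."""
    return h[j] - h[i]

def classify_spans(h, W):
    n = len(h) - 1                             # h holds n+1 fencepost states
    decisions = {}
    for i in range(n):
        for j in range(i + 1, n + 1):
            scores = W @ span_representation(h, i, j)
            decisions[(i, j)] = LABELS[int(np.argmax(scores))]
    return decisions

# Toy run: 4 words -> 5 fencepost vectors of size 8, random classifier weights.
h = rng.normal(size=(5, 8))
W = rng.normal(size=(len(LABELS), 8))
print(classify_spans(h, W))
```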
Dependency-based Hybrid Trees for Semantic Parsing
We propose a novel dependency-based hybrid tree model for semantic parsing,
which converts natural language utterance into machine interpretable meaning
representations. Unlike previous state-of-the-art models, the semantic
information is interpreted as the latent dependency between the natural
language words in our joint representation. Such dependency information can
capture the interactions between the semantics and natural language words. We
integrate a neural component into our model and propose an efficient
dynamic-programming algorithm to perform tractable inference. Through extensive
experiments on the standard multilingual GeoQuery dataset with eight languages,
we demonstrate that our proposed approach is able to achieve state-of-the-art
performance across several languages. Analysis also justifies the effectiveness
of using our new dependency-based representation.
Comment: Accepted by EMNLP 2018.
Global Transition-based Non-projective Dependency Parsing
Shi, Huang, and Lee (2017) obtained state-of-the-art results for English and
Chinese dependency parsing by combining dynamic-programming implementations of
transition-based dependency parsers with a minimal set of bidirectional LSTM
features. However, their results were limited to projective parsing. In this
paper, we extend their approach to support non-projectivity by providing the
first practical implementation of the MH_4 algorithm, a mildly
non-projective dynamic-programming parser with very high coverage on
non-projective treebanks. To make MH_4 compatible with minimal transition-based
feature sets, we introduce a transition-based interpretation of it in which
parser items are mapped to sequences of transitions. We thus obtain the first
implementation of global decoding for non-projective transition-based parsing,
and demonstrate empirically that it is more effective than its projective
counterpart in parsing a number of highly non-projective languages.
Comment: Proceedings of ACL 2018. 13 pages.
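Since the abstract's contribution hinges on the projective versus non-projective distinction, here is a small self-contained check for whether a dependency tree is projective; the "coverage" of a parser such as MH_4 can then be read as the fraction of gold treebank trees it is able to derive. The encoding (0-based heads, -1 for the root) is just a convention adopted for this sketch.

```python
# Sketch only: an arc (h, d) is projective iff every token strictly between
# h and d is a (possibly indirect) descendant of h.
def is_projective(heads):
    """heads[i] is the head of token i (0-based); the root has head -1."""
    def dominates(anc, node):
        while node != -1:
            if node == anc:
                return True
            node = heads[node]
        return False

    for d, h in enumerate(heads):
        if h == -1:
            continue
        lo, hi = min(h, d), max(h, d)
        for k in range(lo + 1, hi):
            if not dominates(h, k):
                return False
    return True

print(is_projective([1, -1, 1]))      # True: a simple projective tree
print(is_projective([2, 3, -1, 0]))   # False: the arcs cross
```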
Improving Coverage and Runtime Complexity for Exact Inference in Non-Projective Transition-Based Dependency Parsers
We generalize Cohen, G\'omez-Rodr\'iguez, and Satta's (2011) parser to a
family of non-projective transition-based dependency parsers allowing
polynomial-time exact inference. This includes novel parsers with better
coverage than Cohen et al. (2011), and even a variant that reduces time
complexity to O(n^6), improving over the known bounds in exact inference for
non-projective transition-based parsing. We hope that this piece of theoretical
work inspires design of novel transition systems with better coverage and
better run-time guarantees.
Code available at https://github.com/tzshi/nonproj-dp-variants-naacl2018
Comment: Proceedings of NAACL-HLT 2018. 6 pages. This version fixes a display
issue in an author name.
Viable Dependency Parsing as Sequence Labeling
We recast dependency parsing as a sequence labeling problem, exploring
several encodings of dependency trees as labels. While dependency parsing by
means of sequence labeling had been attempted in existing work, results
suggested that the technique was impractical. We show instead that with a
conventional BiLSTM-based model it is possible to obtain fast and accurate
parsers. These parsers are conceptually simple, not needing traditional parsing
algorithms or auxiliary structures. Despite this simplicity, experiments on the PTB and a
sample of UD treebanks show that they provide a good speed-accuracy tradeoff,
with results competitive with more complex approaches.
Comment: Camera-ready version to appear at NAACL 2019 (final peer-reviewed
manuscript). 8 pages (incl. appendix).
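One of the simplest encodings in this family tags each token with the signed offset to its head; the sketch below shows that such labels round-trip losslessly back to a tree. The paper explores several richer encodings, so treat this naive one purely as an illustration.

```python
# Sketch only: dependency trees as one label per token (signed head offset).
def encode(heads):
    """heads[i] is the 0-based head of token i, or -1 for the root."""
    return [0 if h == -1 else h - i for i, h in enumerate(heads)]

def decode(labels):
    return [-1 if off == 0 else i + off for i, off in enumerate(labels)]

heads = [1, 2, -1, 2]           # "the cat sleeps soundly"
labels = encode(heads)          # [1, 1, 0, -1]
assert decode(labels) == heads  # lossless round trip
print(labels)
```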
Non-Projective Dependency Parsing with Non-Local Transitions
We present a novel transition system, based on the Covington non-projective
parser, introducing non-local transitions that can directly create arcs
involving nodes to the left of the current focus positions. This avoids the
need for long sequences of No-Arc transitions to create long-distance arcs,
thus alleviating error propagation. The resulting parser outperforms the
original version and achieves the best accuracy on the Stanford Dependencies
conversion of the Penn Treebank among greedy transition-based algorithms.
Comment: Proceedings of NAACL-HLT 2018. 8 pages.
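For concreteness, the sketch below implements a Covington-style configuration with the usual Shift / No-Arc / Left-Arc / Right-Arc moves and adds one illustrative "non-local" arc transition that reaches the k-th node to the left of the focus in a single step. The paper's actual transition inventory and its constraints are not reproduced here, so the non-local move is an assumption in the spirit of the abstract, not the proposed system itself.

```python
# Sketch only: Covington-style transitions plus an illustrative non-local arc.
class Covington:
    def __init__(self, n):
        self.l1 = []                  # nodes to the left of the focus, nearest last
        self.l2 = []                  # nodes already passed over for this focus
        self.buffer = list(range(n))  # buffer[0] is the right focus word
        self.arcs = {}                # dependent -> head

    def shift(self):
        self.l1 = self.l1 + self.l2 + [self.buffer.pop(0)]
        self.l2 = []

    def no_arc(self):
        self.l2.insert(0, self.l1.pop())

    def left_arc(self):               # l1[-1] becomes a dependent of the focus word
        dep = self.l1.pop()
        self.arcs[dep] = self.buffer[0]
        self.l2.insert(0, dep)

    def right_arc(self):              # the focus word becomes a dependent of l1[-1]
        head = self.l1.pop()
        self.arcs[self.buffer[0]] = head
        self.l2.insert(0, head)

    def left_arc_k(self, k):
        """Illustrative non-local move: attach the k-th node from the top of l1
        directly, instead of reaching it via k-1 No-Arc steps first."""
        dep = self.l1.pop(-k)
        self.arcs[dep] = self.buffer[0]
        self.l2.insert(0, dep)

# Long-distance arc 3 -> 0 in one step instead of No-Arc, No-Arc, Left-Arc:
c = Covington(4)
c.shift(); c.shift(); c.shift()       # l1 = [0, 1, 2], focus word is 3
c.left_arc_k(3)                       # directly creates the arc 3 -> 0
print(c.arcs)                         # {0: 3}
```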
Non-Projective Dependency Parsing via Latent Heads Representation (LHR)
In this paper, we introduce a novel approach based on a bidirectional
recurrent autoencoder to perform globally optimized non-projective dependency
parsing via semi-supervised learning. The syntactic analysis is completed at
the end of the neural process that generates a Latent Heads Representation
(LHR), without any algorithmic constraint and with a linear complexity. The
resulting "latent syntactic structure" can be used directly in other semantic
tasks. The LHR is transformed into the usual dependency tree by computing a
simple vector similarity. We believe that our model has the potential to compete
with much more complex state-of-the-art parsing architectures.
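A hedged numpy sketch of the very last step the abstract mentions: given a latent-head vector per token, a head is chosen as the token whose encoding is most similar (cosine similarity here). The autoencoder that would produce these vectors, and any constraint ensuring a well-formed tree, are omitted; everything below is an illustrative assumption rather than the paper's model.

```python
# Sketch only: read heads off latent-head vectors by nearest-neighbour similarity.
import numpy as np

def heads_from_latent(token_vecs, latent_head_vecs):
    tok = token_vecs / np.linalg.norm(token_vecs, axis=1, keepdims=True)
    lat = latent_head_vecs / np.linalg.norm(latent_head_vecs, axis=1, keepdims=True)
    sim = lat @ tok.T                 # sim[i, j]: how much token j looks like i's head
    np.fill_diagonal(sim, -np.inf)    # a token never heads itself
    return sim.argmax(axis=1)         # naive greedy choice, one head per token

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 16))     # stand-ins for token encodings
latents = rng.normal(size=(5, 16))    # stand-ins for latent head vectors
print(heads_from_latent(tokens, latents))   # one head index per token
```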
Dynamic Oracles for Top-Down and In-Order Shift-Reduce Constituent Parsing
We introduce novel dynamic oracles for training two of the most accurate
known shift-reduce algorithms for constituent parsing: the top-down and
in-order transition-based parsers. In both cases, the dynamic oracles manage to
notably increase their accuracy, in comparison to that obtained by performing
classic static training. In addition, by improving the performance of the
state-of-the-art in-order shift-reduce parser, we achieve the best accuracy to
date (92.0 F1) obtained by a fully-supervised single-model greedy shift-reduce
constituent parser on the WSJ benchmark.
Comment: Proceedings of EMNLP 2018. 11 pages.