An Empirical Comparison of Parsing Methods for Stanford Dependencies
Stanford typed dependencies are a widely desired representation of natural
language sentences, but parsing is one of the major computational bottlenecks
in text analysis systems. In light of the evolving definition of the Stanford
dependencies and developments in statistical dependency parsing algorithms,
this paper revisits the question of Cer et al. (2010): what is the tradeoff
between accuracy and speed in obtaining Stanford dependencies in particular? We
also explore the effects of input representations on this tradeoff:
part-of-speech tags, the novel use of an alternative dependency representation
as input, and distributional representations of words. We find that direct
dependency parsing is a more viable solution than it was found to be in the
past. An accompanying software release can be found at:
http://www.ark.cs.cmu.edu/TBSD
Comment: 13 pages, 2 figures
Learning to Compose Task-Specific Tree Structures
For years, recursive neural networks (RvNNs) have been shown to be suitable
for representing text as fixed-length vectors and have achieved good performance
on several natural language processing tasks. However, the main drawback of
RvNNs is that they require structured input, which makes data preparation and
model implementation hard. In this paper, we propose Gumbel Tree-LSTM, a novel
tree-structured long short-term memory architecture that learns how to compose
task-specific tree structures only from plain text data efficiently. Our model
uses the Straight-Through Gumbel-Softmax estimator to decide the parent node among
candidates dynamically and to calculate gradients of the discrete decision. We
evaluate the proposed model on natural language inference and sentiment
analysis, and show that our model outperforms or is at least comparable to
previous models. We also find that our model converges significantly faster
than other models.
Comment: AAAI 201
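The Straight-Through Gumbel-Softmax step mentioned in the abstract above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function name `st_gumbel_softmax`, the `temperature` parameter, and the example scores are assumptions made for this sketch. It shows only the forward computation; in an autodiff framework the hard one-hot choice would be used in the forward pass while gradients flow through the soft probabilities.

```python
import math
import random


def st_gumbel_softmax(scores, temperature=1.0, rng=random):
    """Return (hard, soft): a one-hot sample and its relaxed probabilities.

    Hypothetical helper for illustration; `scores` are unnormalized
    log-probabilities over candidate parent nodes.
    """
    noisy = []
    for s in scores:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        u = max(rng.random(), 1e-12)  # clamp so the logs stay finite
        g = -math.log(-math.log(u))
        noisy.append((s + g) / temperature)

    # Numerically stable softmax over the perturbed scores.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Discrete decision: one-hot vector at the argmax of the soft sample.
    best = max(range(len(soft)), key=lambda i: soft[i])
    hard = [1.0 if i == best else 0.0 for i in range(len(soft))]
    return hard, soft


if __name__ == "__main__":
    hard, soft = st_gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5,
                                   rng=random.Random(0))
    print("selected candidate:", hard.index(1.0))
    print("relaxed probabilities:", soft)
```

Lower temperatures push the soft probabilities closer to the one-hot choice, at the cost of higher-variance gradients.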
Resource Constrained Structured Prediction
We study the problem of structured prediction under test-time budget
constraints. We propose a novel approach applicable to a wide range of
structured prediction problems in computer vision and natural language
processing. Our approach seeks to adaptively generate computationally costly
features during test-time in order to reduce the computational cost of
prediction while maintaining prediction performance. We show that training the
adaptive feature generation system can be reduced to a series of structured
learning problems, resulting in efficient training using existing structured
learning algorithms. This framework provides theoretical justification for
several existing heuristic approaches found in the literature. We evaluate our
proposed adaptive system on two structured prediction tasks, optical character
recognition (OCR) and dependency parsing, and show strong performance in
reducing feature costs without degrading accuracy.
Elimination of Spurious Ambiguity in Transition-Based Dependency Parsing
We present a novel technique to remove spurious ambiguity from transition
systems for dependency parsing. Our technique chooses a canonical sequence of
transition operations (computation) for a given dependency tree. Our technique
can be applied to a large class of bottom-up transition systems, including, for
instance, those of Nivre (2004) and Attardi (2006).
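To make "spurious ambiguity" concrete, the sketch below uses the textbook arc-standard transition system (an assumption chosen for illustration; the abstract does not specify a formulation, and this is not the paper's canonicalization technique). The function name `run_arc_standard` and the three-word example are hypothetical. It shows two different transition sequences that derive the same dependency tree, which is the redundancy a canonical sequence would remove.

```python
def run_arc_standard(n_words, transitions):
    """Apply arc-standard transitions and return the set of (head, dependent) arcs.

    Words are numbered 1..n_words. Transitions:
      "SH": shift the next buffer word onto the stack
      "LA": the stack top becomes head of the word below it (which is removed)
      "RA": the word below the stack top becomes head of the top (which is removed)
    """
    stack = []
    buffer = list(range(1, n_words + 1))
    arcs = set()
    for t in transitions:
        if t == "SH":
            stack.append(buffer.pop(0))
        elif t == "LA":
            dependent = stack.pop(-2)
            arcs.add((stack[-1], dependent))
        elif t == "RA":
            dependent = stack.pop()
            arcs.add((stack[-1], dependent))
        else:
            raise ValueError(f"unknown transition: {t}")
    return arcs


# Target tree over three words: word 2 is the head of words 1 and 3.
early_left = ["SH", "SH", "LA", "SH", "RA"]  # attach the left dependent first
late_left = ["SH", "SH", "SH", "RA", "LA"]   # attach the right dependent first

if __name__ == "__main__":
    print(run_arc_standard(3, early_left) == run_arc_standard(3, late_left))
```

Both sequences build the arcs (2, 1) and (2, 3); fixing one order as canonical leaves exactly one derivation per tree.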
Solving General Arithmetic Word Problems
This paper presents a novel approach to automatically solving arithmetic word
problems. This is the first algorithmic approach that can handle arithmetic
problems with multiple steps and operations, without depending on additional
annotations or predefined templates. We develop a theory for expression trees
that can be used to represent and evaluate the target arithmetic expressions;
we use it to uniquely decompose the target arithmetic problem to multiple
classification problems; we then compose an expression tree, combining these
with world knowledge through a constrained inference framework. Our classifiers
gain from the use of quantity schemas that support better extraction of
features. Experimental results show that our method outperforms existing
systems, achieving state-of-the-art performance on benchmark datasets of
arithmetic word problems.
Comment: EMNLP 201
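An expression tree of the kind described in the abstract above can be sketched as a small recursive data structure. This is an illustrative sketch only: the class names `Leaf` and `Node`, the `evaluate` function, and the sample word problem are assumptions, not the paper's formalism. Leaves hold the quantities mentioned in the problem text and internal nodes hold the operations that combine them.

```python
import operator
from dataclasses import dataclass
from typing import Union

# Supported binary operations (an assumed set for this sketch).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


@dataclass
class Leaf:
    value: float  # a quantity extracted from the problem text


@dataclass
class Node:
    op: str  # one of the keys in OPS
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, Node]


def evaluate(tree: Expr) -> float:
    """Evaluate an expression tree bottom-up."""
    if isinstance(tree, Leaf):
        return tree.value
    return OPS[tree.op](evaluate(tree.left), evaluate(tree.right))


# Hypothetical problem: "Tom has 3 bags with 4 apples each and gives away 5.
# How many apples are left?"  ->  (3 * 4) - 5
problem = Node("-", Node("*", Leaf(3), Leaf(4)), Leaf(5))

if __name__ == "__main__":
    print(evaluate(problem))
```

Choosing the operation at each internal node is the kind of decision that can be posed as a classification problem over pairs of quantities.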