Discourse Structure in Machine Translation Evaluation
In this article, we explore the potential of using sentence-level discourse
structure for machine translation evaluation. We first design discourse-aware
similarity measures, which use all-subtree kernels to compare discourse parse
trees in accordance with Rhetorical Structure Theory (RST). Then, we show
that a simple linear combination with these measures can help improve various
existing machine translation evaluation metrics in terms of correlation with
human judgments, both at the segment and at the system level. This suggests
that discourse information is complementary to the information used by many of
the existing evaluation metrics, and thus it could be taken into account when
developing richer evaluation metrics, such as the WMT-14 winning combined
metric DiscoTK-party. We also provide a detailed analysis of the relevance of
various discourse elements and relations from the RST parse trees for machine
translation evaluation. In particular, we show that: (i) all aspects of the RST
tree are relevant, (ii) nuclearity is more useful than relation type, and (iii)
the similarity of the translation RST tree to the reference tree is positively
correlated with translation quality.
Comment: machine translation, machine translation evaluation, discourse analysis. Computational Linguistics, 2017
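To illustrate the flavor of such a metric, here is a minimal hedged sketch: a Collins-and-Duffy-style all-subtree kernel over RST trees encoded as nested (label, children...) tuples, normalized and linearly combined with an existing metric's score. The tree encoding, helper names, and the weight alpha are illustrative assumptions, not the paper's implementation; in practice the combination weights would be tuned against human judgments.

```python
# Minimal sketch of a discourse-aware similarity, assuming RST trees are
# nested tuples: (label, child, ...), with string leaves. All names and
# the weight `alpha` are illustrative, not the paper's code.

def nodes(tree):
    """Yield every subtree (node) of a nested-tuple tree."""
    yield tree
    if not isinstance(tree, str):
        for child in tree[1:]:
            yield from nodes(child)

def delta(a, b):
    """Count common tree fragments rooted at nodes a and b."""
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0
    if a[0] != b[0] or len(a) != len(b):
        return 0.0
    score = 1.0
    for ca, cb in zip(a[1:], b[1:]):
        score *= 1.0 + delta(ca, cb)
    return score

def tree_kernel(t1, t2):
    """All-subtree (convolution) kernel: sum of delta over all node pairs."""
    return sum(delta(a, b) for a in nodes(t1) for b in nodes(t2))

def discourse_similarity(hyp_tree, ref_tree):
    """Kernel normalized to [0, 1]."""
    norm = (tree_kernel(hyp_tree, hyp_tree) * tree_kernel(ref_tree, ref_tree)) ** 0.5
    return tree_kernel(hyp_tree, ref_tree) / norm if norm else 0.0

def combined_metric(base_score, hyp_tree, ref_tree, alpha=0.5):
    """Linear combination with an existing metric score (e.g., BLEU in [0, 1])."""
    return alpha * base_score + (1.0 - alpha) * discourse_similarity(hyp_tree, ref_tree)
```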
Better training for function labeling
Function labels enrich constituency parse tree nodes with information about their abstract syntactic and semantic roles. A common way to obtain function-labeled trees is to use a two-stage architecture in which a statistical parser first produces the constituent structure and a second component, such as a classifier, then adds the missing function tags. To achieve optimal results, training examples for machine-learning-based classifiers should be as similar as possible to the instances seen during prediction. However, the method that has been used so far to obtain training examples for the function-labeling classifier suffers from a serious drawback: the training examples come from perfect treebank trees, whereas test examples are derived from imperfect, parser-produced trees.
We show that extracting training instances from the reparsed training part of the treebank results in better training material, as measured by similarity to test instances, and that our training method achieves statistically significantly higher F-scores on the function-labeling task for the English Penn Treebank. Our method currently achieves a 91.47% F-score on Section 23 of the WSJ, the highest score reported in the literature so far.
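To make the training regime concrete, here is a hedged sketch under stated assumptions: trees are plain dicts of nodes carrying spans, labels, and (for gold trees) function tags; alignment projects gold tags onto parser nodes with matching span and label; and the classifier is an off-the-shelf scikit-learn model with toy features. None of this is the paper's actual feature set or toolchain; only the provenance of the training instances (reparsed trees) is the point.

```python
# Sketch of extracting function-labeling training instances from *reparsed*
# trees so they match the parser's error distribution at test time. Trees are
# assumed to be dicts: {"nodes": [{"span": (i, j), "label": "NP", ...}, ...]};
# these structures and the toy features are illustrative assumptions.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def node_features(node):
    # Toy features; a real labeler uses much richer structural context.
    return {"label": node["label"],
            "parent": node.get("parent_label", "ROOT"),
            "head": node.get("head_word", "")}

def align_to_gold(node, gold_tree):
    """Project a gold function tag onto a parser node with the same span/label."""
    for g in gold_tree["nodes"]:
        if g["span"] == node["span"] and g["label"] == node["label"]:
            return g.get("function_tag")
    return None

def make_instances(reparsed_trees, gold_trees):
    X, y = [], []
    for parsed, gold in zip(reparsed_trees, gold_trees):
        for node in parsed["nodes"]:
            tag = align_to_gold(node, gold)
            if tag is not None:  # unalignable parser errors yield no instance
                X.append(node_features(node))
                y.append(tag)
    return X, y

def train_function_labeler(reparsed_trees, gold_trees):
    X, y = make_instances(reparsed_trees, gold_trees)
    vec = DictVectorizer()
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(X), y)
    return vec, clf
```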
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Because of their superior ability to preserve sequence information over time,
Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with
a more complex computational unit, have obtained strong results on a variety of
sequence modeling tasks. The only underlying LSTM structure that has been
explored so far is a linear chain. However, natural language exhibits syntactic
properties that would naturally combine words into phrases. We introduce the
Tree-LSTM, a generalization of LSTMs to tree-structured network topologies.
Tree-LSTMs outperform all existing systems and strong LSTM baselines on two
tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task
1) and sentiment classification (Stanford Sentiment Treebank).
Comment: Accepted for publication at ACL 2015
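To make the recurrence concrete, here is a minimal NumPy sketch of the paper's Child-Sum Tree-LSTM cell (forward pass only); the initialization, dimensions, and tuple-based tree encoding are illustrative assumptions:

```python
# Child-Sum Tree-LSTM cell: a node's state is composed from an unordered
# set of children, with one forget gate per child. Forward pass only.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ChildSumTreeLSTMCell:
    def __init__(self, x_dim, h_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = {g: rng.normal(0, 0.1, (h_dim, x_dim)) for g in "ifou"}
        self.U = {g: rng.normal(0, 0.1, (h_dim, h_dim)) for g in "ifou"}
        self.b = {g: np.zeros(h_dim) for g in "ifou"}

    def __call__(self, x, child_h, child_c):
        # Sum (not concatenate) child hidden states: children are unordered.
        h_sum = np.sum(child_h, axis=0) if child_h else np.zeros_like(self.b["i"])
        i = sigmoid(self.W["i"] @ x + self.U["i"] @ h_sum + self.b["i"])
        o = sigmoid(self.W["o"] @ x + self.U["o"] @ h_sum + self.b["o"])
        u = np.tanh(self.W["u"] @ x + self.U["u"] @ h_sum + self.b["u"])
        # One forget gate per child, conditioned on that child's own state,
        # so the cell can selectively discard individual subtrees.
        f = [sigmoid(self.W["f"] @ x + self.U["f"] @ hk + self.b["f"])
             for hk in child_h]
        c = i * u + sum(fk * ck for fk, ck in zip(f, child_c))
        return o * np.tanh(c), c

def run_tree(cell, tree, embed):
    """tree: (word, [subtrees]); embed: word -> input vector. Returns root (h, c)."""
    word, children = tree
    states = [run_tree(cell, child, embed) for child in children]
    return cell(embed(word), [h for h, _ in states], [c for _, c in states])
```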
Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data
Similarity-based approaches represent a promising direction for time series
analysis. However, many such methods rely on parameter tuning, and some have
shortcomings if the time series are multivariate (MTS), due to dependencies
between attributes, or if they contain missing data. In this paper, we
address these challenges within the powerful context of kernel methods by
proposing the robust time series cluster kernel (TCK). The approach
leverages the missing-data handling properties of Gaussian mixture models
(GMMs) augmented with informative prior distributions. An ensemble learning
approach is exploited to ensure robustness to parameter choices by combining the
clustering results of many GMMs to form the final kernel.
We evaluate the TCK on synthetic and real data and compare to other
state-of-the-art techniques. The experimental results demonstrate that the TCK
is robust to parameter choices, provides competitive results for MTS without
missing data, and yields outstanding results for MTS with missing data.
Comment: 23 pages, 6 figures
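A rough sketch of the ensemble construction, on complete data: the actual method fits diagonal-covariance GMMs by MAP estimation with informative priors so that missing values can be handled, which scikit-learn's GaussianMixture (used here for brevity) does not do. Only the randomize-fit-accumulate structure of the kernel is illustrated.

```python
# Sketch of a TCK-style kernel on complete data: fit many GMMs with
# randomized hyperparameters and subsamples, and accumulate inner products
# of their posterior-assignment matrices. Missing-data handling via
# informative priors (the core of the real method) is omitted here.

import numpy as np
from sklearn.mixture import GaussianMixture

def tck_like_kernel(X, n_gmms=30, seed=0):
    """X: (N, D) multivariate time series flattened to vectors. Returns (N, N)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    K = np.zeros((N, N))
    for _ in range(n_gmms):
        # Randomize components and the training subsample so that no single
        # hyperparameter choice dominates the final kernel.
        C = int(rng.integers(2, 10))
        idx = rng.choice(N, size=min(N, max(C + 1, int(0.8 * N))), replace=False)
        gmm = GaussianMixture(n_components=C, covariance_type="diag",
                              random_state=int(rng.integers(1 << 31)))
        gmm.fit(X[idx])
        Q = gmm.predict_proba(X)   # (N, C): posterior cluster assignments
        K += Q @ Q.T               # similar series share similar posteriors
    return K / n_gmms
```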
Accuracy-based scoring for phrase-based statistical machine translation
Although the scoring features of state-of-the-art Phrase-Based Statistical Machine Translation (PB-SMT) models are weighted so as to optimise an objective function measuring
translation quality, the estimation of the features
themselves does not have any relation to such quality metrics. In this paper, we introduce a translation quality-based feature into PB-SMT in a bid to improve the translation quality of the system. Our feature is estimated by averaging
the edit distance between the phrase pairs occurring in the N-best list and the phrase pairs involved in the translation of oracle sentences, i.e., the hypotheses chosen by automatic evaluation metrics from the N-best outputs of a baseline system. Using
our method, we report a statistically significant 2.11% relative improvement in BLEU score for the WMT 2009 Spanish-to-English translation task. We also report that using our
method we can achieve statistically significant improvements over the baseline using many other MT evaluation metrics, and a substantial increase in speed and reduction in memory use (due to a reduction in phrase-table size of 87%) while maintaining significant gains in
translation quality.
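As a rough illustration of this feature, here is a hedged sketch: a phrase pair is a (source tokens, target tokens) tuple, each N-best hypothesis carries a sentence-level metric score and its phrase-pair derivation, the oracle is the top-scoring hypothesis, and each phrase pair is scored by its normalized edit distance to the oracle's phrase pairs sharing the same source side. The data structures and the per-sentence normalization are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of an accuracy-based phrase feature. A "phrase pair" is
# (src_tokens, tgt_tokens), both tuples of strings; an N-best entry is
# (metric_score, [phrase pairs]). All structures here are illustrative.

def edit_distance(a, b):
    """Word-level Levenshtein distance between two token sequences."""
    n = len(b)
    d = list(range(n + 1))
    for i in range(1, len(a) + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1,       # deletion
                                   d[j - 1] + 1,   # insertion
                                   prev + (a[i - 1] != b[j - 1]))  # substitution
    return d[n]

def phrase_score(pair, oracle_pairs):
    """Normalized distance of a phrase pair to the oracle's pairs with the
    same source side; 0 = identical target, 1 = no match (lower is better)."""
    src, tgt = pair
    dists = [edit_distance(tgt, o_tgt) / max(len(tgt), len(o_tgt), 1)
             for o_src, o_tgt in oracle_pairs if o_src == src]
    return min(dists) if dists else 1.0

def accuracy_features(nbest):
    """Score every phrase pair in an N-best list against the metric-chosen oracle."""
    oracle_pairs = max(nbest, key=lambda entry: entry[0])[1]
    return {pair: phrase_score(pair, oracle_pairs)
            for _, pairs in nbest for pair in pairs}
```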
End-to-end Neural Coreference Resolution
We introduce the first end-to-end coreference resolution model and show that
it significantly outperforms all previous work without using a syntactic parser
or hand-engineered mention detector. The key idea is to directly consider all
spans in a document as potential mentions and learn distributions over possible
antecedents for each. The model computes span embeddings that combine
context-dependent boundary representations with a head-finding attention
mechanism. It is trained to maximize the marginal likelihood of gold antecedent
spans from coreference clusters and is factored to enable aggressive pruning of
potential mentions. Experiments demonstrate state-of-the-art performance, with
a gain of 1.5 F1 on the OntoNotes benchmark, and 3.1 F1 using a 5-model
ensemble, despite the fact that this is the first approach to be successfully
trained with no external resources.
Comment: Accepted to EMNLP 2017
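The training objective lends itself to a compact sketch: for each span, the model produces scores over candidate antecedents plus a dummy antecedent with fixed score 0, and the loss marginalizes over all antecedents that are gold, i.e., in the same coreference cluster. The scores are taken as given here; in the model they come from span embeddings built from boundary representations and head-finding attention.

```python
# Sketch of the marginal log-likelihood objective for one span. The
# antecedent scores are assumed to be precomputed; index 0 is the dummy
# antecedent (fixed score 0), which is gold iff the span has no antecedent.

import numpy as np

def marginal_nll(antecedent_scores, gold_mask):
    """
    antecedent_scores: (A+1,) scores for one span; index 0 is the dummy.
    gold_mask: (A+1,) booleans, True for gold antecedents.
    Returns -log sum_{j in GOLD} P(j | span) under a softmax over all scores.
    """
    log_z = np.logaddexp.reduce(antecedent_scores)        # log partition
    log_gold = np.logaddexp.reduce(antecedent_scores[gold_mask])
    return float(log_z - log_gold)

# Example: a span with 3 candidate antecedents; the 2nd and 3rd are in the
# same gold cluster, so the loss marginalizes over both.
scores = np.array([0.0, -1.2, 2.5, 0.7])
gold = np.array([False, False, True, True])
loss = marginal_nll(scores, gold)
```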