Sentence-level quality estimation for MT system combination
This paper describes the Dublin City University system combination module submitted to the system combination task of the Second Workshop on Applying Machine Learning Techniques to Optimize the Division of Labour in Hybrid MT (ML4HMT-12). We incorporated a sentence-level quality score, obtained by sentence-level Quality Estimation (QE), as meta-information guiding system combination. Instead of using BLEU or (minimum average) TER, we select a backbone for the confusion network using the estimated quality score. For the Spanish-English data, our strategy achieved an absolute improvement of 0.89 BLEU points over the best single score and 0.20 BLEU points over the standard system combination strategy.
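The backbone selection described in this abstract can be illustrated with a minimal sketch. It assumes that each MT system contributes one hypothesis per sentence and that a QE model has already assigned each hypothesis a score where higher means better; the function name `select_backbone` and the toy scores are hypothetical and do not come from the paper.

```python
from typing import Sequence


def select_backbone(hypotheses: Sequence[str], qe_scores: Sequence[float]) -> int:
    """Return the index of the hypothesis with the highest estimated quality.

    Assumption: higher QE score means better quality. If the QE model predicts
    an error rate instead (lower is better), negate the scores first.
    """
    if not hypotheses or len(hypotheses) != len(qe_scores):
        raise ValueError("need at least one hypothesis and one QE score per hypothesis")
    return max(range(len(hypotheses)), key=lambda i: qe_scores[i])


# Toy example: three system outputs for one source sentence (made-up scores).
hyps = ["the house is small", "the home is little", "house the small is"]
scores = [0.71, 0.64, 0.22]
backbone_index = select_backbone(hyps, scores)
backbone = hyps[backbone_index]
```

The selected hypothesis would then serve as the backbone against which the other hypotheses are aligned when building the confusion network; that alignment step is not shown here.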
Cross-Lingual Adaptation using Structural Correspondence Learning
Cross-lingual adaptation, a special case of domain adaptation, refers to the
transfer of classification knowledge between two languages. In this article we
describe an extension of Structural Correspondence Learning (SCL), a recently
proposed algorithm for domain adaptation, for cross-lingual adaptation. The
proposed method uses unlabeled documents from both languages, along with a word
translation oracle, to induce cross-lingual feature correspondences. From these
correspondences a cross-lingual representation is created that enables the
transfer of classification knowledge from the source to the target language.
The main advantages of this approach over other approaches are its resource
efficiency and task specificity.
We conduct experiments in the area of cross-language topic and sentiment
classification involving English as source language and German, French, and
Japanese as target languages. The results show a significant improvement of the
proposed method over a machine translation baseline, reducing the relative
error due to cross-lingual adaptation by an average of 30% (topic
classification) and 59% (sentiment classification). We further report on
empirical analyses that reveal insights into the use of unlabeled data, the
sensitivity with respect to important hyperparameters, and the nature of the
induced cross-lingual correspondences.
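As a rough illustration of how a word translation oracle and unlabeled documents from both languages might be combined, the sketch below pairs source words with their translations and keeps only pairs that occur in enough unlabeled documents on both sides. This is a simplified assumption about one preparatory step, not the method of the article: the names `document_frequency` and `candidate_pairs`, the `min_support` threshold, and the toy dictionary are all hypothetical, and the later steps that induce the cross-lingual representation are omitted.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def document_frequency(docs: Iterable[Sequence[str]]) -> Counter:
    """Count, for every word, the number of documents it appears in."""
    df: Counter = Counter()
    for doc in docs:
        df.update(set(doc))
    return df


def candidate_pairs(
    source_words: Iterable[str],
    oracle: Callable[[str], Optional[str]],
    source_docs: Iterable[Sequence[str]],
    target_docs: Iterable[Sequence[str]],
    min_support: int = 2,
) -> List[Tuple[str, str]]:
    """Pair each source word with its oracle translation and keep the pair only
    if both words occur in at least `min_support` unlabeled documents."""
    src_df = document_frequency(source_docs)
    tgt_df = document_frequency(target_docs)
    pairs = []
    for word in source_words:
        translation = oracle(word)
        if translation is None:
            continue  # the oracle has no translation for this word
        if src_df[word] >= min_support and tgt_df[translation] >= min_support:
            pairs.append((word, translation))
    return pairs


# Toy English/German data; the dictionary stands in for the translation oracle.
toy_oracle = {"good": "gut", "bad": "schlecht", "plot": "handlung"}.get
english_docs = [["good", "film"], ["good", "book"], ["bad", "plot"]]
german_docs = [["gut", "film"], ["gut", "buch"], ["schlecht"]]
pairs = candidate_pairs(["good", "bad", "plot", "unknown"], toy_oracle,
                        english_docs, german_docs)
```

In this toy run only the pair for "good" survives, because the other words are too rare in the unlabeled documents or have no translation.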
Memory-enhanced Decoder for Neural Machine Translation
We propose to enhance the RNN decoder in a neural machine translator (NMT)
with external memory, as a natural but powerful extension to the state in the
decoding RNN. This memory-enhanced RNN decoder is called MemDec. At
each time step during decoding, MemDec reads from this memory and writes
to it once, both with content-based addressing. Unlike the unbounded
memory used in previous work (RNNsearch) to store the representation of the source
sentence, the memory in MemDec is a matrix of pre-determined size,
designed to better capture the information important for the decoding process
at each time step. Our empirical study on Chinese-English translation shows
that it improves BLEU over both Groundhog and Moses,
yielding the best performance achieved with the same training set.
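Content-based addressing over a fixed-size memory matrix can be sketched as follows. This is a generic illustration, not the MemDec formulation: it assumes cosine similarity followed by a softmax for addressing and an erase/add style write, and the names `address`, `read`, and `write` as well as the `erase` and `add` vectors are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity with a small constant to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def address(memory: Matrix, key: Vector) -> Vector:
    """Weights over memory rows, based only on their similarity to the key."""
    return softmax([cosine(row, key) for row in memory])


def read(memory: Matrix, key: Vector) -> Vector:
    """Weighted sum of memory rows under the content-based weights."""
    w = address(memory, key)
    return [sum(w[i] * memory[i][j] for i in range(len(memory)))
            for j in range(len(memory[0]))]


def write(memory: Matrix, key: Vector, erase: Vector, add: Vector) -> Matrix:
    """Return a new memory of the same size: each row is partly erased and then
    incremented, in proportion to its addressing weight (assumed scheme)."""
    w = address(memory, key)
    return [[memory[i][j] * (1.0 - w[i] * erase[j]) + w[i] * add[j]
             for j in range(len(memory[0]))]
            for i in range(len(memory))]


# Toy usage: a 3x3 memory and a key that matches the first row most closely.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
key = [1.0, 0.0, 0.0]
weights = address(memory, key)
read_vector = read(memory, key)
new_memory = write(memory, key, erase=[0.5, 0.5, 0.5], add=[0.1, 0.1, 0.1])
```

Because the memory keeps its pre-determined shape after every write, the cost of each read and write does not grow with the length of the source sentence.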