Explicit length modelling for statistical machine translation
Explicit length modelling has been previously explored in statistical pattern recognition with successful
results. In this paper, two length models along with two parameter estimation methods and two
alternative parametrisations for statistical machine translation (SMT) are presented. More precisely, we
incorporate explicit bilingual length modelling in a state-of-the-art log-linear SMT system as an
additional feature function in order to prove the contribution of length information. Finally, a
systematic evaluation on reference SMT tasks considering different language pairs proves the benefits
of explicit length modelling.

Work supported by the EC (FEDER/FSE) under the transLectures project (FP7-ICT-2011-7-287755) and the Spanish MEC/MICINN under the MIPRCV "Consolider Ingenio 2010" program (CSD2007-00018) and iTrans2 (TIN2009-14511) projects and FPU grant (AP2010-4349). Also supported by the Spanish MITyC under the erudito.com (TSI-020110-2009-439) project, by the Generalitat Valenciana under grants Prometeo/2009/014 and GV/2010/067, and by the UPV under the AdInTAO (20091027) project. The authors wish to thank the anonymous reviewers for their criticisms and suggestions.

Silvestre Cerdà, J.A.; Andrés Ferrer, J.; Civera Saiz, J. (2012). Explicit length modelling for statistical machine translation. Pattern Recognition, 45(9):3183-3192. https://doi.org/10.1016/j.patcog.2012.01.006
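The abstract describes plugging a bilingual length model into a log-linear SMT system as one more feature function. A minimal sketch of that idea follows; the Gaussian length-ratio model, the feature weights, and the toy hypotheses are illustrative assumptions, not the paper's estimated parameters.

```python
# Sketch of a log-linear SMT scorer extended with an explicit length
# feature, in the spirit of the abstract. All numbers are assumptions.

def length_feature(source, target, mean_ratio=1.1, std=0.3):
    """Log-score of the target/source length ratio under a toy Gaussian
    length model (a real system would estimate these parameters)."""
    ratio = len(target) / max(len(source), 1)
    return -0.5 * ((ratio - mean_ratio) / std) ** 2

def log_linear_score(source, target, features, weights):
    """Log-linear model: score = sum_k lambda_k * h_k(source, target)."""
    return sum(w * h(source, target) for h, w in zip(features, weights))

src = "la casa verde".split()
features, weights = [length_feature], [1.0]
score_good = log_linear_score(src, "the green house".split(), features, weights)
score_short = log_linear_score(src, "green".split(), features, weights)
# A hypothesis near the expected length ratio scores higher than a
# too-short one, which is how the feature penalizes degenerate outputs.
```

In a full system this feature would be combined with translation- and language-model features, with its weight tuned on development data.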
Character-level Transformer-based Neural Machine Translation
Neural machine translation (NMT) is nowadays commonly applied at the subword
level, using byte-pair encoding. A promising alternative approach focuses on
character-level translation, which simplifies processing pipelines in NMT
considerably. This approach, however, must consider relatively longer
sequences, rendering the training process prohibitively expensive. In this
paper, we discuss a novel Transformer-based approach, which we compare in both
speed and quality to the Transformer at the subword and character levels, as
well as to previously developed character-level models. We evaluate our models on
4 language pairs from WMT'15: DE-EN, CS-EN, FI-EN and RU-EN. The proposed novel
architecture can be trained on a single GPU and is 34% faster than the
character-level Transformer; still, the obtained results are at least on par
with it. In addition, our proposed model outperforms the subword-level model in
FI-EN and shows close results in CS-EN. To stimulate further research in this
area and close the gap with subword-level NMT, we make all our code and models
publicly available.
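The abstract contrasts character-level translation with the now-standard subword level built via byte-pair encoding. A toy BPE trainer sketches what the subword side does: repeatedly merge the most frequent adjacent symbol pair. This is an illustrative assumption of the standard algorithm, not the reference subword-nmt implementation.

```python
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs over a {word-as-symbol-tuple: freq} vocab."""
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(vocab, pair):
    """Apply one merge: fuse every occurrence of the pair into one symbol."""
    merged = {}
    for word, freq in vocab.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Toy corpus: words split into characters with an end-of-word marker.
vocab = {tuple("low") + ("</w>",): 5, tuple("lower") + ("</w>",): 2}
for _ in range(3):  # learn 3 merges
    counts = get_pair_counts(vocab)
    vocab = merge_pair(vocab, max(counts, key=counts.get))
# Frequent strings ("low") become single symbols; rare suffixes stay split.
```

Character-level models skip this training step entirely, at the cost of the much longer sequences the abstract mentions.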
A Neural Attention Model for Abstractive Sentence Summarization
Summarization based on text extraction is inherently limited, but
generation-style abstractive methods have proven challenging to build. In this
work, we propose a fully data-driven approach to abstractive sentence
summarization. Our method utilizes a local attention-based model that generates
each word of the summary conditioned on the input sentence. While the model is
structurally simple, it can easily be trained end-to-end and scales to a large
amount of training data. The model shows significant performance gains on the
DUC-2004 shared task compared with several strong baselines.
Comment: Proceedings of EMNLP 2015
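The core mechanism the abstract describes is generating each summary word conditioned on the input sentence through attention. A tiny NumPy sketch of that attention step follows; the dimensions and the dot-product scoring function are illustrative assumptions, not the paper's exact neural language model.

```python
import numpy as np

# Toy attention step: each summary word is generated from a context
# vector that is a soft, attention-weighted average of input embeddings.
rng = np.random.default_rng(0)
d = 8                        # embedding size (assumption)
x = rng.normal(size=(5, d))  # embeddings of 5 input words
q = rng.normal(size=d)       # query: encoding of the summary so far

scores = x @ q                           # alignment score per input word
weights = np.exp(scores - scores.max())
weights /= weights.sum()                 # softmax attention distribution
context = weights @ x                    # attention-weighted input context
# A decoder would combine `context` with the summary prefix to predict
# the next word; training end-to-end learns where to attend.
```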
Neural Machine Translation by Generating Multiple Linguistic Factors
Factored neural machine translation (FNMT) is founded on the idea of using
the morphological and grammatical decomposition of the words (factors) at the
output side of the neural network. This architecture addresses two well-known
problems occurring in MT, namely the size of target language vocabulary and the
number of unknown tokens produced in the translation. The FNMT system is designed
to manage a larger vocabulary and to reduce training time (for systems with an
equivalent target-language vocabulary size). Moreover, it can produce
grammatically correct words that are not part of the vocabulary. The FNMT model is
evaluated on the IWSLT'15 English-to-French task and compared to baseline
word-based and BPE-based NMT systems. Promising qualitative and quantitative
results (in terms of BLEU and METEOR) are reported.
Comment: 11 pages, 3 figures, SLSP conference
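The factored-output idea in the abstract is that the decoder predicts a lemma plus morphological factors rather than the surface word, and the two are recombined afterward. A toy sketch follows; the factor scheme, the rule table, and the French-like examples are illustrative assumptions, not the paper's actual factorization.

```python
# Toy recombination of (lemma, factors) pairs into surface words.
# Real FNMT uses learned factor vocabularies, not this hand-written table.

def recombine(lemma, factors):
    """Tiny rule-based recombination for a few French-like forms."""
    if factors == ("verb", "present", "3sg"):
        return lemma + "e"   # e.g. "parl" -> "parle"
    if factors == ("noun", "plural"):
        return lemma + "s"   # e.g. "chat" -> "chats"
    return lemma

# The decoder emits lemma + factor streams; rebuilding surface forms means
# inflected words outside the lemma vocabulary can still be produced.
outputs = [("parl", ("verb", "present", "3sg")), ("chat", ("noun", "plural"))]
words = [recombine(lemma, factors) for lemma, factors in outputs]
```

This is why the abstract can claim grammatically correct out-of-vocabulary words: the lemma and the inflection are predicted separately, so their combination need not have been seen in training.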