2,694 research outputs found
Context-aware graph segmentation for graph-based translation
In this paper, we present an improved graph-based translation model which segments an input graph into node-induced subgraphs by taking source context into consideration. Translations are generated by combining subgraph translations left-to-right using beam search. Experiments on Chinese–English and German–English demonstrate that the context-aware segmentation significantly improves the baseline graph-based model.
Chinese–Spanish neural machine translation enhanced with character and word bitmap fonts
Recently, machine translation systems based on neural networks have reached state-of-the-art results for some language pairs (e.g., German–English). In this paper, we investigate the performance of neural machine translation on Chinese–Spanish, which is a challenging language pair. Given that the meaning of a Chinese word can be related to its graphical representation, this work aims to enhance neural machine translation by using as input a combination of words or characters and their corresponding bitmap fonts. Interpreting every word or character as a bitmap font generates more informed vector representations. The best results are obtained when using words plus their bitmap fonts, with an improvement over a competitive neural MT baseline system of almost six BLEU points and five METEOR points, and a coherently better ranking in the human evaluation.
Peer Reviewed. Postprint (published version)
Improving the minimum description length inference of phrase-based translation models
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_25
We study the application of minimum description length (MDL) inference to estimate pattern recognition models for machine translation. MDL is a theoretically sound approach whose empirical results are, however, below those of the state-of-the-art pipeline of training heuristics. We identify potential limitations of current MDL procedures and provide a practical approach to overcome them. Empirical results support the soundness of the proposed approach.
Work supported by the EU 7th Framework Programme (FP7/2007–2013) under the CasMaCat project (grant agreement no. 287576), by Spanish MICINN under grant TIN2012-31723, and by the Generalitat Valenciana under grant ALMPR (Prometeo/2009/014).
Gonzalez Rubio, J.; Casacuberta Nolla, F. (2015). Improving the minimum description length inference of phrase-based translation models. In Pattern Recognition and Image Analysis: 7th Iberian Conference, IbPRIA 2015, Santiago de Compostela, Spain, June 17-19, 2015, Proceedings. Springer International Publishing. 219-227. https://doi.org/10.1007/978-3-319-19390-8_25
- …