Chinese–Spanish neural machine translation enhanced with character and word bitmap fonts
Recently, machine translation systems based on neural networks have reached state-of-the-art results for some language pairs (e.g., German–English). In this paper, we investigate the performance of neural machine translation on Chinese–Spanish, which is a challenging language pair. Given that the meaning of a Chinese word can be related to its graphical representation, this work aims to enhance neural machine translation by using as input a combination of words or characters and their corresponding bitmap fonts. Interpreting every word or character as a bitmap font yields more informed vector representations. The best results are obtained when using words plus their bitmap fonts: this system improves over a competitive neural MT baseline by almost six BLEU points and five METEOR points, and it is ranked consistently better in the human evaluation.
Peer Reviewed. Postprint (published version)
Do Neural Nets Learn Statistical Laws behind Natural Language?
The performance of deep learning in natural language processing has been
spectacular, but the reasons for this success remain unclear because of the
inherent complexity of deep learning. This paper provides empirical evidence of
its effectiveness and of a limitation of neural networks for language
engineering. Precisely, we demonstrate that a neural language model based on
long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law,
two representative statistical properties underlying natural language. We
discuss the quality of reproducibility and the emergence of Zipf's law and
Heaps' law as training progresses. We also point out that the neural language
model has a limitation in reproducing long-range correlation, another
statistical property of natural language. This understanding could provide a
direction for improving the architectures of neural networks.
Comment: 21 pages, 11 figures
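As a rough illustration of the two statistics named in this abstract, the sketch below computes the raw quantities behind them for a token sequence: the rank-ordered word frequencies (Zipf's law concerns how frequency falls off with rank) and the vocabulary size after each token (Heaps' law concerns how vocabulary grows with text length). The function names and the toy sentence are hypothetical and are not taken from the paper, which studies text generated by an LSTM language model; fitting power-law exponents to these curves is left out.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return word frequencies in descending order; index 0 holds rank 1."""
    return sorted(Counter(tokens).values(), reverse=True)


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


# Toy input (an assumption for illustration); real studies use large corpora
# or long samples generated by a trained language model.
tokens = "the cat sat on the mat and the dog sat on the log".split()
freqs = rank_frequency(tokens)
growth = vocabulary_growth(tokens)
```

Plotting `freqs` against rank and `growth` against position on log-log axes is the usual way to inspect whether a text, natural or generated, follows the two laws approximately.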
Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation
This paper proposes a hierarchical attentional neural translation model which
focuses on enhancing source-side hierarchical representations by covering both
local and global semantic information using a bidirectional tree-based encoder.
To maximize the predictive likelihood of target words, a weighted variant of an
attention mechanism is used to balance the attentive information between
lexical and phrase vectors. Using a tree-based rare word encoding, the proposed
model is extended to sub-word level to alleviate the out-of-vocabulary (OOV)
problem. Empirical results reveal that the proposed model significantly
outperforms sequence-to-sequence attention-based and tree-based neural
translation models in English-Chinese translation tasks.
Comment: Accepted for publication at EMNLP 201