Search CORE

1,070 research outputs found

BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings

Author: Su Jinsong
Xiong Deyi
Zhang Biao
Publication venue
Publication date: 24/11/2016
Field of study

In this paper, we propose a bidimensional attention based recursive autoencoder (BattRAE) to integrate clues and sourcetarget interactions at multiple levels of granularity into bilingual phrase representations. We employ recursive autoencoders to generate tree structures of phrases with embeddings at different levels of granularity (e.g., words, sub-phrases and phrases). Over these embeddings on the source and target side, we introduce a bidimensional attention network to learn their interactions encoded in a bidimensional attention matrix, from which we extract two soft attention weight distributions simultaneously. These weight distributions enable BattRAE to generate compositive phrase representations via convolution. Based on the learned phrase representations, we further use a bilinear neural model, trained via a max-margin method, to measure bilingual semantic similarity. To evaluate the effectiveness of BattRAE, we incorporate this semantic similarity as an additional feature into a state-of-the-art SMT system. Extensive experiments on NIST Chinese-English test sets show that our model achieves a substantial improvement of up to 1.63 BLEU points on average over the baseline.Comment: 7 pages, accepted by AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Word Representation Models for Morphologically Rich Languages in Neural Machine Translation

Author: Cohn Trevor
Haffari Gholamreza
He Xuanli
Vylomova Ekaterina
Publication venue
Publication date: 14/06/2016
Field of study

Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural systems of translation, which discard the identity for rare words, in this paper we propose several architectures for learning word representations from character and morpheme level word decompositions. We incorporate these representations in a novel machine translation model which jointly learns word alignments and translations via a hard attention mechanism. Evaluating on translating from several morphologically rich languages into English, we show consistent improvements over strong baseline methods, of between 1 and 1.5 BLEU points

arXiv.org e-Print Archive

Monash University Research Portal

Natural Language Processing with Small Feed-Forward Networks

Author: Bakalov Anton
Botha Jan A.
Ma Ji
McDonald Ryan
Petrov Slav
Pitler Emily
Salcianu Alex
Weiss David
Publication venue
Publication date: 01/01/2017
Field of study

We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural network models, and investigate different tradeoffs when deciding how to allocate a small memory budget.Comment: EMNLP 2017 short pape

arXiv.org e-Print Archive

Crossref