
    Word alignment and attention mechanisms in neural machine translation systems

    Full text link
    Bachelor's thesis, Degree in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, 2022. Advisor: Daniel Ortiz Martínez. Deep neural networks have become the state of the art in many complex computational tasks. While they achieve improvements on several benchmark tasks year after year, they operate as black boxes, making it hard for both data scientists and end users to assess their inner decision mechanisms and trust their results. Although statistical and interpretable methods are widely used to analyze them, these do not fully capture the networks' internal mechanisms and are prone to misleading results, creating a need for better tools. As a result, self-explaining methods embedded in the architecture of the neural networks have become a possible alternative, with attention mechanisms as one of the main new techniques. The project's main focus is the word alignment task: finding the most relevant translation relationships between source and target words in a pair of parallel sentences in different languages. This is a complex task in natural language processing and machine translation, and we analyze the use of attention mechanisms embedded in different encoder-decoder neural networks to extract word-to-word alignments between source and target translations as a byproduct of the translation task. In the first part we review the background of the machine translation field: the main traditional statistical methods, the neural machine translation approach to the sequence-to-sequence problem, and finally the word alignment task and the attention mechanism. In the second part, we implement a deep neural network model for machine translation, a recurrent neural network with an encoder-decoder architecture with attention, and we propose an alignment generation mechanism that uses the attention layer to extract and predict source-to-target word-to-word alignments. Finally, we train the networks on an English-French bilingual parallel sentence corpus, analyze the experimental results of the model on the translation and word alignment tasks using a variety of metrics, and suggest improvements and alternatives.
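    The thesis extracts alignments from the attention layer as a byproduct of translation. A minimal sketch of that idea, assuming a (target_len, source_len) matrix of attention weights collected during decoding (the function and variable names are illustrative, not the thesis code):

    ```python
    import numpy as np

    def alignments_from_attention(attention: np.ndarray) -> list[tuple[int, int]]:
        """Align each target position to the source position that received
        the highest attention weight at that decoding step."""
        return [(int(np.argmax(row)), t) for t, row in enumerate(attention)]

    # Hypothetical example: 3 target words attending over 4 source words.
    attn = np.array([
        [0.7, 0.1, 0.1, 0.1],  # target word 0 attends mostly to source word 0
        [0.1, 0.2, 0.6, 0.1],  # target word 1 -> source word 2
        [0.0, 0.1, 0.2, 0.7],  # target word 2 -> source word 3
    ])
    print(alignments_from_attention(attn))  # [(0, 0), (2, 1), (3, 2)]
    ```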

    Chinese-Catalan: A neural machine translation approach based on pivoting and attention mechanisms

    Get PDF
    This article presents a novel approach to machine translation from Chinese to Catalan using neural pivot strategies trained without any direct parallel data. Catalan is linguistically very similar to Spanish, which motivates the use of Spanish as the pivot language. As the neural architecture we use the current state of the art, the Transformer model, which is based solely on attention mechanisms. Additionally, this work provides new resources to the community: a human-developed gold standard of 4,000 sentences between Catalan and Chinese and all the other official United Nations languages (Arabic, English, French, Russian, and Spanish). Results show that the standard pseudo-corpus (synthetic) pivot approach performs better than the cascade approach. Peer reviewed. Postprint (author's final draft).
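    The two pivot strategies the article compares can be sketched as follows; the translate functions here are hypothetical stand-ins for the trained Transformer models, not the authors' API:

    ```python
    def cascade(zh_sentence, zh_to_es, es_to_ca):
        """Cascade pivoting: translate Chinese -> Spanish at inference time,
        then Spanish -> Catalan, chaining two models."""
        return es_to_ca(zh_to_es(zh_sentence))

    def build_pseudo_corpus(zh_es_pairs, es_to_ca):
        """Pseudo-corpus (synthetic) pivoting: translate the Spanish side of a
        Chinese-Spanish corpus into Catalan, producing synthetic Chinese-Catalan
        pairs on which a direct Chinese -> Catalan model is then trained."""
        return [(zh, es_to_ca(es)) for zh, es in zh_es_pairs]
    ```

    One plausible reading of the reported result is that the pseudo-corpus approach moves the pivoting cost to training time and yields a single direct model, whereas the cascade chains two models at inference time and can compound their errors.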

    Doubly-Attentive Decoder for Multi-modal Neural Machine Translation

    Full text link
    We introduce a multi-modal neural machine translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation. Our decoder learns to attend to source-language words and parts of an image independently, by means of two separate attention mechanisms, as it generates words in the target language. We find that our model can efficiently exploit not just back-translated in-domain multi-modal data but also large general-domain text-only MT corpora. We also report state-of-the-art results on the Multi30k data set. Comment: 8 pages (11 including references), 2 figures.
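    A minimal sketch of the doubly-attentive decoding step described above, assuming dot-product attention over NumPy arrays (the paper's actual model is an RNN decoder with learned attention layers; everything here is illustrative):

    ```python
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def doubly_attentive_step(hidden, src_states, img_feats):
        """hidden: (d,) decoder state; src_states: (n, d) encoder outputs for
        n source words; img_feats: (m, d) spatial CNN features for m image
        regions. Returns a combined context for predicting the next word."""
        # Two independent attention mechanisms, one per modality.
        src_ctx = softmax(src_states @ hidden) @ src_states
        img_ctx = softmax(img_feats @ hidden) @ img_feats
        # Condition the next-word prediction on both contexts (concatenated).
        return np.concatenate([src_ctx, img_ctx])
    ```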