5 research outputs found

    Tailoring neural architectures for translating from morphologically rich languages

    A morphologically complex word (MCW) is a hierarchical constituent with meaning-preserving subunits, so word-based models that rely on surface forms might not be powerful enough to translate such structures. When translating from morphologically rich languages (MRLs), a source word can map to several words or even a full sentence on the target side, which means an MCW should not be treated as an atomic unit. To provide better translations for MRLs, we extend the existing neural machine translation (NMT) architecture with a double-channel encoder and a double-attentive decoder. The main goal of this research is to provide richer information on the encoder side and to redesign the decoder so that it benefits from that information. Our experimental results show that this goal is achieved: the proposed model outperforms existing subword- and character-based architectures and shows significant improvements when translating from German, Russian, and Turkish into English.

    adaptNMT: an open-source, language-agnostic development environment for neural machine translation

    adaptNMT streamlines all processes involved in the development and deployment of RNN and Transformer neural translation models. As an open-source application, it is designed for both technical and non-technical users who work in the field of machine translation. Built upon the widely adopted OpenNMT ecosystem, the application is particularly useful for new entrants to the field, since it greatly simplifies the setup of the development environment and the creation of train, validation and test splits. Graphing, embedded within the application, illustrates the progress of model training, and SentencePiece is used for creating subword segmentation models. Hyperparameter customization is facilitated through an intuitive user interface, and a single-click model development approach has been implemented. Models developed by adaptNMT can be evaluated using a range of metrics and deployed as a translation service within the application. To support eco-friendly research in the NLP space, a green report also flags the power consumption and kgCO2 emissions generated during model development. The application is freely available (http://github.com/adaptNMT).

    Challenges facing digital language technologies: a glotopolitical perspective

    This article opens a space for critical reflection on the socio-political and economic consequences of the intervention of new digital language technologies in the linguistic and discursive practices of speakers. From a perspective that combines glotopolitics and the post-humanist turn, it addresses some of the challenges these technologies pose in creating and recreating sociolinguistic hierarchies that bring about social inequity. It exposes the powerful economic framework that technologically sustains supercentral languages such as English and, to a lesser extent, Spanish, and points out the negative effects on strongly minoritized and decapitalized languages such as the Amerindian languages. Finally, it deals with the glotopolitical consequences for Spanish, understood as a pluricentric language. Concrete examples are drawn from two different language technologies: automatic translators and voice assistants. The article is conceived primarily as a preliminary survey of the development and use of digital language technologies and their glotopolitical relevance to Spanish and the Amerindian languages, topics that deserve to be studied in greater depth in future research.

    Retos frente a las tecnologías digitales del lenguaje: Una perspectiva glotopolítica

    This article opens a space for critical reflection on the socio-political and economic consequences of the intervention of new digital language technologies in the linguistic and discursive practices of speakers. From a perspective that combines glotopolitics and the post-humanist turn, it addresses some of the challenges these technologies pose in creating and recreating sociolinguistic hierarchies that bring about social inequity. It exposes the powerful economic framework that technologically sustains supercentral languages such as English and, to a lesser extent, Spanish, and points out the negative effects on strongly minoritized and decapitalized languages such as the Amerindian languages. Finally, it deals with the glotopolitical consequences for Spanish, understood as a pluricentric language. Concrete examples are drawn from two different language technologies: automatic translators and voice assistants. The article is conceived primarily as a preliminary survey of the development and use of digital language technologies and their glotopolitical relevance to Spanish and the Amerindian languages, topics that deserve to be studied in greater depth in future research.

    Tailoring neural architectures for translating from morphologically rich languages

    No full text
    A morphologically complex word (MCW) is a hierarchical constituent with meaning-preserving subunits, so word-based models that rely on surface forms might not be powerful enough to translate such structures. When translating from morphologically rich languages (MRLs), a source word can map to several words or even a full sentence on the target side, which means an MCW should not be treated as an atomic unit. To provide better translations for MRLs, we extend the existing neural machine translation (NMT) architecture with a double-channel encoder and a double-attentive decoder. The main goal of this research is to provide richer information on the encoder side and to redesign the decoder so that it benefits from that information. Our experimental results show that this goal is achieved: the proposed model outperforms existing subword- and character-based architectures and shows significant improvements when translating from German, Russian, and Turkish into English.