
    From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

    Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey focuses on the representation of meaning. We start from the theoretical background behind word vector space models and highlight one of their major limitations: the meaning conflation deficiency, which arises from representing a word with all its possible meanings as a single vector. Then, we explain how this deficiency can be addressed through a transition from the word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and applications for this type of representation, and provides an analysis of four of its important aspects: interpretability, sense granularity, adaptability to different domains, and compositionality. Comment: 46 pages, 8 figures. Published in the Journal of Artificial Intelligence Research.
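    The meaning conflation deficiency described above can be illustrated with a toy sketch: when a polysemous word such as "bank" gets a single vector, that vector is roughly an average of its sense vectors, so it represents neither sense faithfully. The vectors below are invented for illustration, not from any real embedding model.

    ```python
    import numpy as np

    # Hypothetical sense vectors for "bank" (illustration only).
    bank_finance = np.array([0.9, 0.1, 0.0])
    bank_river = np.array([0.0, 0.1, 0.9])

    # A word-level model conflates both senses into one vector,
    # roughly a frequency-weighted average of the sense vectors.
    bank_word = 0.5 * bank_finance + 0.5 * bank_river

    def cos(a, b):
        """Cosine similarity between two vectors."""
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # The conflated vector is equally, and only moderately, similar to
    # both senses: neither meaning is captured well.
    print(cos(bank_word, bank_finance), cos(bank_word, bank_river))
    ```

    Sense-level representations, the subject of the survey, avoid this by assigning "bank" one vector per sense instead of one vector overall.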

    Unsupervised Neural Machine Translation: A New Paradigm Based Only on Monolingual Texts

    This article presents UnsupNMT, a 3-year project of which the first year has already been completed. UnsupNMT proposes a radically different approach to machine translation: unsupervised translation, that is, translation based on monolingual data alone with no need for bilingual resources. This method is based on deep learning of temporal sequences and uses cutting-edge interlingual word representations in the form of cross-lingual word embeddings. This project is not only a highly innovative proposal but it also opens a new paradigm in machine translation which branches out to other disciplines, such as transfer learning. Despite the current limitations of unsupervised machine translation, the techniques developed are expected to have great repercussions in areas where machine translation achieves worse results, such as translation between languages which have little contact, e.g. German and Russian. UnsupNMT is a project funded by the Spanish Ministry of Economy, Industry and Competitiveness (TIN2017-91692-EXP).
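    A common building block behind cross-lingual word embeddings of the kind the abstract mentions is learning an orthogonal map that aligns two monolingual embedding spaces from a small seed dictionary, via the Procrustes solution. The sketch below uses synthetic data to stand in for the two languages' embedding matrices; the setup and names are assumptions for illustration, not the project's actual pipeline.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))              # source-language embeddings (seed words)
    R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    Y = X @ R_true                           # target-language embeddings of the same words

    # Orthogonal Procrustes: W = U V^T, where U S V^T is the SVD of X^T Y.
    # W is the rotation that best maps source vectors onto target vectors.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    W = U @ Vt

    print(np.allclose(X @ W, Y))             # mapped source matches target
    ```

    With such a shared space, a word's nearest neighbours in the other language can be retrieved without any parallel text, which is what makes translation from monolingual data alone conceivable.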