664 research outputs found

    From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

    Get PDF
    Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey focuses on the representation of meaning. We start from the theoretical background behind word vector space models and highlight one of their major limitations: the meaning conflation deficiency, which arises from representing a word with all its possible meanings as a single vector. Then, we explain how this deficiency can be addressed through a transition from the word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and applications for this type of representation, and provides an analysis of four of its important aspects: interpretability, sense granularity, adaptability to different domains and compositionality.Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence Researc

    Code-Switching with Word Senses for Pretraining in Neural Machine Translation

    Full text link
    Lexical ambiguity is a significant and pervasive challenge in Neural Machine Translation (NMT), with many state-of-the-art (SOTA) NMT systems struggling to handle polysemous words (Campolungo et al., 2022). The same holds for the NMT pretraining paradigm of denoising synthetic "code-switched" text (Pan et al., 2021; Iyer et al., 2023), where word senses are ignored in the noising stage -- leading to harmful sense biases in the pretraining data that are subsequently inherited by the resulting models. In this work, we introduce Word Sense Pretraining for Neural Machine Translation (WSP-NMT) - an end-to-end approach for pretraining multilingual NMT models leveraging word sense-specific information from Knowledge Bases. Our experiments show significant improvements in overall translation quality. Then, we show the robustness of our approach to scale to various challenging data and resource-scarce scenarios and, finally, report fine-grained accuracy improvements on the DiBiMT disambiguation benchmark. Our studies yield interesting and novel insights into the merits and challenges of integrating word sense information and structured knowledge in multilingual pretraining for NMT.Comment: EMNLP (Findings) 2023 Long Pape

    Towards zero-shot cross-lingual named entity disambiguation

    Get PDF
    [EN]In cross-Lingual Named Entity Disambiguation (XNED) the task is to link Named Entity mentions in text in some native language to English entities in a knowledge graph. XNED systems usually require training data for each native language, limiting their application for low resource languages with small amounts of training data. Prior work have proposed so-called zero-shot transfer systems which are only trained in English training data, but required native prior probabilities of entities with respect to mentions, which had to be estimated from native training examples, limiting their practical interest. In this work we present a zero-shot XNED architecture where, instead of a single disambiguation model, we have a model for each possible mention string, thus eliminating the need for native prior probabilities. Our system improves over prior work in XNED datasets in Spanish and Chinese by 32 and 27 points, and matches the systems which do require native prior information. We experiment with different multilingual transfer strategies, showing that better results are obtained with a purpose-built multilingual pre-training method compared to state-of-the-art generic multilingual models such as XLM-R. We also discovered, surprisingly, that English is not necessarily the most effective zero-shot training language for XNED into English. For instance, Spanish is more effective when training a zero-shot XNED system that dis-ambiguates Basque mentions with respect to an English knowledge graph.This work has been partially funded by the Basque Government (IXA excellence research group (IT1343-19) and DeepText project), Project BigKnowledge (Ayudas Fundacion BBVA a equipos de investigacion cientifica 2018) and via the IARPA BETTER Program contract 2019-19051600006 (ODNI, IARPA activity). Ander Barrena enjoys a post-doctoral grant ESPDOC18/101 from the UPV/EHU and also acknowledges the support of the NVIDIA Corporation with the donation of a Titan V GPU used for this research. The author thankfully acknowledges the computer resources at CTE-Power9 + V100 and technical support provided by Barcelona Supercomputing Center (RES-IM-2020-1-0020)

    A Tour of Explicit Multilingual Semantics: Word Sense Disambiguation, Semantic Role Labeling and Semantic Parsing

    Get PDF
    The recent advent of modern pretrained language models has sparked a revolution in Natural Language Processing (NLP), especially in multilingual and cross-lingual applications. Today, such language models have become the de facto standard for providing rich input representations to neural systems, achieving unprecedented results in an increasing range of benchmarks. However, questions that often arise are: firstly, whether current language models are, indeed, able to capture explicit, symbolic meaning; secondly, if they are, to what extent; thirdly, and perhaps more importantly, whether current approaches are capable of scaling across languages. In this cutting-edge tutorial, we will review recent efforts that have aimed at shedding light on meaning in NLP, with a focus on three key open problems in lexical and sentence-level semantics: Word Sense Disambiguation, Semantic Role Labeling, and Semantic Parsing. After a brief introduction, we will spotlight how state-of-the-art models tackle these tasks in multiple languages, showing where they excel and where they fail. We hope that this tutorial will broaden the audience interested in multilingual semantics and inspire researchers to further advance the field

    Introduction to the special issue on cross-language algorithms and applications

    Get PDF
    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and efective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.Postprint (published version

    Two Level Disambiguation Model for Query Translation

    Get PDF
    Selection of the most suitable translation among all translation candidates returned by bilingual dictionary has always been quiet challenging task for any cross language query translation. Researchers have frequently tried to use word co-occurrence statistics to determine the most probable translation for user query. Algorithms using such statistics have certain shortcomings, which are focused in this paper. We propose a novel method for ambiguity resolution, named ‘two level disambiguation model’. At first level disambiguation, the model properly weighs the importance of translation alternatives of query terms obtained from the dictionary. The importance factor measures the probability of a translation candidate of being selected as the final translation of a query term. This removes the problem of taking binary decision for translation candidates. At second level disambiguation, the model targets the user query as a single concept and deduces the translation of all query terms simultaneously, taking into account the weights of translation alternatives also. This is contrary to previous researches which select translation for each word in source language query independently. The experimental result with English-Hindi cross language information retrieval shows that the proposed two level disambiguation model achieved 79.53% and 83.50% of monolingual translation and 21.11% and 17.36% improvement compared to greedy disambiguation strategies in terms of MAP for short and long queries respectively
    • …
    corecore