Search CORE

2,174 research outputs found

From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

Author: Camacho-Collados Jose
Pilehvar Mohammad Taher
Publication venue
Publication date: 26/10/2018
Field of study

Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey focuses on the representation of meaning. We start from the theoretical background behind word vector space models and highlight one of their major limitations: the meaning conflation deficiency, which arises from representing a word with all its possible meanings as a single vector. Then, we explain how this deficiency can be addressed through a transition from the word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and applications for this type of representation, and provides an analysis of four of its important aspects: interpretability, sense granularity, adaptability to different domains and compositionality.Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence Researc

arXiv.org e-Print Archive

Online Research @ Cardiff

Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation

Author: Biemann Chris
Faralli Stefano
Marten Fide
Panchenko Alexander
Ponzetto Simone Paolo
Ruppert Eugen
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017
Field of study

Interpretability of a predictive model is a powerful feature that gains the trust of users in the correctness of the predictions. In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as hypernyms, usage examples, and images. We present a WSD system that bridges the gap between these two so far disconnected groups of methods. Namely, our system, providing access to several state-of-the-art WSD models, aims to be interpretable as a knowledge-based system while it remains completely unsupervised and knowledge-free. The presented tool features a Web interface for all-word disambiguation of texts that makes the sense predictions human readable by providing interpretable word sense inventories, sense representations, and disambiguation results. We provide a public API, enabling seamless integration.Comment: In Proceedings of the the Conference on Empirical Methods on Natural Language Processing (EMNLP 2017). 2017. Copenhagen, Denmark. Association for Computational Linguistic

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

MAnnheim DOCument Server

Natural language understanding: instructions for (Present and Future) use

Author: Navigli R.
Publication venue: 'International Joint Conferences on Artificial Intelligence'
Publication date: 01/01/2018
Field of study

In this paper I look at Natural Language Understanding, an area of Natural Language Processing aimed at making sense of text, through the lens of a visionary future: what do we expect a machine should be able to understand? and what are the key dimensions that require the attention of researchers to make this dream come true

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivio della ricerca- Università di Roma La Sapienza

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Author: CAMACHO COLLADOS Jose'
Iacobacci IGNACIO JAVIER
Mancini Massimiliano
Navigli Roberto
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017
Field of study

Word embeddings are widely used in Nat-ural Language Processing, mainly due totheir success in capturing semantic infor-mation from massive corpora. However,their creation process does not allow thedifferent meanings of a word to be auto-matically separated, as it conflates theminto a single vector. We address this issueby proposing a new model which learnsword and sense embeddings jointly. Ourmodel exploits large corpora and knowl-edge from semantic networks in order toproduce a unified vector space of wordand sense embeddings. We evaluate themain features of our approach both qual-itatively and quantitatively in a variety oftasks, highlighting the advantages of theproposed method in comparison to state-of-the-art word- and sense-based models

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

Archivio della ricerca- Università di Roma La Sapienza

Russian word sense induction by clustering averaged word embeddings

Author: Kutuzov Andrey
Publication venue
Publication date: 01/01/2018
Field of study

The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018). Our team was ranked 2nd for the wiki-wiki dataset (containing mostly homonyms) and 5th for the bts-rnc and active-dict datasets (containing mostly polysemous words) among all 19 participants. The method we employed was extremely naive. It implied representing contexts of ambiguous words as averaged word embedding vectors, using off-the-shelf pre-trained distributional models. Then, these vector representations were clustered with mainstream clustering techniques, thus producing the groups corresponding to the ambiguous word senses. As a side result, we show that word embedding models trained on small but balanced corpora can be superior to those trained on large but noisy data - not only in intrinsic evaluation, but also in downstream tasks like word sense induction.Comment: Proceedings of the 24rd International Conference on Computational Linguistics and Intellectual Technologies (Dialogue-2018

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives