Search CORE

5,593 research outputs found

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Author: CAMACHO COLLADOS Jose'
Iacobacci IGNACIO JAVIER
Mancini Massimiliano
Navigli Roberto
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017
Field of study

Word embeddings are widely used in Nat-ural Language Processing, mainly due totheir success in capturing semantic infor-mation from massive corpora. However,their creation process does not allow thedifferent meanings of a word to be auto-matically separated, as it conflates theminto a single vector. We address this issueby proposing a new model which learnsword and sense embeddings jointly. Ourmodel exploits large corpora and knowl-edge from semantic networks in order toproduce a unified vector space of wordand sense embeddings. We evaluate themain features of our approach both qual-itatively and quantitatively in a variety oftasks, highlighting the advantages of theproposed method in comparison to state-of-the-art word- and sense-based models

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

Archivio della ricerca- Università di Roma La Sapienza

From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

Author: Camacho-Collados Jose
Pilehvar Mohammad Taher
Publication venue
Publication date: 26/10/2018
Field of study

Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey focuses on the representation of meaning. We start from the theoretical background behind word vector space models and highlight one of their major limitations: the meaning conflation deficiency, which arises from representing a word with all its possible meanings as a single vector. Then, we explain how this deficiency can be addressed through a transition from the word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and applications for this type of representation, and provides an analysis of four of its important aspects: interpretability, sense granularity, adaptability to different domains and compositionality.Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence Researc

arXiv.org e-Print Archive

Online Research @ Cardiff

Russian word sense induction by clustering averaged word embeddings

Author: Kutuzov Andrey
Publication venue
Publication date: 01/01/2018
Field of study

The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018). Our team was ranked 2nd for the wiki-wiki dataset (containing mostly homonyms) and 5th for the bts-rnc and active-dict datasets (containing mostly polysemous words) among all 19 participants. The method we employed was extremely naive. It implied representing contexts of ambiguous words as averaged word embedding vectors, using off-the-shelf pre-trained distributional models. Then, these vector representations were clustered with mainstream clustering techniques, thus producing the groups corresponding to the ambiguous word senses. As a side result, we show that word embedding models trained on small but balanced corpora can be superior to those trained on large but noisy data - not only in intrinsic evaluation, but also in downstream tasks like word sense induction.Comment: Proceedings of the 24rd International Conference on Computational Linguistics and Intellectual Technologies (Dialogue-2018

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

Natural language understanding: instructions for (Present and Future) use

Author: Navigli R.
Publication venue: 'International Joint Conferences on Artificial Intelligence'
Publication date: 01/01/2018
Field of study

In this paper I look at Natural Language Understanding, an area of Natural Language Processing aimed at making sense of text, through the lens of a visionary future: what do we expect a machine should be able to understand? and what are the key dimensions that require the attention of researchers to make this dream come true

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivio della ricerca- Università di Roma La Sapienza

Deriving Verb Predicates By Clustering Verbs with Arguments

Author: Rouhizadeh Masoud
Schwartz Andy
Sedoc Joao
Ungar Lyle
Wijaya Derry
Publication venue
Publication date: 01/01/2017
Field of study

Hand-built verb clusters such as the widely used Levin classes (Levin, 1993) have proved useful, but have limited coverage. Verb classes automatically induced from corpus data such as those from VerbKB (Wijaya, 2016), on the other hand, can give clusters with much larger coverage, and can be adapted to specific corpora such as Twitter. We present a method for clustering the outputs of VerbKB: verbs with their multiple argument types, e.g. "marry(person, person)", "feel(person, emotion)." We make use of a novel low-dimensional embedding of verbs and their arguments to produce high quality clusters in which the same verb can be in different clusters depending on its argument type. The resulting verb clusters do a better job than hand-built clusters of predicting sarcasm, sentiment, and locus of control in tweets

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Embeddings for word sense disambiguation: an evaluation study

Author: Iacobacci IGNACIO JAVIER
Navigli Roberto
Pilehvar MOHAMMED TAHER
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016
Field of study

Recent years have seen a dramatic growth in the popularity of word embeddings mainly owing to their ability to capture semantic information from massive amounts of textual content. As a result, many tasks in Natural Language Processing have tried to take advantage of the potential of these distributional models. In this work, we study how word embeddings can be used in Word Sense Disambiguation, one of the oldest tasks in Natural Language Processing and Artificial Intelligence. We propose different methods through which word embeddings can be leveraged in a state-of-the-art supervised WSD system architecture, and perform a deep analysis of how different parameters affect performance. We show how a WSD system that makes use of word embeddings alone, if designed properly, can provide significant performance improvement over a state-of-the-art WSD system that incorporates several standard WSD features

Archivio della ricerca- Università di Roma La Sapienza

Using Neural Networks for Relation Extraction from Biomedical Literature

Author: A Koike
A Lamurias
A Lamurias
A Lamurias
A Lamurias
A Singhal
AV Aho
B Xu
CD Manning
CH Alves
D Westergaard
D Zhou
E Guresen
F Rinaldi
HC Wang
HM Müller
J Hastings
L Aroyo
M Ashburner
MY Kim
N Ma
N Peng
P Goyal
P Zweigenbaum
PN Robinson
Q Li
QL Nguyen
S HayKin
S Hochreiter
TR Gruber
W Wang
WWM Fleuren
Y Hao
Y Luo
Y Xu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/09/2020
Field of study

Using different sources of information to support automated extracting of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, namely, using neural networks algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead us to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been proved to enhance previous state-of-the-art results.Comment: Artificial Neural Networks book (Springer) - Chapter 1

arXiv.org e-Print Archive

Crossref