7,017 research outputs found
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning
Over the past years, distributed semantic representations have proved to be
effective and flexible keepers of prior knowledge to be integrated into
downstream applications. This survey focuses on the representation of meaning.
We start from the theoretical background behind word vector space models and
highlight one of their major limitations: the meaning conflation deficiency,
which arises from representing a word with all its possible meanings as a
single vector. Then, we explain how this deficiency can be addressed through a
transition from the word level to the more fine-grained level of word senses
(in its broader acceptation) as a method for modelling unambiguous lexical
meaning. We present a comprehensive overview of the wide range of techniques in
the two main branches of sense representation, i.e., unsupervised and
knowledge-based. Finally, this survey covers the main evaluation procedures and
applications for this type of representation, and provides an analysis of four
of its important aspects: interpretability, sense granularity, adaptability to
different domains and compositionality.Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence
Researc
Discriminating word senses with tourist walks in complex networks
Patterns of topological arrangement are widely used for both animal and human
brains in the learning process. Nevertheless, automatic learning techniques
frequently overlook these patterns. In this paper, we apply a learning
technique based on the structural organization of the data in the attribute
space to the problem of discriminating the senses of 10 polysemous words. Using
two types of characterization of meanings, namely semantical and topological
approaches, we have observed significative accuracy rates in identifying the
suitable meanings in both techniques. Most importantly, we have found that the
characterization based on the deterministic tourist walk improves the
disambiguation process when one compares with the discrimination achieved with
traditional complex networks measurements such as assortativity and clustering
coefficient. To our knowledge, this is the first time that such deterministic
walk has been applied to such a kind of problem. Therefore, our finding
suggests that the tourist walk characterization may be useful in other related
applications
Natural language understanding: instructions for (Present and Future) use
In this paper I look at Natural Language Understanding, an area of Natural Language Processing aimed at making sense of text, through the lens of a visionary future: what do we expect a machine should be able to understand? and what are the key dimensions that require the attention of researchers to make this dream come true
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.Comment: 29 pages, 5 figures, research proposa
- …