21,405 research outputs found
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning
Over the past years, distributed semantic representations have proved to be
effective and flexible keepers of prior knowledge to be integrated into
downstream applications. This survey focuses on the representation of meaning.
We start from the theoretical background behind word vector space models and
highlight one of their major limitations: the meaning conflation deficiency,
which arises from representing a word with all its possible meanings as a
single vector. Then, we explain how this deficiency can be addressed through a
transition from the word level to the more fine-grained level of word senses
(in its broader acceptation) as a method for modelling unambiguous lexical
meaning. We present a comprehensive overview of the wide range of techniques in
the two main branches of sense representation, i.e., unsupervised and
knowledge-based. Finally, this survey covers the main evaluation procedures and
applications for this type of representation, and provides an analysis of four
of its important aspects: interpretability, sense granularity, adaptability to
different domains and compositionality.Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence
Researc
Characterizing the impact of geometric properties of word embeddings on task performance
Analysis of word embedding properties to inform their use in downstream NLP
tasks has largely been studied by assessing nearest neighbors. However,
geometric properties of the continuous feature space contribute directly to the
use of embedding features in downstream models, and are largely unexplored. We
consider four properties of word embedding geometry, namely: position relative
to the origin, distribution of features in the vector space, global pairwise
distances, and local pairwise distances. We define a sequence of
transformations to generate new embeddings that expose subsets of these
properties to downstream models and evaluate change in task performance to
understand the contribution of each property to NLP models. We transform
publicly available pretrained embeddings from three popular toolkits (word2vec,
GloVe, and FastText) and evaluate on a variety of intrinsic tasks, which model
linguistic information in the vector space, and extrinsic tasks, which use
vectors as input to machine learning models. We find that intrinsic evaluations
are highly sensitive to absolute position, while extrinsic tasks rely primarily
on local similarity. Our findings suggest that future embedding models and
post-processing techniques should focus primarily on similarity to nearby
points in vector space.Comment: Appearing in the Third Workshop on Evaluating Vector Space
Representations for NLP (RepEval 2019). 7 pages + reference
Correcting Knowledge Base Assertions
The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB
Higher-Order Process Modeling: Product-Lining, Variability Modeling and Beyond
We present a graphical and dynamic framework for binding and execution of
business) process models. It is tailored to integrate 1) ad hoc processes
modeled graphically, 2) third party services discovered in the (Inter)net, and
3) (dynamically) synthesized process chains that solve situation-specific
tasks, with the synthesis taking place not only at design time, but also at
runtime. Key to our approach is the introduction of type-safe stacked
second-order execution contexts that allow for higher-order process modeling.
Tamed by our underlying strict service-oriented notion of abstraction, this
approach is tailored also to be used by application experts with little
technical knowledge: users can select, modify, construct and then pass
(component) processes during process execution as if they were data. We
illustrate the impact and essence of our framework along a concrete, realistic
(business) process modeling scenario: the development of Springer's
browser-based Online Conference Service (OCS). The most advanced feature of our
new framework allows one to combine online synthesis with the integration of
the synthesized process into the running application. This ability leads to a
particularly flexible way of implementing self-adaption, and to a particularly
concise and powerful way of achieving variability not only at design time, but
also at runtime.Comment: In Proceedings Festschrift for Dave Schmidt, arXiv:1309.455
- …