146 research outputs found
Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection
Modeling hypernymy, such as poodle is-a dog, is an important generalization
aid to many NLP tasks, such as entailment, coreference, relation extraction,
and question answering. Supervised learning from labeled hypernym sources, such
as WordNet, limits the coverage of these models, which can be addressed by
learning hypernyms from unlabeled text. Existing unsupervised methods either do
not scale to large vocabularies or yield unacceptably poor accuracy. This paper
introduces distributional inclusion vector embedding (DIVE), a
simple-to-implement unsupervised method of hypernym discovery via per-word
non-negative vector embeddings which preserve the inclusion property of word
contexts in a low-dimensional and interpretable space. In experimental
evaluations more comprehensive than any previous work of which we are
aware (evaluating on 11 datasets using multiple existing as well as newly
proposed scoring functions), we find that our method provides up to double the
precision of previous unsupervised embeddings and the highest average
performance, using a much more compact word representation and yielding many
new state-of-the-art results.
Comment: NAACL 2018
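As an illustration of the inclusion property this abstract describes, below is a minimal sketch of one way an asymmetric hypernymy score could be read off non-negative context embeddings. The function name and toy vectors are hypothetical; DIVE's actual training objective and scoring functions are not reproduced here, only the core idea that a hypernym's context mass should cover its hyponym's.

```python
import numpy as np

def inclusion_score(hypo_vec: np.ndarray, hyper_vec: np.ndarray) -> float:
    """Fraction of the hyponym's context mass covered by the hypernym.

    Under the distributional inclusion hypothesis, the contexts of a
    hyponym ("poodle") should be a subset of its hypernym's ("dog"), so
    for non-negative embeddings the hypernym should dominate the hyponym
    dimension-wise. A score near 1.0 suggests inclusion; the score is
    deliberately asymmetric.
    """
    covered = np.minimum(hypo_vec, hyper_vec).sum()
    return float(covered / max(hypo_vec.sum(), 1e-12))

# Toy non-negative vectors (hypothetical, not DIVE outputs).
poodle = np.array([0.9, 0.1, 0.0, 0.3])
dog = np.array([1.0, 0.5, 0.2, 0.4])
print(inclusion_score(poodle, dog))  # ~1.0: "dog" covers "poodle"
print(inclusion_score(dog, poodle))  # ~0.62: reverse direction scores lower
```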
Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings
We consider the task of inferring is-a relationships from large text corpora.
For this purpose, we propose a new method combining hyperbolic embeddings and
Hearst patterns. This approach allows us to set appropriate constraints for
inferring concept hierarchies from distributional contexts while also being
able to predict missing is-a relationships and to correct wrong extractions.
Moreover, in contrast with other methods, the hierarchical nature of
hyperbolic space allows us to learn highly efficient representations and to
improve the taxonomic consistency of the inferred hierarchies. Experimentally,
we show that our approach achieves state-of-the-art performance on several
commonly used benchmarks.
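The hyperbolic geometry this work builds on can be made concrete with the standard Poincaré-ball distance. A small sketch follows; the 2-D points are made-up examples, not embeddings from the paper.

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Distance in the Poincare ball model (inputs must have norm < 1).

    Distances blow up near the boundary, which is what lets a
    low-dimensional hyperbolic space embed tree-like is-a hierarchies:
    general concepts sit near the origin, specific ones near the rim.
    """
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

# Made-up 2-D points: a general concept near the origin, a specific one
# near the boundary.
entity = np.array([0.05, 0.0])
poodle = np.array([0.90, 0.30])
print(poincare_distance(entity, poodle))
```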
Hypernym Detection Using Strict Partial Order Networks
This paper introduces Strict Partial Order Networks (SPON), a novel neural
network architecture designed to enforce asymmetry and transitive properties as
soft constraints. We apply it to induce hypernymy relations by training with
is-a pairs. We also present an augmented variant of SPON that can generalize
type information learned for in-vocabulary terms to previously unseen ones. An
extensive evaluation over eleven benchmarks across different tasks shows that
SPON consistently either outperforms or attains the state of the art on all but
one of these benchmarks.
Comment: 8 pages
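The abstract does not spell out how SPON encodes asymmetry and transitivity, so the sketch below instead uses the well-known order-embedding violation penalty, a different but related way of treating a strict partial order as a soft constraint. All names and vectors are illustrative, not SPON's architecture.

```python
import numpy as np

def order_violation(hypo: np.ndarray, hyper: np.ndarray) -> float:
    """Order-embedding penalty: ||max(0, hyper - hypo)||^2.

    The penalty is zero exactly when hyper <= hypo coordinate-wise.
    Coordinate-wise <= is a genuine partial order, so the zero-penalty
    relation is antisymmetric and transitive by construction; minimizing
    the penalty on observed is-a pairs enforces both properties as soft
    constraints.
    """
    return float(np.sum(np.maximum(0.0, hyper - hypo) ** 2))

# Illustrative vectors: the hypernym ("dog") is dominated by the hyponym.
poodle = np.array([0.9, 0.8, 0.7])
dog = np.array([0.5, 0.6, 0.2])
print(order_violation(poodle, dog))  # 0.0: pair respects the order
print(order_violation(dog, poodle))  # 0.45: reversed pair is penalized
```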
Fully-unsupervised embeddings-based hypernym discovery
Funding: Supported in part by the Sardegna Ricerche project OKgraph (CRP 120) and the MIUR PRIN 2017 (2019-2022) project HOPE (High quality Open data Publishing and Enrichment).
Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations
We present a framework for building unsupervised representations of entities
and their compositions, where each entity is viewed as a probability
distribution rather than a vector embedding. In particular, this distribution
is supported over the contexts which co-occur with the entity and are embedded
in a suitable low-dimensional space. This enables us to consider representation
learning from the perspective of Optimal Transport and take advantage of its
tools such as Wasserstein distance and barycenters. We elaborate on how the method
can be applied for obtaining unsupervised representations of text and
illustrate the performance (quantitatively as well as qualitatively) on tasks
such as measuring sentence similarity, word entailment and similarity, where we
empirically observe significant gains (e.g., a 4.1% relative improvement over
Sent2vec and GenSen).
The key benefits of the proposed approach include: (a) capturing uncertainty
and polysemy via modeling the entities as distributions, (b) utilizing the
underlying geometry of the particular task (with the ground cost), (c)
simultaneously providing interpretability with the notion of optimal transport
between contexts and (d) easy applicability on top of existing point embedding
methods. The code, as well as prebuilt histograms, are available under
https://github.com/context-mover/.
Comment: AISTATS 2020. Also accepted previously at the ICLR 2019 DeepGenStruct Workshop.
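To make the optimal-transport machinery concrete, here is a small sketch of computing a Wasserstein distance between two context histograms with the POT library (ot.dist and ot.emd2 are real POT functions). The context embeddings and histograms below are random stand-ins; this is not the authors' released code.

```python
import numpy as np
import ot  # Python Optimal Transport: pip install pot

# Stand-in data: 50 context words with 20-d point embeddings, and each
# entity represented as a histogram over those contexts.
rng = np.random.RandomState(0)
context_embs = rng.randn(50, 20)
M = ot.dist(context_embs, context_embs, metric='euclidean')  # ground cost

def context_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Exact Wasserstein distance between two context histograms."""
    return float(ot.emd2(p, q, M))

p = rng.rand(50)
p /= p.sum()  # histogram for entity A
q = rng.rand(50)
q /= q.sum()  # histogram for entity B
print(context_distance(p, q))
```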