146 research outputs found

Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection

Authors: Chang Haw-Shiuan; McCallum Andrew; Vilnis Luke; Wang ZiYun
Publication date: 01/01/2018

Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, coreference, relation extraction, and question answering. Supervised learning from labeled hypernym sources, such as WordNet, limits the coverage of these models, which can be addressed by learning hypernyms from unlabeled text. Existing unsupervised methods either do not scale to large vocabularies or yield unacceptably poor accuracy. This paper introduces distributional inclusion vector embedding (DIVE), a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings which preserve the inclusion property of word contexts in a low-dimensional and interpretable space. In experimental evaluations more comprehensive than any previous literature of which we are aware (evaluating on 11 datasets using multiple existing as well as newly proposed scoring functions), we find that our method provides up to double the precision of previous unsupervised embeddings, and the highest average performance, using a much more compact word representation, and yielding many new state-of-the-art results. Comment: NAACL 201

arXiv.org e-Print Archive

Crossref

Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings

Authors: Kiela Douwe; Le Matt; Nickel Maximilian; Papaxanthos Laetitia; Roller Stephen
Publication date: 01/01/2019

We consider the task of inferring is-a relationships from large text corpora. For this purpose, we propose a new method combining hyperbolic embeddings and Hearst patterns. This approach allows us to set appropriate constraints for inferring concept hierarchies from distributional contexts while also being able to predict missing is-a relationships and to correct wrong extractions. Moreover, and in contrast with other methods, the hierarchical nature of hyperbolic space allows us to learn highly efficient representations and to improve the taxonomic consistency of the inferred hierarchies. Experimentally, we show that our approach achieves state-of-the-art performance on several commonly used benchmarks.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Hypernym Detection Using Strict Partial Order Networks

Authors: Chowdhury Md Faisal Mahbub; Dash Sarthak; Fauceglia Nicolas Rodolfo; Gliozzo Alfio; Mihindukulasooriya Nandana
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 22/11/2019

This paper introduces Strict Partial Order Networks (SPON), a novel neural network architecture designed to enforce asymmetry and transitive properties as soft constraints. We apply it to induce hypernymy relations by training with is-a pairs. We also present an augmented variant of SPON that can generalize type information learned for in-vocabulary terms to previously unseen ones. An extensive evaluation over eleven benchmarks across different tasks shows that SPON consistently either outperforms or attains the state of the art on all but one of these benchmarks. Comment: 8 pages

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Fully-unsupervised embeddings-based hypernym discovery

Authors: Atzori Maurizio; Balloccu Simone
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Funding: Supported in part by Sardegna Ricerche project OKgraph (CRP 120) and MIUR PRIN 2017 (2019-2022) project HOPE (High quality Open data Publishing and Enrichment). Peer reviewed. Publisher PD

Multidisciplinary Digital Publishing Institute

Aberdeen University Research

Archivio istituzionale della ricerca - Università di Cagliari

Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations

Authors: Dieuleveut Aymeric; Hug Andreas; Jaggi Martin; Singh Sidak Pal
Publication date: 29/02/2020

We present a framework for building unsupervised representations of entities and their compositions, where each entity is viewed as a probability distribution rather than a vector embedding. In particular, this distribution is supported over the contexts which co-occur with the entity and are embedded in a suitable low-dimensional space. This enables us to consider representation learning from the perspective of Optimal Transport and take advantage of its tools such as Wasserstein distance and barycenters. We elaborate how the method can be applied for obtaining unsupervised representations of text and illustrate the performance (quantitatively as well as qualitatively) on tasks such as measuring sentence similarity, word entailment and similarity, where we empirically observe significant gains (e.g., 4.1% relative improvement over Sent2vec, GenSen). The key benefits of the proposed approach include: (a) capturing uncertainty and polysemy via modeling the entities as distributions, (b) utilizing the underlying geometry of the particular task (with the ground cost), (c) simultaneously providing interpretability with the notion of optimal transport between contexts and (d) easy applicability on top of existing point embedding methods. The code, as well as prebuilt histograms, are available under https://github.com/context-mover/. Comment: AISTATS 2020. Also, accepted previously at ICLR 2019 DeepGenStruct Workshop

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne