
    Taxonomy Induction using Hypernym Subsequences

    We propose a novel, semi-supervised approach towards domain taxonomy induction from an input vocabulary of seed terms. Unlike all previous approaches, which typically extract direct hypernym edges for terms, our approach utilizes a novel probabilistic framework to extract hypernym subsequences. Taxonomy induction from extracted subsequences is cast as an instance of the minimum-cost flow problem on a carefully designed directed graph. Through experiments, we demonstrate that our approach outperforms state-of-the-art taxonomy induction approaches across four languages. Importantly, we also show that our approach is robust to the presence of noise in the input vocabulary. To the best of our knowledge, no previous approaches have been empirically proven to manifest noise-robustness in the input vocabulary.

    SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2)

    This paper describes the second edition of the shared task on Taxonomy Extraction Evaluation organised as part of SemEval 2016. This task aims to extract hypernym-hyponym relations between a given list of domain-specific terms and then to construct a domain taxonomy based on them. TExEval-2 introduced a multilingual setting for this task, covering four languages (English, Dutch, Italian and French) and domains as diverse as environment, food and science. A total of 62 runs submitted by 5 different teams were evaluated using structural measures, by comparison with gold standard taxonomies and by manual quality assessment of novel relations. Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 (INSIGHT

    Improving Hypernymy Extraction with Distributional Semantic Classes

    In this paper, we show how distributionally-induced semantic classes can be helpful for extracting hypernyms. We present methods for inducing sense-aware semantic classes using distributional semantics and for using these induced semantic classes to filter noisy hypernymy relations. Denoising of hypernyms is performed by labeling each semantic class with its hypernyms. On the one hand, this allows us to filter out wrong extractions using the global structure of distributionally similar senses. On the other hand, we infer missing hypernyms via label propagation to cluster terms. We conduct a large-scale crowdsourcing study showing that processing automatically extracted hypernyms with our approach improves the quality of hypernymy extraction in terms of both precision and recall. Furthermore, we show the utility of our method in the domain taxonomy induction task, achieving state-of-the-art results on a SemEval'16 task on taxonomy induction. Comment: In Proceedings of the 11th Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan

    A supervised approach to taxonomy extraction using word embeddings

    Large organizations commonly generate large collections of texts, and making sense of these collections is a significant challenge. One method for handling this is to organize the concepts into a hierarchical structure such that similar concepts can be discovered and easily browsed. This approach was the subject of a recent evaluation campaign, TExEval; however, the results of this task showed that none of the systems consistently outperformed a relatively simple baseline. In order to solve this issue, we propose a new method that uses supervised learning to combine multiple features, including the baseline features, with a support vector machine classifier. We show that this outperforms the baseline and thus provides a stronger method for identifying taxonomic relations than previous methods.

    Learning Word Subsumption Projections for the Russian Language

    The semantic relations of hypernymy and hyponymy are widely used in various natural language processing tasks for modelling subsumptions in common sense reasoning. Since the popularisation of distributional semantics, significant attention has been paid to applying word embeddings to induce relations between words. In this paper, we show our preliminary results on adopting the projection learning technique for computing hypernyms from hyponyms using word embeddings. We also conduct a series of experiments on the Russian language and release open source software for learning hyponym-hypernym projections using both CPUs and GPUs, implemented with the TensorFlow machine learning framework.

    KIND: Un proyecto de inducción automática de taxonomías léxicas

    This paper presents a description of the Kind Project, an algorithm for automatic induction of lexical taxonomies from corpora. Taxonomy induction consists of the discovery of hypernymy relations between pairs of single-word or multiword nouns, and the integration of these pairs into larger structures. The proposed methodology is fundamentally statistical and its requirement for linguistic resources is minimal, a characteristic that facilitates the reproduction of experiments in different languages. The languages for which results have been obtained so far are Spanish, English and French. The implementation of the algorithm and an online demo are available as open source on the project's website.