24 research outputs found

Chasing Hypernyms in Vector Spaces with Entropy

Author: Lenci Alessandro
Lu Q.
Santus E.
Schulte im Walde S.
Publication venue: EACL (European chapter of the Association for Computational Linguistics)
Publication date: 01/01/2014

In this paper, we introduce SLQS, a new entropy-based measure for the unsupervised identification of hypernymy and its directionality in Distributional Semantic Models (DSMs). SLQS is assessed through two tasks: (i) identifying the hypernym in hyponym-hypernym pairs, and (ii) discriminating hypernymy among various semantic relations. In both tasks, SLQS outperforms other state-of-the-art measures.

Archivio della Ricerca - Università di Pisa

Unsupervised Measure of Word Similarity: How to Outperform Co-occurrence and Vector Cosine in VSMs

Author: Chiu Tin-Shing
Huang Chu-Ren
Lenci Alessandro
Lu Qin
Santus Enrico
Publication date: 01/01/2016

In this paper, we claim that vector cosine, which is generally considered among the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by an unsupervised measure that calculates the extent of the intersection among the most mutually dependent contexts of the target words. To prove it, we describe and evaluate APSyn, a variant of Average Precision that, without any optimization, outperforms the vector cosine and the co-occurrence on the standard ESL test set, with an improvement ranging between +9.00% and +17.98%, depending on the number of chosen top contexts. Comment: in AAAI 2016. arXiv admin note: substantial text overlap with arXiv:1603.0870

arXiv.org e-Print Archive

The Hong Kong Polytechnic University Pao Yue-kong Library

Association for the Advancement of Artificial Intelligence: AAAI Publications

ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms

Author: Chiu Tin-Shing
Huang Chu-Ren
Lenci Alessandro
Lu Qin
Santus Enrico
Publication date: 01/01/2016

In this paper, we describe ROOT13, a supervised system for the classification of hypernyms, co-hyponyms and random words. The system relies on a Random Forest algorithm and 13 unsupervised corpus-based features. We evaluate it with a 10-fold cross-validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i.e., adjectives, nouns and verbs). When all the classes are present, ROOT13 achieves an F1 score of 88.3%, against a baseline of 57.6% (vector cosine). When the classification is binary, ROOT13 achieves the following results: hypernyms-co-hyponyms (93.4% vs. 60.2%), hypernyms-random (92.3% vs. 65.5%) and co-hyponyms-random (97.3% vs. 81.5%). Our results are competitive with state-of-the-art models. Comment: in AAAI 201

arXiv.org e-Print Archive

The Hong Kong Polytechnic University Pao Yue-kong Library

Association for the Advancement of Artificial Intelligence: AAAI Publications

Path Ranking with Attention to Type Hierarchies

Author: Chernova Sonia
Daruna Angel
Kira Zsolt
Liu Weiyu
Publication date: 26/11/2019

The objective of the knowledge base completion problem is to infer missing information from existing facts in a knowledge base. Prior work has demonstrated the effectiveness of path-ranking based methods, which solve the problem by discovering observable patterns in knowledge graphs, consisting of nodes representing entities and edges representing relations. However, these patterns either lack accuracy because they rely solely on relations or cannot easily generalize due to the direct use of specific entity information. We introduce Attentive Path Ranking, a novel path pattern representation that leverages type hierarchies of entities to both avoid ambiguity and maintain generalization. Then, we present an end-to-end trained attention-based RNN model to discover the new path patterns from data. Experiments conducted on the benchmark knowledge base completion datasets WN18RR and FB15k-237 demonstrate that the proposed model outperforms existing methods on the fact prediction task by statistically significant margins of 26% and 10%, respectively. Furthermore, quantitative and qualitative analyses show that the path patterns balance generalization and discrimination. Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets

Author: Chiu Tin-Shing
Huang Chu-Ren
Lenci Alessandro
Lu Qin
Santus Enrico
Publication date: 01/01/2016

In this paper, we claim that Vector Cosine, which is generally considered one of the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting this intersection according to the rank of the shared contexts in the dependency ranked lists. This claim comes from the hypothesis that similar words do not simply occur in similar contexts, but share a larger portion of their most relevant contexts than other related words do. To prove it, we describe and evaluate APSyn, a variant of Average Precision that, independently of the adopted parameters, outperforms the Vector Cosine and the co-occurrence on the ESL and TOEFL test sets. In the best setting, APSyn reaches 0.73 accuracy on the ESL dataset and 0.70 accuracy on the TOEFL dataset, thereby beating the non-English US college applicants (whose average, as reported in the literature, is 64.50%) and several state-of-the-art approaches. Comment: in LREC 201

arXiv.org e-Print Archive

The Hong Kong Polytechnic University Pao Yue-kong Library

Archivio della Ricerca - Università di Pisa
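
The abstract above describes APSyn only verbally: take the top-ranked contexts of each word, intersect them, and weight each shared context by its rank in the two lists. A minimal Python sketch of that idea follows. It assumes the score is the sum, over shared top-N contexts, of the inverse of their average rank; that formula, the function names `top_contexts` and `apsyn`, and the toy association scores are illustrative assumptions, not details taken from the listing.

```python
from typing import Dict, List


def top_contexts(weights: Dict[str, float], n: int) -> List[str]:
    """Return the n contexts with the highest association score, best first.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    return sorted(weights, key=lambda c: (-weights[c], c))[:n]


def apsyn(w1: Dict[str, float], w2: Dict[str, float], n: int = 1000) -> float:
    """Rank-weighted overlap of the top-n contexts of two words.

    Assumed formula: sum over shared contexts of 1 / average rank.
    """
    ranks1 = {c: r for r, c in enumerate(top_contexts(w1, n), start=1)}
    ranks2 = {c: r for r, c in enumerate(top_contexts(w2, n), start=1)}
    shared = ranks1.keys() & ranks2.keys()
    return sum(1.0 / ((ranks1[c] + ranks2[c]) / 2.0) for c in shared)


# Hypothetical association scores (e.g. PMI-like weights) for three words.
car = {"drive": 5.0, "road": 4.0, "wheel": 3.0, "engine": 2.0}
truck = {"drive": 4.5, "wheel": 4.0, "load": 3.5, "road": 1.0}
banana = {"eat": 5.0, "peel": 4.0, "yellow": 3.0, "road": 0.5}

print(apsyn(car, truck, n=3))   # shares "drive" and "wheel" near the top
print(apsyn(car, banana, n=3))  # no shared top-3 contexts
```

With these toy values, "car" and "truck" share two highly ranked contexts and score above zero, while "car" and "banana" share none within the top three. Unlike vector cosine, only the N most associated contexts contribute, so N is the main parameter to choose.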

SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2)

Author: Bordea Georgeta
Buitelaar Paul
Lefever Els
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2016

This paper describes the second edition of the shared task on Taxonomy Extraction Evaluation organised as part of SemEval 2016. This task aims to extract hypernym-hyponym relations between a given list of domain-specific terms and then to construct a domain taxonomy based on them. TExEval-2 introduced a multilingual setting for this task, covering four languages (English, Dutch, Italian and French) and domains as diverse as environment, food and science. A total of 62 runs submitted by 5 different teams were evaluated using structural measures, by comparison with gold standard taxonomies and by manual quality assessment of novel relations. Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 (INSIGHT

Ghent University Academic Bibliography

Irish Universities

Access to Research at National University of Ireland, Galway

Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings

Author: Kiela Douwe
Le Matt
Nickel Maximilian
Papaxanthos Laetitia
Roller Stephen
Publication date: 01/01/2019

We consider the task of inferring is-a relationships from large text corpora. For this purpose, we propose a new method combining hyperbolic embeddings and Hearst patterns. This approach allows us to set appropriate constraints for inferring concept hierarchies from distributional contexts while also being able to predict missing is-a relationships and to correct wrong extractions. Moreover, and in contrast with other methods, the hierarchical nature of hyperbolic space allows us to learn highly efficient representations and to improve the taxonomic consistency of the inferred hierarchies. Experimentally, we show that our approach achieves state-of-the-art performance on several commonly used benchmarks.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref