521 research outputs found

Taking antonymy mask off in vector space

Author: Huang Chu Ren
Lenci Alessandro
Lu Qin
Santus Enrico
Publication venue: Phuket, Thailand
Publication date: 01/01/2014

Automatic detection of antonymy is an important task in Natural Language Processing (NLP) for Information Retrieval (IR), Ontology Learning (OL) and many other semantic applications. However, current unsupervised approaches to antonymy detection are still not fully effective because they cannot discriminate antonyms from synonyms. In this paper, we introduce APAnt, a new Average-Precision-based measure for the unsupervised discrimination of antonymy from synonymy using Distributional Semantic Models (DSMs). APAnt makes use of Average Precision to estimate the extent and salience of the intersection among the most descriptive contexts of two target words. Evaluation shows that the proposed method is able to distinguish antonyms and synonyms with high accuracy across different parts of speech, including nouns, adjectives and verbs. APAnt outperforms the vector cosine and a baseline model implementing the co-occurrence hypothesis.
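
The idea of scoring the overlap between two words' most descriptive contexts can be illustrated with a minimal Python sketch. It assumes each word already comes with a list of contexts ranked by salience (most salient first), and it weights each shared context by the inverse of its best rank. The function names, the `top_n` parameter, this particular weighting and the toy context lists are illustrative assumptions, not the definition or data given in the paper.

```python
def shared_context_score(ranked_a, ranked_b, top_n=100):
    """Sum 1/min(rank) over the contexts shared by both top-N lists.

    ranked_a, ranked_b: lists of context words, most salient first.
    Assumption: a shared context counts more when it ranks high
    in at least one of the two lists.
    """
    rank_a = {ctx: i for i, ctx in enumerate(ranked_a[:top_n], start=1)}
    rank_b = {ctx: i for i, ctx in enumerate(ranked_b[:top_n], start=1)}
    shared = rank_a.keys() & rank_b.keys()
    return sum(1.0 / min(rank_a[c], rank_b[c]) for c in shared)


def apant_like(ranked_a, ranked_b, top_n=100):
    """Inverse of the shared-context score.

    Higher values mean a smaller or less salient intersection,
    which this sketch treats as a hint of antonymy rather than synonymy.
    """
    score = shared_context_score(ranked_a, ranked_b, top_n)
    return float("inf") if score == 0 else 1.0 / score


# Invented toy context lists, for illustration only.
hot = ["temperature", "weather", "water", "summer", "spicy"]
cold = ["temperature", "weather", "water", "winter", "ice"]
print(apant_like(hot, cold))
```

In practice the ranked lists would come from a DSM, for example by sorting each word's contexts by an association score; that step is outside this sketch.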

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Archivio della Ricerca - Università di Pisa

Assessing Lexical-Semantic Regularities in Portuguese Word Embeddings

Author: Alves Ana
Gonçalo Oliveira Hugo
Sousa Tiago
Publication venue: Universidad Internacional de La Rioja
Publication date: 01/01/2021

Models of word embeddings are often assessed when solving syntactic and semantic analogies. Among the latter, we are interested in relations that one would find in lexical-semantic knowledge bases like WordNet, also covered by some analogy test sets for English. Briefly, this paper aims to study how well pretrained Portuguese word embeddings capture such relations. For this purpose, we created a new test, dubbed TALES, with an exclusive focus on Portuguese lexical-semantic relations, acquired from lexical resources. With TALES, we analyse the performance of methods previously used for solving analogies, on different models of Portuguese word embeddings. Accuracies were clearly below the state of the art in analogies of other kinds, which shows that TALES is a challenging test, mainly due to the nature of lexical-semantic relations, i.e., there are many instances sharing the same argument, thus allowing for several correct answers, sometimes too many to be all included in the dataset. We further inspect the results of the best performing combination of method and model to find that some acceptable answers had been considered incorrect. This was mainly due to the lack of coverage by the source lexical resources and suggests that word embeddings may be a useful source of information for enriching those resources, something we also discuss.
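
A minimal Python sketch may help show why several correct answers complicate analogy scoring. It uses the widely known vector-offset approach (often called 3CosAdd) as a stand-in, since the abstract does not name the methods evaluated. The embeddings are invented toy vectors, and `solve_analogy` and `accuracy` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector offset b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    # The question words themselves are excluded from the candidates.
    candidates = [w for w in emb if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(emb[w], target))


def accuracy(emb, questions):
    """Fraction of questions whose prediction is in the accepted answer set."""
    hits = sum(solve_analogy(emb, a, b, c) in answers
               for a, b, c, answers in questions)
    return hits / len(questions)


# Invented toy vectors, for illustration only.
emb = {
    "cão": [1.0, 0.0, 0.0],
    "animal": [1.0, 1.0, 0.0],
    "rosa": [0.0, 0.0, 1.0],
    "flor": [0.0, 1.0, 1.0],
    "planta": [0.0, 1.0, 0.8],
    "carro": [1.0, 0.0, 1.0],
}

# The same prediction is right or wrong depending on which
# acceptable answers the test set happens to list.
complete = [("cão", "animal", "rosa", {"flor", "planta"})]
incomplete = [("cão", "animal", "rosa", {"planta"})]
print(accuracy(emb, complete), accuracy(emb, incomplete))
```

With an incomplete answer set, an acceptable prediction is counted as an error, which is the kind of coverage gap in the source resources that the abstract describes.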

Estudo Geral

Re-UNIR

Breaking NLI Systems with Sentences that Require Simple Lexical Inferences

Author: Glockner Max
Goldberg Yoav
Shwartz Vered
Publication date: 01/01/2018

We create a new NLI test set that shows the deficiency of state-of-the-art models in inferences that require lexical and world knowledge. The new examples are simpler than the SNLI test set, containing sentences that differ by at most one word from sentences in the training set. Yet, the performance on the new test set is substantially worse across systems trained on SNLI, demonstrating that these systems are limited in their generalization ability, failing to capture many simple inferences. Comment: 6 pages, short paper at ACL 201

arXiv.org e-Print Archive

TUbiblio

Crossref