Taking antonymy mask off in vector space
Automatic detection of antonymy is an important task in Natural Language Processing (NLP) for Information Retrieval (IR), Ontology Learning (OL) and many other semantic applications. However, current unsupervised approaches to antonymy detection are still not fully effective because they cannot discriminate antonyms from synonyms. In this paper, we introduce APAnt, a new Average-Precision-based measure for the unsupervised discrimination of antonymy from synonymy using Distributional Semantic Models (DSMs). APAnt makes use of Average Precision to estimate the extent and salience of the intersection among the most descriptive contexts of two target words. Evaluation shows that the proposed method is able to distinguish antonyms and synonyms with high accuracy across different parts of speech, including nouns, adjectives and verbs. APAnt outperforms the vector cosine and a baseline model implementing the co-occurrence hypothesis.
Unsupervised Measure of Word Similarity: How to Outperform Co-occurrence and Vector Cosine in VSMs
In this paper, we claim that vector cosine, which is generally considered
among the most efficient unsupervised measures for identifying word similarity
in Vector Space Models, can be outperformed by an unsupervised measure that
calculates the extent of the intersection among the most mutually dependent
contexts of the target words. To prove it, we describe and evaluate APSyn, a
variant of the Average Precision that, without any optimization, outperforms
the vector cosine and the co-occurrence on the standard ESL test set, with an
improvement ranging between +9.00% and +17.98%, depending on the number of
chosen top contexts.
Comment: in AAAI 2016. arXiv admin note: substantial text overlap with arXiv:1603.0870
What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets
In this paper, we claim that Vector Cosine, which is generally considered one
of the most efficient unsupervised measures for identifying word similarity in
Vector Space Models, can be outperformed by a completely unsupervised measure
that evaluates the extent of the intersection among the most associated
contexts of two target words, weighting such intersection according to the rank
of the shared contexts in the dependency ranked lists. This claim comes from
the hypothesis that similar words do not simply occur in similar contexts, but
they share a larger portion of their most relevant contexts compared to other
related words. To prove it, we describe and evaluate APSyn, a variant of
Average Precision that, independently of the adopted parameters, outperforms
the Vector Cosine and the co-occurrence on the ESL and TOEFL test sets. In the
best setting, APSyn reaches 0.73 accuracy on the ESL dataset and 0.70 accuracy
on the TOEFL dataset, thereby beating the non-English US college applicants
(whose average, as reported in the literature, is 64.50%) and several
state-of-the-art approaches.
Comment: in LREC 201
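The rank-weighted intersection described in this abstract can be illustrated with a short sketch. The abstract does not spell out the formula, so the code below assumes that each shared context contributes the inverse of its average rank in the two lists. The names `ranked_contexts`, `apsyn`, and `answer_question`, the cut-off `n`, and the precomputed association scores are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Dict, List


def ranked_contexts(assoc: Dict[str, float], n: int) -> Dict[str, int]:
    """Map each of the n most associated contexts to its 1-based rank."""
    best = sorted(assoc, key=assoc.get, reverse=True)[:n]
    return {ctx: rank for rank, ctx in enumerate(best, start=1)}


def apsyn(assoc_a: Dict[str, float], assoc_b: Dict[str, float], n: int = 100) -> float:
    """Rank-weighted overlap of the top-n contexts of two words.

    Assumption: a shared context adds 1 / (average of its two ranks),
    so contexts ranked high in both lists count the most.
    """
    ranks_a = ranked_contexts(assoc_a, n)
    ranks_b = ranked_contexts(assoc_b, n)
    shared = ranks_a.keys() & ranks_b.keys()
    # 1 / ((ra + rb) / 2) simplifies to 2 / (ra + rb)
    return sum(2.0 / (ranks_a[c] + ranks_b[c]) for c in shared)


def answer_question(target: str, candidates: List[str],
                    assoc: Dict[str, Dict[str, float]], n: int = 100) -> str:
    """Pick the candidate whose top contexts overlap most with the target's,
    as in a multiple-choice synonym question."""
    return max(candidates, key=lambda cand: apsyn(assoc[target], assoc[cand], n))
```

In this sketch, `assoc` maps each word to a dictionary of context association scores that would have to be computed beforehand from a corpus; the choice of association measure and of `n` is left open here.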
Nine Features in a Random Forest to Learn Taxonomical Semantic Relations
ROOT9 is a supervised system for the classification of hypernyms, co-hyponyms
and random words that is derived from the already introduced ROOT13 (Santus et
al., 2016). It relies on a Random Forest algorithm and nine unsupervised
corpus-based features. We evaluate it with a 10-fold cross validation on 9,600
pairs, equally distributed among the three classes and involving several
Parts-Of-Speech (i.e. adjectives, nouns and verbs). When all the classes are
present, ROOT9 achieves an F1 score of 90.7%, against a baseline of 57.2%
(vector cosine). When the classification is binary, ROOT9 achieves the
following results against the baseline: hypernyms-co-hyponyms 95.7% vs. 69.8%,
hypernyms-random 91.8% vs. 64.1% and co-hyponyms-random 97.8% vs. 79.4%. In
order to compare the performance with the state-of-the-art, we have also
evaluated ROOT9 in subsets of the Weeds et al. (2014) datasets, proving that it
is in fact competitive. Finally, we investigated whether the system learns the
semantic relation or simply learns the prototypical hypernyms, as claimed by
Levy et al. (2015). The second possibility seems to be the more likely, even
though ROOT9 can be trained on negative examples (i.e., switched hypernyms) to
drastically reduce this bias.
Comment: in LREC 201
When Similarity Becomes Opposition: Synonyms and Antonyms Discrimination in DSMs
This paper analyzes the concept of opposition and describes a fully unsupervised method for its automatic discrimination from near-synonymy in Distributional Semantic Models (DSMs). The discriminating method is based on the hypothesis that, even though both near-synonyms and opposites are mostly distributionally similar, opposites differ from each other in at least one dimension of meaning, which can be assumed to be salient. This hypothesis has been implemented in APAnt, a distributional measure that evaluates the extent of the intersection among the most relevant contexts of two words (where relevance is measured as mutual dependency) and its saliency (i.e. the average rank of the shared contexts in the mutual-dependency-sorted list of contexts). The measure, previously introduced in some pilot studies, is presented here with two variants. Evaluation shows that it outperforms three baselines in an antonym retrieval task: the vector cosine, a baseline implementing the co-occurrence hypothesis, and a random rank. This paper describes the algorithm in detail and analyzes its current limitations, suggesting that extensions may be developed for discriminating antonyms not only from near-synonyms but also from other semantic relations. During the evaluation, we noticed that APAnt also has a particular preference for hypernyms.
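The abstract does not state the formula behind APAnt, so the following is only a sketch under an explicit assumption: the score is taken as the inverse of a rank-weighted overlap of the two words' top contexts, so that pairs sharing few salient contexts score high. The function name `apant`, the cut-off `n`, and the precomputed association scores are hypothetical.

```python
from typing import Dict


def apant(assoc_a: Dict[str, float], assoc_b: Dict[str, float], n: int = 100) -> float:
    """Score that grows as two words share fewer of their most salient contexts.

    Assumption (not taken from the abstract): the score is the inverse of
    the sum, over shared top-n contexts, of 1 / (average rank).
    """
    top_a = sorted(assoc_a, key=assoc_a.get, reverse=True)[:n]
    top_b = sorted(assoc_b, key=assoc_b.get, reverse=True)[:n]
    rank_b = {ctx: rank for rank, ctx in enumerate(top_b, start=1)}

    overlap = sum(
        2.0 / (rank_a + rank_b[ctx])  # same as 1 / average rank
        for rank_a, ctx in enumerate(top_a, start=1)
        if ctx in rank_b
    )
    # With no shared context at all, the inverse is unbounded.
    return float("inf") if overlap == 0 else 1.0 / overlap
```

In an antonym retrieval setting such as the one evaluated in the paper, candidates for a target word could be sorted by this score in descending order; how ties and the infinite case should be handled is a design choice this sketch leaves open.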
Similarity Models in Distributional Semantics using Task Specific Information
In distributional semantics, the unsupervised learning approach has been widely used for a large number of tasks. Supervised learning, on the other hand, has been covered less.
In this dissertation, we investigate the supervised learning approach for semantic relatedness tasks in distributional semantics. The investigation mainly considers semantic similarity and semantic classification tasks. Existing and newly constructed datasets are used as input for the experiments. The new datasets are constructed from thesauruses such as Eurovoc, a multilingual thesaurus maintained by the Publications Office of the European Union. The meaning of the words in the datasets is represented using a distributional semantic approach.
The distributional semantic approach collects co-occurrence information from large texts and represents words as high-dimensional vectors. English words are represented using the ukWaC corpus, and German words using the deWaC corpus. After each word is represented as a high-dimensional vector, different supervised machine learning methods are applied to the selected tasks. Their outputs are evaluated by comparing task performance and accuracy with the results of state-of-the-art unsupervised machine learning methods. In addition, multi-relational matrix factorization is introduced as a supervised learning method in distributional semantics. This dissertation shows that multi-relational matrix factorization is a good alternative for integrating different sources of information about words in distributional semantics.
The dissertation also introduces some new applications. One analyzes the text of a German company’s website and presents information about the company in a concept cloud visualization. The others are automatic recognition/disambiguation of Library of Congress Subject Headings and automatic identification of synonym relations in the Dutch Parliament thesaurus.
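As an illustration of the distributional representation this abstract describes, the sketch below builds sparse co-occurrence vectors from tokenized text and compares two words with the vector cosine. The window size, the function names, and the two-sentence toy corpus are assumptions made for the example; the dissertation's actual preprocessing and weighting are not specified in the abstract.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


def cooccurrence_vectors(sentences: List[List[str]], window: int = 2) -> Dict[str, Counter]:
    """Count, for every word, the words seen within `window` tokens of it."""
    vectors: Dict[str, Counter] = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, purely illustrative (not data from the dissertation).
toy_corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
]
vectors = cooccurrence_vectors(toy_corpus, window=2)
similarity = cosine(vectors["cat"], vectors["dog"])
```

With real corpora the raw counts would normally be reweighted by an association measure before comparison; raw counts are used here only to keep the sketch short.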