22 research outputs found

Learning to distinguish hypernyms and co-hyponyms

Author: Clarke Daoud
Keller Bill
Reffin Jeremy
Weeds Julie
Weir David
Publication venue: Dublin City University and Association for Computational Linguistics
Publication date: 01/08/2014

This work is concerned with distinguishing different semantic relations which exist between distributionally similar words. We compare a novel approach based on training a linear Support Vector Machine on pairs of feature vectors with state-of-the-art methods based on distributional similarity. We show that the new supervised approach does better even when there is minimal information about the target words in the training data, giving a 15% reduction in error rate over unsupervised approaches.

Sussex Research Online

Recommended from our members

Injecting Lexical Contrast into Word Vectors by Guiding Vector Space Specialisation

Author: Korhonen A
Vulić I
Publication venue: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Publication date: 01/01/2018

Word vector space specialisation models offer a portable, light-weight approach to fine-tuning arbitrary distributional vector spaces to discern between synonymy and antonymy. Their effectiveness is drawn from external linguistic constraints that specify the exact lexical relation between words. In this work, we show that a careful selection of the external constraints can steer and improve the specialisation. By simply selecting appropriate constraints, we report state-of-the-art results on a suite of tasks with well-defined benchmarks where modeling lexical contrast is crucial: 1) true semantic similarity, with highest reported scores on SimLex-999 and SimVerb-3500 to date; 2) detecting antonyms; and 3) distinguishing antonyms from synonyms.

Apollo (Cambridge)

Semantic Similarity Analysis for Paraphrase Identification in Arabic Texts

Author: Mahmoud Adnen
Zrigui Mounir
Publication venue: the National University (Philippines)
Publication date: 01/01/2017

Waseda University Repository

Retrofitting Word Vectors to Semantic Lexicons

Author: Dodge Jesse
Dyer Chris
Faruqui Manaal
Hovy Eduard
Jauhar Sujay K.
Smith Noah A.
Publication date: 01/01/2015

Vector space word representations are learned from distributional information of words in large corpora. Although such statistics are semantically informative, they disregard the valuable information that is contained in semantic lexicons such as WordNet, FrameNet, and the Paraphrase Database. This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes no assumptions about how the input vectors were constructed. Evaluated on a battery of standard lexical semantic evaluation tasks in several languages, we obtain substantial improvements starting with a variety of word vector models. Our refinement method outperforms prior techniques for incorporating semantic lexicons into the word vector training algorithms. Comment: Proceedings of NAACL 201

arXiv.org e-Print Archive

CiteSeerX

Crossref
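
As a rough illustration of the idea described in the "Retrofitting Word Vectors to Semantic Lexicons" abstract above (encouraging lexicon-linked words to have similar vectors, with no assumptions about how the input vectors were built), the sketch below applies a simple iterative averaging update. The abstract does not give the objective or update rule, so everything here is an assumption made for illustration: the function name `retrofit`, the weight `alpha` on the original vector, the uniform neighbour weights, and the default of 10 iterations.

```python
from typing import Dict, List

Vector = List[float]


def retrofit(
    vectors: Dict[str, Vector],
    lexicon: Dict[str, List[str]],
    iterations: int = 10,
    alpha: float = 1.0,
) -> Dict[str, Vector]:
    """Pull each word vector toward the vectors of its lexicon neighbours.

    Assumed update (not stated in the abstract): a word's new vector is a
    weighted average of its ORIGINAL vector (weight alpha) and the CURRENT
    vectors of its linked words (uniform weights that sum to 1).
    """
    # Work on a copy so the input vectors are left untouched.
    new = {word: list(vec) for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            if word not in vectors:
                continue  # no vector to refine for this lexicon entry
            linked = [n for n in neighbours if n in new]
            if not linked:
                continue  # no usable neighbours: keep the vector as it is
            beta = 1.0 / len(linked)  # uniform neighbour weight (assumption)
            updated = []
            for d in range(len(vectors[word])):
                neighbour_part = sum(beta * new[n][d] for n in linked)
                updated.append(
                    (alpha * vectors[word][d] + neighbour_part) / (alpha + 1.0)
                )
            new[word] = updated
    return new


# Toy usage: "happy" and "glad" are linked in the lexicon, "table" is not.
toy_vectors = {"happy": [1.0, 0.0], "glad": [0.0, 1.0], "table": [5.0, 5.0]}
toy_lexicon = {"happy": ["glad"], "glad": ["happy"]}
refined = retrofit(toy_vectors, toy_lexicon)
```

In this toy setup the two linked words move toward each other while staying anchored to their original positions, and the unlinked word is returned unchanged.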

Predicting the age of social network users from user-generated texts with word embeddings

Author: Alekseev A.
Nikolenko S.
Publication date: 01/01/2017

© 2016 FRUCT. Many web-based applications such as advertising or recommender systems often critically depend on the demographic information, which may be unavailable for new or anonymous users. We study the problem of predicting demographic information based on user-generated texts on a Russian-language dataset from a large social network. We evaluate the efficiency of age prediction algorithms based on word2vec word embeddings and conduct a comprehensive experimental evaluation, comparing these algorithms with each other and with classical baseline approaches.

Kazan Federal University Digital Repository

Explicit retrofitting of distributional word vectors

Author: Glavaš G
Vulić I
Publication venue: ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Publication date: 01/01/2018

Semantic specialization of distributional word vectors, referred to as retrofitting, is a process of fine-tuning word vectors using external lexical knowledge in order to better embed some semantic relation. Existing retrofitting models integrate linguistic constraints directly into learning objectives and, consequently, specialize only the vectors of words from the constraints. In this work, in contrast, we transform external lexico-semantic relations into training examples which we use to learn an explicit retrofitting model (ER). The ER model allows us to learn a global specialization function and specialize the vectors of words unobserved in the training data as well. We report large gains over original distributional vector spaces in (1) intrinsic word similarity evaluation and on (2) two downstream tasks -- lexical simplification and dialog state tracking. Finally, we also successfully specialize vector spaces of new languages (i.e., unseen in the training data) by coupling ER with shared multilingual distributional vector spaces.

Crossref

MAnnheim DOCument Server

Apollo (Cambridge)

A Multitask Objective to Inject Lexical Contrast into Distributional Semantics

Author: Angeliki Lazaridou
Marco Baroni
Nghia The Pham
Publication date: 01/01/2015

Distributional semantic models have trouble distinguishing strongly contrasting words (such as antonyms) from highly compatible ones (such as synonyms), because both kinds tend to occur in similar contexts in corpora. We introduce the multitask Lexical Contrast Model (mLCM), an extension of the effective Skip-gram method that optimizes semantic vectors on the joint tasks of predicting corpus contexts and making the representations of WordNet synonyms closer than that of matching WordNet antonyms. mLCM outperforms Skip-gram both on general semantic tasks and on synonym/antonym discrimination, even when no direct lexical contrast information about the test words is provided during training. mLCM also shows promising results on the task of learning a compositional negation operator mapping adjectives to their antonyms.

Crossref

Open Access Repository

Extending latent semantic analysis to manage its syntactic blindness

Author: Korkontzelos Yannis
Suleman Raja Muhammad
Publication date: 01/03/2021

Edge Hill University Research Information Repository

Revisiting Word Embedding for Contrasting Meaning

Author: Hui Jiang
Qian Chen
Si Wei
Wei Lin
Xiaodan Zhu
Xiaoping Chen
Zhigang Chen
Publication date: 01/01/2015

Contrasting meaning is a basic aspect of semantics. Recent word-embedding models based on distributional semantics hypothesis are known to be weak for modeling lexical contrast. We present in this paper the embedding models that achieve an F-score of 92% on the widely-used, publicly available dataset, the GRE "most contrasting word" questions (Mohammad et al., 2008). This is the highest performance seen so far on this dataset. Surprisingly at the first glance, unlike what was suggested in most previous work, where relatedness statistics learned from corpora is claimed to yield extra gains over lexicon-based models, we obtained our best result relying solely on lexical resources (Roget's and WordNet); corpora statistics did not lead to further improvement. However, this should not be simply taken as that distributional statistics is not useful. We examine several basic concerns in modeling contrasting meaning to provide detailed analysis, with the aim to shed some light on the future directions for this basic semantics modeling problem.

Crossref

Open Access Repository