2 research outputs found
Leveraging Sentiment to Compute Word Similarity
In this paper, we introduce a new WordNet-based similarity metric, SenSim, which incorporates the sentiment content (i.e., the degree of positive or negative sentiment) of the words being compared to measure their similarity. The proposed metric is based on the hypothesis that knowing the sentiment is beneficial in measuring the similarity. To verify this hypothesis, we measure and compare the annotator agreement for two annotation strategies: 1) sentiment information of a pair of words is considered while annotating and 2) sentiment information of a pair of words is not considered while annotating. Inter-annotator correlation scores show that the agreement is better when the two annotators consider sentiment information while assigning a similarity score to a pair of words. We use this hypothesis to measure the similarity between a pair of words. Specifically, we represent each word as a vector containing the sentiment scores of all the content words in the WordNet gloss of that word. These sentiment scores are derived from a sentiment lexicon. We then measure the cosine similarity between the two vectors. We perform both intrinsic and extrinsic evaluation of SenSim. As a part of the intrinsic evaluation, we calculate the correlation score with gold standard data and compare it with other popular WordNet-based metrics. We find that SenSim has better correlation than the other similarity metrics. Further, as a part of the extrinsic evaluation, we use SenSim in an application. We evaluate SenSim for mitigating the unknown feature problem in supervised sentiment classification, using the similarity-based replacement strategy proposed by Balamurali et al. (2011). Our results show that the new metric performs better than all the existing metrics used for comparison.
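The vector construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon `TOY_LEXICON`, the glosses in `TOY_GLOSS`, and the choice to index vector dimensions by the union of the two glosses' content words are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

# Hypothetical sentiment lexicon (word -> score in [-1, 1]); values are made up.
TOY_LEXICON = {
    "good": 0.8, "pleasant": 0.7, "bad": -0.8, "harmful": -0.7, "quality": 0.2,
}

# Hypothetical content words of each word's gloss; a real system would
# read these from WordNet rather than from a hand-written table.
TOY_GLOSS = {
    "nice": ["pleasant", "good", "quality"],
    "fine": ["good", "quality"],
    "nasty": ["bad", "harmful", "quality"],
}


def sentiment_vector(word, vocab):
    """Return one sentiment score per vocab entry; 0.0 if absent from the gloss."""
    gloss = set(TOY_GLOSS[word])
    return [TOY_LEXICON.get(w, 0.0) if w in gloss else 0.0 for w in vocab]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sensim_sketch(word1, word2):
    """Cosine similarity between the gloss-based sentiment vectors of two words."""
    # Assumption: dimensions are the union of both glosses' content words.
    vocab = sorted(set(TOY_GLOSS[word1]) | set(TOY_GLOSS[word2]))
    return cosine(sentiment_vector(word1, vocab), sentiment_vector(word2, vocab))
```

With these toy values, `sensim_sketch("nice", "fine")` is larger than `sensim_sketch("nice", "nasty")`, because the first pair shares positively scored gloss words while the second pair overlaps only on a weakly scored one.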
Leveraging Sentiment to Compute Word Similarity
In this paper, we introduce a new WordNet-based similarity metric, SenSim,
which incorporates sentiment content (i.e., degree of positive or negative
sentiment) of the words being compared to measure the similarity between them.
The proposed metric is based on the hypothesis that knowing the sentiment is
beneficial in measuring the similarity. To verify this hypothesis, we measure
and compare the annotator agreement for 2 annotation strategies: 1) sentiment
information of a pair of words is considered while annotating and 2) sentiment
information of a pair of words is not considered while annotating.
Inter-annotator correlation scores show that the agreement is better when the
two annotators consider sentiment information while assigning a similarity
score to a pair of words. We use this hypothesis to measure the similarity
between a pair of words. Specifically, we represent each word as a vector
containing sentiment scores of all the content words in the WordNet gloss of
the sense of that word. These sentiment scores are derived from a sentiment
lexicon. We then measure the cosine similarity between the two vectors. We
perform both intrinsic and extrinsic evaluation of SenSim and compare the
performance with other widely used WordNet similarity metrics.
Comment: The paper is available at
http://subhabrata-mukherjee.webs.com/publications.ht
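The annotator-agreement comparison mentioned in the abstract relies on a correlation score between two annotators' similarity ratings. The abstract does not say which correlation coefficient is used; the sketch below assumes Pearson correlation, and the function name `pearson` and its inputs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 ratings")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0.0 or dev_y == 0.0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / (dev_x * dev_y)
```

Under this assumption, one would compute `pearson(annotator_a, annotator_b)` once for ratings collected with sentiment information and once for ratings collected without it, then compare the two scores.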