
Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews

By Peter D. Turney

Abstract

This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., "subtle nuances") and a negative semantic orientation when it has bad associations (e.g., "very cavalier"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews.
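The scoring rule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the count-based estimate of mutual information, the `smoothing` parameter, and all numbers in the usage lines are assumptions made for illustration. The sketch assumes co-occurrence counts are already available from some corpus and does not cover how phrases containing adjectives or adverbs are extracted.

```python
import math


def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information, log2(p(a, b) / (p(a) * p(b))), from raw counts."""
    return math.log2((joint * total) / (count_a * count_b))


def semantic_orientation(near_excellent, near_poor, phrase_count,
                         excellent_count, poor_count, total, smoothing=0.01):
    """PMI(phrase, "excellent") minus PMI(phrase, "poor").

    `smoothing` is an assumed small constant added to the co-occurrence
    counts so that a zero count does not produce log2(0).
    """
    positive = pmi(near_excellent + smoothing, phrase_count, excellent_count, total)
    negative = pmi(near_poor + smoothing, phrase_count, poor_count, total)
    return positive - negative


def classify_review(orientations):
    """Label a review by the sign of the average orientation of its phrases."""
    if not orientations:
        raise ValueError("review has no scored phrases")
    average = sum(orientations) / len(orientations)
    return "recommended" if average > 0 else "not recommended"


# Hypothetical counts: the phrase co-occurs with "excellent" 8 times and
# with "poor" 2 times, so its orientation comes out positive.
score = semantic_orientation(8, 2, 10, 100, 100, 10_000)
label = classify_review([score, -0.5])
```

Because the phrase count and the corpus size appear in both PMI terms, they cancel in the difference, so the score depends only on the two co-occurrence counts and the overall counts of "excellent" and "poor".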

Topics: Artificial Intelligence, Language, Machine Learning, Statistical Models
Year: 2002
OAI identifier: oai:cogprints.org:2321
Full text:
  • http://cogprints.org/2321/5/tu... (external link)
  • http://cogprints.org/2321/1/tu... (external link)
  • http://cogprints.org/2321/ (external link)