Sentiment Classification with Supervised Sequence Embedding

By Dmitriy Bespalov, Yanjun Qi, Bing Bai and Ali Shokoufandeh

Abstract

In this paper, we introduce a novel approach for modeling n-grams in a latent space learned from supervised signals. The proposed procedure uses only unigram features to model short phrases (n-grams) in the latent space. The phrases are then combined to form a document-level latent representation of a given text, where the position of an n-gram in the document determines its combining weight. The resulting two-stage supervised embedding is coupled with a classifier to form an end-to-end system that we apply to large-scale sentiment classification. The proposed model does not require feature selection to retain effective features during pre-processing, and its parameter space grows linearly with the n-gram size. We present comparative evaluations of this method on two large-scale datasets of online reviews (Amazon and TripAdvisor). The proposed method outperforms standard baselines that rely on a bag-of-words representation populated with n-gram features.
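To make the two-stage pipeline concrete, the sketch below illustrates the general idea in plain NumPy: unigram embeddings are composed into n-gram (phrase) vectors, the phrase vectors are pooled with position-dependent weights into a document vector, and a linear classifier scores the result. This is not the authors' implementation; the dimensions, the concatenate-and-project phrase model, the tanh nonlinearity, the linear position-weighting scheme, and all function names are assumptions made for illustration only.

    # Minimal sketch of a two-stage supervised embedding (illustrative assumptions only).
    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, embed_dim, n = 1000, 50, 3            # assumed vocabulary / latent sizes

    # Stage 1 parameters: unigram embedding table and an n-gram projection
    # (learned jointly with the classifier in the paper; random here).
    E = rng.normal(scale=0.1, size=(vocab_size, embed_dim))
    W_phrase = rng.normal(scale=0.1, size=(n * embed_dim, embed_dim))

    def phrase_embedding(ngram_ids):
        """Map one n-gram to a latent phrase vector using only its unigram embeddings."""
        x = np.concatenate([E[i] for i in ngram_ids])  # (n * embed_dim,)
        return np.tanh(x @ W_phrase)                   # latent phrase vector

    def document_embedding(token_ids):
        """Stage 2: pool all n-gram vectors with position-dependent weights."""
        ngrams = [token_ids[i:i + n] for i in range(len(token_ids) - n + 1)]
        L = len(ngrams)
        # Assumed weighting: weight decays linearly with the n-gram's position.
        weights = np.array([1.0 - 0.5 * (pos / max(L - 1, 1)) for pos in range(L)])
        weights /= weights.sum()
        phrases = np.stack([phrase_embedding(g) for g in ngrams])   # (L, embed_dim)
        return weights @ phrases                                    # (embed_dim,)

    # Final stage: a linear classifier on the document-level representation.
    w_clf = rng.normal(scale=0.1, size=embed_dim)

    def predict_sentiment(token_ids):
        score = document_embedding(token_ids) @ w_clf
        return 1 if score > 0 else -1                  # +1 positive, -1 negative

    # Toy usage with made-up token ids standing in for a short review.
    print(predict_sentiment(rng.integers(0, vocab_size, size=12)))

In the paper all parameters (unigram embeddings, phrase composition, and classifier) are trained end-to-end from the supervised sentiment labels; the random initializations above merely stand in for learned weights.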

Topics: Sentiment Classification, Large-Scale Text Mining, Supervised Feature Learning, Supervised Embedding
Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.353.5887
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.cs.cmu.edu/~qyj/www... (external link)