SNIPPET-BASED UNSUPERVISED APPROACH FOR SENTIMENT CLASSIFICATION OF CHINESE ONLINE REVIEWS
Abstract
Sentiment classification seeks to identify the general attitude, positive or negative, expressed in a comment or review on a given subject. Most existing research on sentiment classification employs supervised learning approaches that rely on annotated data. However, sentiment is expressed differently for different subjects and domains, and obtaining annotated corpora for every domain of interest is not always practical. This paper proposes an unsupervised learning approach for classifying the text of online reviews as recommended or not recommended. The proposed method is based on search engine snippets, the summary text shown on a search engine's result page. The basic assumption is that terms with similar orientation tend to co-occur. Co-occurrence is measured using the snippets returned by a search engine for a query consisting of the text and a positive or negative seed word. With the information in the snippets, the proposed method can estimate the association of candidate terms more accurately, which allows the sentiment orientation of customer reviews to be predicted more reliably. A review is then classified as recommended if the average sentiment orientation of its phrases is positive, and as not recommended if it is negative. The data set of this study consists of 600 Chinese online reviews of travel destinations retrieved from Ctrip.com. Our approach achieves an accuracy of 76.5%. Factors that influence the accuracy of sentiment classification of Chinese online reviews are also discussed.
Keywords: Sentiment classification, unsupervised learning approach, snippet, online review, Chinese
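The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so a smoothed log-ratio of snippet co-occurrence counts (in the spirit of PMI-based orientation scores) is assumed here. The seed words, the `fetch_snippets` callable, and the toy snippet table are all hypothetical stand-ins for a real search engine query.

```python
import math
from typing import Callable, List

# Hypothetical seed words; the abstract does not name the seeds actually used.
POSITIVE_SEED = "好"
NEGATIVE_SEED = "差"

# Assumed interface: takes a query string, returns a list of result snippets.
SnippetFetcher = Callable[[str], List[str]]


def cooccurrence(phrase: str, seed: str, fetch_snippets: SnippetFetcher) -> int:
    """Count snippets for the query 'phrase seed' that contain both strings."""
    snippets = fetch_snippets(f"{phrase} {seed}")
    return sum(1 for s in snippets if phrase in s and seed in s)


def orientation(phrase: str, fetch_snippets: SnippetFetcher,
                smoothing: float = 0.5) -> float:
    """Assumed score: smoothed log-ratio of positive to negative co-occurrence."""
    pos = cooccurrence(phrase, POSITIVE_SEED, fetch_snippets)
    neg = cooccurrence(phrase, NEGATIVE_SEED, fetch_snippets)
    return math.log2((pos + smoothing) / (neg + smoothing))


def classify(phrases: List[str], fetch_snippets: SnippetFetcher) -> str:
    """Label a review by the average orientation of its phrases."""
    if not phrases:
        raise ValueError("a review needs at least one phrase")
    average = sum(orientation(p, fetch_snippets) for p in phrases) / len(phrases)
    # Ties (average of exactly 0) fall to "not recommended"; an arbitrary choice.
    return "recommended" if average > 0 else "not recommended"


# Toy in-memory snippets standing in for search engine results (made-up data).
TOY_SNIPPETS = {
    "风景优美 好": ["风景优美，服务也很好", "这里风景优美，真好"],
    "风景优美 差": [],
    "排队太久 好": [],
    "排队太久 差": ["排队太久，体验很差", "排队太久，管理差"],
}


def toy_fetch(query: str) -> List[str]:
    return TOY_SNIPPETS.get(query, [])


if __name__ == "__main__":
    print(classify(["风景优美"], toy_fetch))
    print(classify(["排队太久"], toy_fetch))
```

Passing the snippet source in as a callable keeps the scoring logic testable offline. In practice, `toy_fetch` would be replaced by a function that queries a search engine and returns its result snippets, and the phrases would come from a phrase-extraction step that the abstract does not detail.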