Acquiring Broad Commonsense Knowledge for Sentiment Analysis Using Human Computation
While artificial intelligence is successful in many applications that cover specific domains, for many commonsense problems a large gap with human performance remains. Automated sentiment analysis is a typical example: while there are techniques that reasonably aggregate sentiments from texts in specific domains, such as online reviews of a particular product category, more general models perform poorly. We argue that sentiment analysis can be covered more broadly by extending models with commonsense knowledge acquired at scale through human computation. We study two sentiment analysis problems. We start with document-level sentiment classification, which aims to determine whether a text as a whole expresses a positive or a negative sentiment. We hypothesize that extending classifiers to include the polarities of sentiment words in context can help them scale to broad domains. We also study fine-grained opinion extraction, which aims to pinpoint individual opinions in a text, along with their targets. We hypothesize that extraction models benefit from broad fine-grained annotations when applied to unfamiliar domains. Selecting sentiment words in context and annotating texts with opinions and targets are tasks that require commonsense knowledge shared by all speakers of a language. We show how these tasks can be solved effectively through human computation. We illustrate how to define small tasks that can be solved by many independent workers so that the results form a single coherent knowledge base. We also show how to recruit, train, and engage workers, and how to perform effective quality control to obtain sufficiently high-quality knowledge. We show how the resulting knowledge can be integrated into models that scale to broad domains and also perform well in unfamiliar domains. We engage workers through both enjoyment and payment, by designing our tasks as games played for money, and recruit them on a paid crowdsourcing platform that gives us access to a large pool of active workers. This is an effective recipe for acquiring sentiment knowledge in English, a language known by the vast majority of workers on the platform. To acquire sentiment knowledge for other languages, which have received comparatively little attention, we argue that we need to design tasks that appeal to volunteers outside the crowdsourcing platform, based on enjoyment alone. However, recruiting and engaging volunteers has been more of an art than a problem that can be solved systematically. We show that combining online advertisement with games, an approach recently shown to work well for acquiring expert knowledge, is an effective recipe for attracting and engaging volunteers who provide good-quality sentiment knowledge for texts in French. Our solutions could point the way to using human computation to broaden the competence of artificial intelligence systems in other domains as well.
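To make the contextual-polarity idea concrete, here is a minimal sketch of how a classifier could consume such crowdsourced knowledge. The CONTEXTUAL_POLARITY lexicon, its (word, context) entries, and the toy decision rule are illustrative assumptions, not the actual models or knowledge base from this work.

```python
# Minimal sketch: a sentiment classifier extended with crowdsourced
# contextual polarities. CONTEXTUAL_POLARITY is a hypothetical lexicon
# mapping (sentiment word, nearby context word) to a polarity.

from collections import Counter

CONTEXTUAL_POLARITY = {
    ("cold", "beer"): +1,      # "cold beer" is good
    ("cold", "service"): -1,   # "cold service" is bad
    ("long", "battery"): +1,   # "long battery life" is good
    ("long", "wait"): -1,      # "a long wait" is bad
}

def contextual_features(tokens, window=3):
    """Emit polarity features for sentiment words, disambiguated by
    context words within a small window around each word."""
    feats = Counter()
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for ctx in tokens[lo:i] + tokens[i + 1:hi]:
            polarity = CONTEXTUAL_POLARITY.get((word, ctx))
            if polarity is not None:
                feats["pos_in_context" if polarity > 0 else "neg_in_context"] += 1
    return feats

def classify(tokens):
    """Toy decision rule: sum contextual polarity counts. A real system
    would feed these features into a trained classifier instead."""
    feats = contextual_features(tokens)
    score = feats["pos_in_context"] - feats["neg_in_context"]
    return "positive" if score >= 0 else "negative"

# "cold" near "service" and "long" near "wait" both resolve to negative.
print(classify("the service was cold and the wait was long".split()))
```

In this sketch the crowdsourced knowledge does the disambiguation that a single-domain lexicon cannot: "cold" flips polarity depending on whether it describes beer or service.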
A :) Is Worth a Thousand Words: How People Attach Sentiment to Emoticons and Words in Tweets
Emoticons are widely used to express positive or negative sentiment on Twitter. We report on a study with live users to determine whether emoticons merely emphasize the sentiment of tweets or are the main elements carrying it. We found that the sentiment of an emoticon is in substantial agreement with the sentiment of the entire tweet. Thus, emoticons are useful predictors of tweet sentiment and should not be ignored in sentiment classification. However, the sentiment expressed by an emoticon agrees with the sentiment of the accompanying text only slightly better than chance. Thus, using the text accompanying emoticons to train sentiment models is unlikely to produce the best results, a fact that we show by comparing lexicons generated using emoticons with others generated using simple textual features.
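As a hedged illustration of the lexicon comparison described above, here is a small sketch of building a word-polarity lexicon from emoticon-labeled tweets via log-odds scoring. The emoticon sets, the example tweets, and the scoring function are assumptions for illustration, not the paper's exact method.

```python
# Minimal sketch: score words by how often they co-occur with positive
# vs. negative emoticons, in the spirit of distant-supervision lexicons.

import math
from collections import Counter

POS_EMOTICONS = {":)", ":-)", ":D"}
NEG_EMOTICONS = {":(", ":-(", ":'("}

def emoticon_lexicon(tweets, smoothing=1.0):
    """Build a word -> polarity-score map from emoticon-labeled tweets.
    The score is the smoothed log-odds of a word appearing in positive
    vs. negative tweets; positive scores suggest positive sentiment."""
    pos_counts, neg_counts = Counter(), Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        if any(t in POS_EMOTICONS for t in tokens):
            counts = pos_counts
        elif any(t in NEG_EMOTICONS for t in tokens):
            counts = neg_counts
        else:
            continue  # skip tweets with no emoticon label
        counts.update(t for t in tokens if t not in POS_EMOTICONS | NEG_EMOTICONS)
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + smoothing * len(vocab)
    neg_total = sum(neg_counts.values()) + smoothing * len(vocab)
    return {
        w: math.log((pos_counts[w] + smoothing) / pos_total)
           - math.log((neg_counts[w] + smoothing) / neg_total)
        for w in vocab
    }

lexicon = emoticon_lexicon([
    "loving this sunny weather :)",
    "my flight got cancelled again :(",
])
print(sorted(lexicon.items(), key=lambda kv: -kv[1])[:3])
```

The paper's finding suggests a caveat for any lexicon built this way: because emoticon sentiment only weakly agrees with the accompanying text, the labels in this distant-supervision setup are noisy.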
Acquiring Commonsense Knowledge for Sentiment Analysis through Human Computation
Many artificial intelligence tasks need large amounts of commonsense knowledge. Because obtaining this knowledge through machine learning would require huge amounts of data, a better alternative is to elicit it from people through human computation. We consider the sentiment classification task, where knowledge about the contexts that affect word polarities is crucial but hard to acquire from data. We describe a novel task design that lets us crowdsource this knowledge through Amazon Mechanical Turk at high quality. We show that the commonsense knowledge acquired in this way dramatically improves the performance of established sentiment classification methods.
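For a concrete, if simplified, picture of the quality control such a crowdsourcing task needs, here is a sketch of filtering workers on known-answer (gold) items and aggregating the remaining judgments by majority vote. The data layout, the GOLD items, and the 0.7 accuracy threshold are hypothetical, not the paper's actual pipeline.

```python
# Minimal sketch of crowdsourcing quality control: drop workers who fail
# hypothetical gold (known-answer) items, then majority-vote the rest.

from collections import Counter, defaultdict

GOLD = {("cold", "beer"): "positive"}  # items with known answers

def aggregate(judgments, min_gold_accuracy=0.7):
    """judgments: list of (worker_id, item, label) tuples."""
    # 1. Score each worker on the gold items they answered.
    correct, answered = Counter(), Counter()
    for worker, item, label in judgments:
        if item in GOLD:
            answered[worker] += 1
            correct[worker] += (label == GOLD[item])
    trusted = {w for w in answered
               if correct[w] / answered[w] >= min_gold_accuracy}
    # 2. Majority vote over trusted workers only, per non-gold item.
    votes = defaultdict(Counter)
    for worker, item, label in judgments:
        if worker in trusted and item not in GOLD:
            votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

labels = aggregate([
    ("w1", ("cold", "beer"), "positive"),    # w1 passes the gold check
    ("w2", ("cold", "beer"), "negative"),    # w2 fails and is filtered out
    ("w1", ("cold", "service"), "negative"),
    ("w2", ("cold", "service"), "positive"),
])
print(labels)  # {('cold', 'service'): 'negative'}
```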