3 research outputs found

    IMPROVING OPINION MINING BY CLASSIFYING FACTS AND OPINIONS IN TWITTER - A DEEP LEARNING APPROACH

    The massive volume of social media data presents businesses with an immense opportunity to extract useful insights. However, social media messages typically consist of both facts and opinions, posing a challenge to analytics applications that focus on one or the other. Distinguishing facts from opinions in social media may significantly improve both fact-seeking applications that aim to capture breaking news and opinion-seeking applications that aim to evaluate users' sentiment towards an event or entity. Despite the growing need, classifying facts and opinions in social media has gained minimal attention. In this study we examine the limitations of applying existing subjectivity detection methods, which identify subjective content in textual data. In social media, and specifically in microblogs like Twitter, the content is noisy with respect to spelling and syntax, makes extensive use of emoticons and abbreviations, and suffers from the overall issue of data sparsity. Traditional methods that check individual words against a predefined lexicon often do not yield the required accuracy for this task. The primary objective of this study is to address this limitation and provide an alternative method that improves this classification task and opinion mining in general. The study proposes using supplemental information from Twitter metadata and empirically demonstrates the improvement in performance. To ensure rigor and relevance, the design science research methodology is adopted for this project. We propose a deep learning algorithm that automatically separates facts from opinions in Twitter messages. Our model combines bag-of-words features with selected manually engineered features from Twitter metadata in a multi-part experiment. We leverage an external reference dataset to develop our manually engineered feature variables and evaluate their efficiency against three external baseline tools.
The study uses eight different machine learning classifiers to demonstrate the robustness of the manual feature set. Next, we combine these manually engineered features with features extracted from a bag-of-words model in our proposed deep learning model. Our algorithm significantly outperformed multiple popular baselines in the internal evaluation part of the experiment. Next, to demonstrate practical usefulness, we illustrate how distinguishing facts from opinions can be useful in a real-world business application. We applied our proposed algorithm to an external opinion mining application that tracks emerging customer complaints from social media conversations. We conducted our case study with three large financial institutions using Twitter data for a period of 16 weeks. The study observed considerable improvement in that external application after integrating our algorithm and concludes that it indeed benefits subsequent analytics applications.
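
The feature combination described in this abstract can be illustrated with a minimal sketch. It shows only the general idea of concatenating bag-of-words counts with numeric metadata features into one input vector; it is not the authors' model, and it stops short of any classifier or deep learning step. The metadata field names (`has_url`, `follower_count`, `is_verified`) and the two example messages are hypothetical, chosen purely for illustration.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocab):
    """Return token counts for `text` as a dense list aligned with `vocab`."""
    counts = Counter(text.lower().split())
    vec = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


def combine_features(text, metadata, vocab, metadata_keys):
    """Concatenate bag-of-words counts with numeric metadata features."""
    meta_vec = [float(metadata.get(key, 0)) for key in metadata_keys]
    return bag_of_words(text, vocab) + meta_vec


# Hypothetical metadata fields; the abstract does not list the actual ones.
METADATA_KEYS = ["has_url", "follower_count", "is_verified"]

# Made-up example messages, not data from the study.
tweets = [
    ("Bank X closes branch downtown",
     {"has_url": 1, "follower_count": 5200, "is_verified": 1}),
    ("I hate waiting at Bank X",
     {"has_url": 0, "follower_count": 130, "is_verified": 0}),
]

vocab = build_vocabulary(text for text, _ in tweets)
features = [combine_features(text, meta, vocab, METADATA_KEYS)
            for text, meta in tweets]
```

Each row of `features` could then be passed to any classifier. In practice the metadata columns would likely need scaling, since a raw follower count is orders of magnitude larger than a word count.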

    Newsletter Fall 2017


    Improving Opinion Mining by Classifying Facts and Opinions in Twitter

    The proliferation of social media data presents an immense opportunity to extract useful business insights. The continuous flow of user-generated reviews, comments, recommendations, ratings, and feedback represents a co-mingled dataset of facts and opinions. Detecting and separating facts from opinions in social media will significantly improve subsequent opinion mining tasks. We present an algorithm that automatically separates facts from opinions in a social media corpus. We test our algorithm using Twitter data. The algorithm analyzes not only the actual text of the posts, but also the contextual metadata and supporting reference datasets. Our approach yielded an accuracy of 73.47% in classifying facts and opinions, compared to 52.55% accuracy for the baseline models. To further demonstrate its usefulness, we applied our algorithm in an external opinion mining application that leverages social media to track customer complaints. Results show that by integrating our algorithm, the application could achieve a 10% improvement in performance.