1,405 research outputs found

    Role of sentiment classification in sentiment analysis: a survey

    Get PDF
    Through a survey of literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles during 2015 – 2017 have been reviewed on six dimensions viz., sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and a lot of further research needs to be done for productive results

    Cross-domain sentiment classification using a sentiment sensitive thesaurus

    Get PDF
    Automatic classification of sentiment is important for numerous applications such as opinion mining, opinion summarization, contextual advertising, and market analysis. However, sentiment is expressed differently in different domains, and annotating corpora for every possible domain of interest is costly. Applying a sentiment classifier trained using labeled data for a particular domain to classify sentiment of user reviews on a different domain often results in poor performance. We propose a method to overcome this problem in cross-domain sentiment classification. First, we create a sentiment sensitive distributional thesaurus using labeled data for the source domains and unlabeled data for both source and target domains. Sentiment sensitivity is achieved in the thesaurus by incorporating document level sentiment labels in the context vectors used as the basis for measuring the distributional similarity between words. Next, we use the created thesaurus to expand feature vectors during train and test times in a binary classifier. The proposed method significantly outperforms numerous baselines and returns results that are comparable with previously proposed cross-domain sentiment classification methods. We conduct an extensive empirical analysis of the proposed method on single and multi-source domain adaptation, unsupervised and supervised domain adaptation, and numerous similarity measures for creating the sentiment sensitive thesaurus

    A FRAMEWORK FOR ARABIC SENTIMENT ANALYSIS USING MACHINE LEARNING CLASSIFIERS

    Get PDF
    International audienceIn recent years, the use of Internet and online comments, expressed in natural language text, have increased significantly. However, it is difficult for humans to read all these comments and classify them appropriately. Consequently, an automatic approach is required to classify the unstructured data. In this paper, we propose a framework for Arabic language comprising of three steps: pre-processing, feature extraction and machine learning classification. The main aim of the proposed framework is to exploit the combination of different Arabic linguistic features. We evaluate the framework using two benchmark Arabic tweets datasets (ASTD, ATA), which enable sentiment polarity detection in general Arabic and Jordanian dialects. Comparative simulation results show that machine learning classifiers such as Support Vector Machine (SVM), Naive Bayes, MultiLayer Perceptron (MLP) and Logistic Regression-based produce the best performance by using a combination of n-gram features from Arabic tweets datasets. Finally, we evaluate the performance of our proposed framework using an Ensemble classifier approach, with promising results

    Video advertisement mining for predicting revenue using random forest

    Get PDF
    Shaken by the threat of financial crisis in 2008, industries began to work on the topic of predictive analytics to efficiently control inventory levels and minimize revenue risks. In this third-generation age of web-connected data, organizations emphasized the importance of data science and leveraged the data mining techniques for gaining a competitive edge. Consider the features of Web 3.0, where semantic-oriented interaction between humans and computers can offer a tailored service or product to meet consumers\u27 needs by means of learning their preferences. In this study, we concentrate on the area of marketing science to demonstrate the correlation between TV commercial advertisements and sales achievement. Through different data mining and machine-learning methods, this research will come up with one concrete and complete predictive framework to clarify the effects of word of mouth by using open data sources from YouTube. The uniqueness of this predictive model is that we adopt the sentiment analysis as one of our predictors. This research offers a preliminary study on unstructured marketing data for further business use

    The Today Tendency of Sentiment Classification

    Get PDF
    Sentiment classification has already been studied for many years because it has had many crucial contributions to many different fields in everyday life, such as in political activities, commodity production, and commercial activities. There have been many kinds of the sentiment analysis such as machine learning approaches, lexicon-based approaches, etc., for many years. The today tendency of the sentiment classification is as follows: (1) Processing many big data sets with shortening execution times (2) Having a high accuracy (3) Integrating flexibly and easily into many small machines or many different approaches. We will present each category in more details

    Comparative Studies of Detecting Abusive Language on Twitter

    Full text link
    The context-dependent nature of online aggression makes annotating large collections of data extremely difficult. Previously studied datasets in abusive language detection have been insufficient in size to efficiently train deep learning models. Recently, Hate and Abusive Speech on Twitter, a dataset much greater in size and reliability, has been released. However, this dataset has not been comprehensively studied to its potential. In this paper, we conduct the first comparative study of various learning models on Hate and Abusive Speech on Twitter, and discuss the possibility of using additional features and context data for improvements. Experimental results show that bidirectional GRU networks trained on word-level features, with Latent Topic Clustering modules, is the most accurate model scoring 0.805 F1.Comment: ALW2: 2nd Workshop on Abusive Language Online to be held at EMNLP 2018 (Brussels, Belgium), October 31st, 201
    corecore