13 research outputs found

    A comparison of the effect of feature selection and balancing strategies upon the sentiment classification of portuguese news stories

    Sentiment classification of news stories using supervised learning is a mature task in the field of Natural Language Processing. Supervised learning strategies rely upon training data to induce a classifier. Training data can be imbalanced, with the neutral class typically being the majority class. This imbalance can bias the induced classifier towards the majority class. Balancing and feature selection can mitigate the effects of imbalanced data. This paper surveys a number of common balancing and feature selection techniques, and applies them to an imbalanced data set of manually labelled Brazilian agricultural news stories. The strategies were appraised with a 90:10 holdout evaluation and compared with a baseline strategy. We found that: 1. the feature selection strategies provided no identifiable advantage over a baseline method and 2. balancing produced an advantage over baseline, with random oversampling producing the best results. FAPESP (grant 11/20451-1)
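
As an illustration of the random oversampling strategy named in the abstract above, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function name `random_oversample`, the `seed` parameter, and the toy labels are assumptions made for this example. Minority-class samples are duplicated at random until every class matches the size of the largest class.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes
    are as large as the majority class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    balanced_samples, balanced_labels = [], []
    for label, group in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            balanced_samples.append(sample)
            balanced_labels.append(label)
    return balanced_samples, balanced_labels


# Toy, made-up data: four neutral stories, one positive, one negative.
stories = ["a", "b", "c", "d", "e", "f"]
classes = ["neu", "neu", "neu", "neu", "pos", "neg"]
balanced_x, balanced_y = random_oversample(stories, classes)
print(Counter(balanced_y))
```

In a holdout evaluation such as the 90:10 split described above, oversampling would normally be applied to the training portion only, so that duplicated stories do not leak into the held-out portion.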

    Review of Feature Selection and Optimization Strategies in Opinion Mining

    Opinion mining and sentiment analysis methods have become essential models for gaining insights from the huge volume of data generated by varied sources. If the veracity and variety of such data can be exploited in the opinion mining process, target groups could gauge public opinion and make better-informed decisions. Although opinion mining and sentiment analysis have been a major focus of research, progress has not been revolutionary. This study reviews a range of models and solutions proposed for sentiment analysis and opinion mining. The review indicates that current models remain complex to evaluate and to obtain results from, owing to constraints such as the need for comprehensive knowledge and the limitations of natural language processing. As a direction for future work, the paper proposes adopting evolutionary computational methods, and hybrids of such methods, for feature extraction.

    On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

    Sentiment analysis of movie reviews is a need of today's lifestyle. Unfortunately, an enormous number of features makes sentiment analysis slow and less sensitive. Finding the optimum feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification scheme is proposed. The proposed method removes more than 90% of unnecessary features, while the proposed classification scheme achieves 96% accuracy in sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
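
To make the idea of information-based feature selection concrete, the sketch below scores each term by the textbook information gain, IG(t) = H(C) - [P(t) H(C | t) + P(not t) H(C | not t)], and keeps the top-scoring terms. This is a generic formulation, not necessarily the scheme proposed in the paper; the function names, the set-of-tokens document representation, and the toy reviews are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs.

    `docs` is a list of token sets, aligned with `labels`.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Return the k terms with the highest information gain
    (ties broken alphabetically)."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Toy, made-up reviews represented as token sets.
reviews = [{"great", "movie"}, {"great", "plot"},
           {"awful", "movie"}, {"awful", "plot"}]
sentiments = ["pos", "pos", "neg", "neg"]
print(top_terms(reviews, sentiments, 2))
```

In this toy data "great" and "awful" separate the classes perfectly, while "movie" and "plot" occur equally in both classes and carry no information, so only the first two survive selection.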

    Feature extraction in opinion mining through Persian reviews

    Opinion mining analyzes user reviews to extract their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in that area. In general, opinion mining processes user reviews at three levels: document, sentence and feature. Opinion mining at the feature level receives more attention than the other two levels because it analyzes the orientation of different aspects of an area. In this paper, two methods are introduced for feature extraction. The recommended methods consist of four main stages. In the first stage, an opinion-mining lexicon for Persian is created. This lexicon is used to determine the orientation of users’ reviews. The second is the preprocessing stage, which includes unification of writing, tokenization, part-of-speech tagging and syntactic dependency parsing of the documents. The third stage involves the extraction of features using two methods: frequency-based feature extraction and association-rule-based feature extraction. In the fourth stage, the features and word polarities extracted in the previous stage are modified and the final polarity of the features is determined. To assess the suggested techniques, a set of user reviews from two domains, universities and cell phones, was collected and the results of the two methods were compared.

    Recognizing contextual valence shifters in document-level sentiment classification

    Sentiment classification is an emerging research field. Due to the rich opinionated web content, people and organizations are interested in knowing others' opinions, so they need an automated tool for analyzing and summarizing these opinions. One of the major tasks of sentiment classification is to classify a document (i.e. a blog, news article or review) as holding an overall positive or negative sentiment. Machine learning approaches have succeeded in achieving better results than semantic orientation approaches in document-level sentiment classification; however, they still need to take linguistic context into account, by making use of the so-called contextual valence shifters. Early research has tried to add sentiment features and contextual valence shifters to the machine learning approach to tackle this problem, but the classifier's performance was low. In this study, we would like to improve the performance of document-level sentiment classification using the machine learning approach by proposing new feature sets that refine the traditional sentiment feature extraction method and take contextual valence shifters into consideration from a different perspective than the earlier research. These feature sets include: 1) a feature set consisting of 16 features for counting different categories of contextual valence shifters (intensifiers, negators and polarity shifters) as well as the frequency of words grouped according to their final (modified) polarity; and 2) another feature set consisting of the frequency of each sentiment word after modifying its prior polarity.
    We performed several experiments to: 1) compare our proposed feature sets with the traditional sentiment features that count the frequency of each sentiment word while disregarding its prior polarity; 2) compare our proposed feature sets after combining them with stylistic features and n-grams with traditional sentiment features combined with stylistic features and n-grams; and 3) evaluate the effectiveness of our proposed feature sets against stylistic features and n-grams by performing feature selection. The results of all the experiments show a significant improvement over the baselines in terms of accuracy, precision and recall, indicating that our proposed feature sets are effective in document-level sentiment classification.
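
The kind of counting feature described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the paper's 16-feature set: the tiny lexicons `PRIOR`, `NEGATORS` and `INTENSIFIERS`, the function name `shifter_features`, and the rule that a negator flips the polarity of the next sentiment word (reaching only across intensifiers) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy lexicons; a real system would load much larger ones.
PRIOR = {"good": 1, "great": 1, "bad": -1, "boring": -1}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very", "extremely", "really"}


def shifter_features(tokens):
    """Count valence shifters and sentiment words grouped by their
    final (modified) polarity."""
    counts = Counter()
    negate = False
    for token in tokens:
        word = token.lower()
        if word in NEGATORS:
            counts["negators"] += 1
            negate = True
        elif word in INTENSIFIERS:
            counts["intensifiers"] += 1
            # An intensifier does not cancel a pending negation.
        elif word in PRIOR:
            polarity = -PRIOR[word] if negate else PRIOR[word]
            counts["positive" if polarity > 0 else "negative"] += 1
            negate = False
        else:
            # Any other word ends the scope of a pending negator.
            negate = False
    return counts


sentence = "the plot was not very good but the cast was great".split()
print(shifter_features(sentence))
```

Here "good" is counted as negative because of the preceding "not", whereas a traditional frequency feature would count it as a positive word regardless of context.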