Search CORE

13 research outputs found

A comparison of the effect of feature selection and balancing strategies upon the sentiment classification of portuguese news stories

Author: Drury Brett Mylo
Lopes Alneu de Andrade
Publication venue: São Carlos

Sentiment classification of news stories using supervised learning is a mature task in the field of Natural Language Processing. Supervised learning strategies rely upon training data to induce a classifier. Training data can be imbalanced, with the neutral class typically being the majority class. This imbalance can bias the induced classifier towards the majority class. Balancing and feature selection can mitigate the effects of imbalanced data. This paper surveys a number of common balancing and feature selection techniques, and applies them to an imbalanced data set of manually labelled Brazilian agricultural news stories. The strategies were appraised with a 90:10 holdout evaluation and compared with a baseline strategy. We found that: 1. the feature selection strategies provided no identifiable advantage over a baseline method and 2. balancing produced an advantage over baseline, with random oversampling producing the best results. FAPESP (grant 11/20451-1)
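As a rough illustration of the two ideas named in this abstract, a 90:10 holdout and random oversampling, the following minimal sketch uses only the Python standard library. The toy data, the function names and the seed are assumptions made for illustration and are not taken from the paper. Oversampling is applied to the training portion only, so the held-out stories stay untouched.

```python
import random
from collections import Counter


def holdout_split(samples, test_fraction=0.1, seed=0):
    """Shuffle and split into (train, test); 0.1 gives a 90:10 holdout."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def random_oversample(train, seed=0):
    """Duplicate randomly chosen items of the smaller classes until
    every class has as many items as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in train:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced data set: the neutral class is the majority.
data = (
    [(f"story {i}", "neutral") for i in range(70)]
    + [(f"story {i}", "positive") for i in range(70, 90)]
    + [(f"story {i}", "negative") for i in range(90, 100)]
)

train, test = holdout_split(data)
balanced_train = random_oversample(train)

print("before:", Counter(label for _, label in train))
print("after: ", Counter(label for _, label in balanced_train))
```

A classifier would then be trained on `balanced_train` and scored on `test`.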

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Universidade de São Paulo

Review of Feature Selection and Optimization Strategies in Opinion Mining

Author: K.Venkata Rama Rao
Publication venue: Global Journals Inc. (US)
Publication date: 15/10/2016

Opinion mining and sentiment analysis methods have become essential for gaining insights from the huge volume of data generated by varied sources. If the veracity and variety of such data can be exploited in the opinion mining process, it could help target groups gauge the public pulse and take informed decisions. Although opinion mining and sentiment analysis have been among the hot topics for researchers, progress has not been completely revolutionary. This study reviews a varied range of models and solutions proposed for sentiment analysis and opinion mining. From the inputs gathered and the detailed study carried out, it is evident that current models remain complex in terms of evaluation and result fetching, due to constraints such as the need for comprehensive knowledge and natural language limitations. As a future direction for the domain, this paper puts forward the idea of adopting evolutionary computational methods, and hybridizations of such methods, for feature extraction.

Global Journal of Computer Science and Technology (GJCST)

On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

Author: Adiwijaya
Asriyanti Indah Pratiwi
Publication venue: Hindawi Limited
Publication date: 01/01/2018

Sentiment analysis of movie reviews is a need of today's lifestyle. Unfortunately, an enormous number of features makes sentiment analysis slow and less sensitive. Finding the optimal feature selection and classification is still a challenge. To handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification are proposed. The proposed method removes more than 90% of unnecessary features, while the proposed classification scheme achieves 96% accuracy in sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
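The abstract does not spell out its selection procedure, so the following is only a generic sketch of information gain over binary term presence, IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)]. The toy reviews, labels and function names are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs in a document."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_top_terms(docs, labels, k):
    """Keep the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(
        vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True
    )
    return ranked[:k]


# Hypothetical movie reviews, each reduced to its set of terms.
docs = [
    {"great", "film"},
    {"great", "acting"},
    {"boring", "film"},
    {"boring", "plot"},
]
labels = ["pos", "pos", "neg", "neg"]

for term in ("great", "film", "acting"):
    print(term, round(information_gain(docs, labels, term), 3))
print(select_top_terms(docs, labels, 2))
```

In this toy set, "great" and "boring" separate the classes perfectly, while "film" appears equally in both classes and carries no information.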

Crossref

Directory of Open Access Journals

Review of Feature Selection and Optimization Strategies in Opinion Mining

Author: Rama Rao K.Venkata
Publication venue: Global Journals Inc. (US)
Publication date: 22/04/2016

Global Journal of Computer Science and Technology (GJCST)

Feature extraction in opinion mining through Persian reviews

Author: E. Golpar-Rabooki
J. Rezaeenour
S. Zarghamifar
Publication venue: International Digital Organization for Scientific Information (IDOSI)
Publication date: 01/10/2015

Opinion mining deals with the analysis of user reviews to extract their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in that area. In general, opinion mining analyzes user reviews at three levels: document, sentence and feature. Opinion mining at the feature level receives more attention than the other two levels because it analyzes the orientation of different aspects of an area. In this paper, two methods are introduced for feature extraction. The recommended methods consist of four main stages. In the first stage, an opinion-mining lexicon for Persian is created. This lexicon is used to determine the orientation of users' reviews. The second is the preprocessing stage, which includes unification of writing, tokenization, part-of-speech tagging and syntactic dependency parsing of the documents. The third stage involves the extraction of features using two methods: frequency-based feature extraction and association-rule-based feature extraction. In the fourth stage, the features and polarities of the review words extracted in the previous stage are modified and the final features' polarity is determined. To assess the suggested techniques, a set of user reviews in two domains, universities and cell phones, was collected and the results of the two methods were compared.

Directory of Open Access Journals

Recognizing contextual valence shifters in document-level sentiment classification

Author: Morsy Sara Ahmed
Publication venue: AUC Knowledge Fountain
Publication date: 01/02/2012

Sentiment classification is an emerging research field. Due to the rich opinionated web content, people and organizations are interested in knowing others' opinions, so they need an automated tool for analyzing and summarizing these opinions. One of the major tasks of sentiment classification is to classify a document (i.e. a blog, news article or review) as holding an overall positive or negative sentiment. Machine learning approaches have succeeded in achieving better results than semantic orientation approaches in document-level sentiment classification; however, they still need to take linguistic context into account, by making use of the so-called contextual valence shifters. Early research has tried to add sentiment features and contextual valence shifters to the machine learning approach to tackle this problem, but the classifier's performance was low. In this study, we would like to improve the performance of document-level sentiment classification using the machine learning approach by proposing new feature sets that refine the traditional sentiment feature extraction method and take contextual valence shifters into consideration from a different perspective than the earlier research. These feature sets include: 1) a feature set consisting of 16 features for counting different categories of contextual valence shifters (intensifiers, negators and polarity shifters) as well as the frequency of words grouped according to their final (modified) polarity; and 2) another feature set consisting of the frequency of each sentiment word after modifying its prior polarity.
We performed several experiments to: 1) compare our proposed feature sets with the traditional sentiment features that count the frequency of each sentiment word while disregarding its prior polarity; 2) compare our proposed feature sets after combining them with stylistic features and n-grams with traditional sentiment features combined with stylistic features and n-grams; and 3) evaluate the effectiveness of our proposed feature sets against stylistic features and n-grams by performing feature selection. The results of all the experiments show a significant improvement over the baselines, in terms of the accuracy, precision and recall, which indicate that our proposed feature sets are effective in document-level sentiment classification.
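To make the idea of counting valence shifters and grouping words by their modified polarity more concrete, here is a deliberately small sketch. It is not the thesis's 16-feature set: the lexicons, the four feature names and the rule that a negator flips the next sentiment word (reaching only across intensifiers) are all assumptions made for illustration.

```python
# Hypothetical toy lexicons; a real system would use much larger resources.
PRIOR_POLARITY = {"good": 1, "great": 1, "bad": -1, "boring": -1}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very", "really", "extremely"}


def shifter_features(tokens):
    """Count negators and intensifiers, and count sentiment words by
    their polarity after any pending negation has been applied."""
    features = {
        "negators": 0,
        "intensifiers": 0,
        "positive_words": 0,
        "negative_words": 0,
    }
    flip = False  # True while a negator is waiting for a sentiment word
    for token in tokens:
        word = token.lower()
        if word in NEGATORS:
            features["negators"] += 1
            flip = not flip
        elif word in INTENSIFIERS:
            features["intensifiers"] += 1
        elif word in PRIOR_POLARITY:
            polarity = PRIOR_POLARITY[word]
            if flip:
                polarity = -polarity
            key = "positive_words" if polarity > 0 else "negative_words"
            features[key] += 1
            flip = False
        else:
            # Any other word cancels a pending negation in this sketch.
            flip = False
    return features


sentence = "the plot was not very good but the acting was great"
print(shifter_features(sentence.split()))
```

Here "not very good" is counted as one negative word, because the negator flips the prior polarity of "good", while "great" keeps its positive polarity.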

AUC Knowledge Fountain (American Univ. in Cairo)

Sentiment analysis of financial Twitter posts on Twitter with the machine learning classifiers

Author: Ahmed Sana
Cam Alper Veli
Cam Handan
Demirel Ugur
Publication venue: Elsevier
Publication date: 15/01/2024

This paper presents a sentiment analysis combining the lexicon-based and machine learning (ML)-based approaches in Turkish to investigate the public mood for the prediction of stock market behavior in BIST30, Borsa Istanbul. Our main motivation behind this study is to apply sentiment analysis to finance-related tweets in Turkish. We import 17189 tweets posted as "#Borsaistanbul, #Bist, #Bist30, #Bist100" on Twitter between November 7, 2022, and November 15, 2022, via MAXQDA 2020, a qualitative data analysis program. For the lexicon-based side, we use a multilingual sentiment offered by the Orange program to label the polarities of the 17189 samples as positive, negative, and neutral. Neutral labels are discarded for the machine learning experiments. For the machine learning side, we select 9076 positive and negative samples to implement the classification problem with six different supervised machine learning classifiers in Python 3.6 with the sklearn library. In the experiments, 80% of the selected data is used for the training phase and the rest is used for the testing and validation phase. Results of the experiments show that the Support Vector Machine and Multilayer Perceptron classifiers perform better than the other classifiers, with accuracies of 0.89 and 0.88 and AUC values of 0.8729 and 0.8647 respectively. The other classifiers obtain an accuracy of approximately 78.5%. It is possible to increase sentiment analysis accuracy with parameter optimization on a larger, cleaner, and more balanced dataset by changing the pre-processing steps. This work can be expanded in the future to develop better sentiment analysis using deep learning approaches.

Central Archive at the University of Reading