3,357 research outputs found
Opinion Mining of Movie Review using Hybrid Method of Support Vector Machine and Particle Swarm Optimization
Online social media is a form of online
discourse in which people create content, share
it, bookmark it, and network at an impressive rate. Among
social media, Twitter stands out for its fast messaging and
ease of use. Messages on Twitter include reviews and
opinions on topics such as movies, books, products,
politics, and so on. Given this, the research
uses Twitter messages to review a movie by
means of opinion mining, or sentiment analysis. Opinion mining
refers to the application of natural language processing,
computational linguistics, and text mining to identify or
classify whether the movie is good or not based on the
opinions in messages. Support Vector Machine (SVM) is a supervised
learning method that analyzes data and recognizes
patterns used for classification. This research
addresses binary classification into two
classes: positive and negative. The positive
class indicates a good opinion of a movie, whereas the negative
class indicates a bad opinion. The
evaluation is based on the accuracy of SVM, with
validation by 10-fold cross-validation and a
confusion matrix. Particle Swarm Optimization
(PSO) is hybridized with SVM to improve the selection of the best parameters in
order to solve the dual optimization problem. The results
show an improvement in accuracy from 71.87% to
77%.
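The validation procedure named in this abstract (10-fold cross-validation scored through a confusion matrix) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: `train_and_predict` is a hypothetical callback standing in for the PSO-tuned SVM, and the label encoding (1 = positive, 0 = negative) is an assumption.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]

def confusion_matrix(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (assumed: 1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["tn" if truth == 0 else "fn"] += 1
    return counts

def accuracy(cm):
    """Accuracy = (TP + TN) / all predictions."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0

def cross_validate(features, labels, train_and_predict, k=10, seed=0):
    """Pool the out-of-fold predictions of all k folds into one confusion matrix.

    train_and_predict(train_x, train_y, test_x) is a hypothetical callback
    that fits a classifier and returns predicted labels for test_x.
    """
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        preds = train_and_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in held_out],
        )
        y_true.extend(labels[i] for i in held_out)
        y_pred.extend(preds)
    return confusion_matrix(y_true, y_pred)
```

Any classifier can be plugged in through the callback; the accuracy reported for a given parameter setting is then `accuracy(cross_validate(...))`, which is the kind of score a parameter search would compare.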
New techniques for Arabic document classification
Text classification (TC) concerns automatically assigning a class (category) label to
a text document, and has a growing number of applications, particularly in
organizing large document collections for browsing. It is typically achieved
via machine learning, where a model is built on the basis of a typically large collection
of document features. Feature selection is critical in this process, since there
are typically several thousand potential features (distinct words or terms). In text
classification, feature selection aims to improve the computational efficiency and
classification accuracy by removing irrelevant and redundant terms (features), while
retaining features (words) that contain sufficient information to help with the
classification task.
This thesis proposes binary particle swarm optimization (BPSO) hybridized with
either K Nearest Neighbour (KNN) or Support Vector Machines (SVM) for feature
selection in Arabic text classification tasks. Comparison between feature selection
approaches is done on the basis of using the selected features in conjunction with
SVM, Decision Trees (C4.5), and Naive Bayes (NB), to classify a hold out test
set. Using publicly available Arabic datasets, results show that BPSO/KNN and
BPSO/SVM techniques are promising in this domain. The sets of selected features
(words) are also analyzed to consider the differences between the types of features
that BPSO/KNN and BPSO/SVM tend to choose. This leads to speculation concerning
the appropriate feature selection strategy, based on the relationship between
the classes in the document categorization task at hand.
The thesis also investigates the use of statistically extracted phrases of length
two as terms in Arabic text classification. In comparison with Bag of Words text
representation, results show that using phrases alone as terms in Arabic TC task
decreases the classification accuracy of Arabic TC classifiers significantly while combining
bag of words and phrase based representations may increase the classification
accuracy of the SVM classifier slightly.
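The binary particle swarm optimization (BPSO) wrapper described in this abstract can be sketched as below. This follows a common textbook formulation (velocity update with an inertia weight, sigmoid transfer function to resample each bit); the thesis's exact variant and settings are not given here, so every parameter value is an assumption. The `fitness` argument is a hypothetical stand-in for the KNN or SVM accuracy obtained with the selected features.

```python
import math
import random

def bpso_select(fitness, n_features, n_particles=10, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask).

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                # Pull the velocity toward the personal and global best bits.
                v = (w * velocities[i][d]
                     + c1 * rng.random() * (pbest[i][d] - positions[i][d])
                     + c2 * rng.random() * (gbest[d] - positions[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp to avoid saturation
                velocities[i][d] = v
                # Sigmoid maps velocity to the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                positions[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(positions[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit

# Hypothetical toy fitness: reward three "informative" features and
# penalize mask size. A real wrapper would return classifier accuracy.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(1 for d in INFORMATIVE if mask[d])
    return hits - 0.1 * sum(mask)

mask, score = bpso_select(toy_fitness, n_features=8)
```

Swapping `toy_fitness` for a function that trains KNN or SVM on the masked features and returns validation accuracy gives the BPSO/KNN and BPSO/SVM pairing the abstract describes.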
Hybrid ACO and SVM algorithm for pattern classification
Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach that originated from statistical approaches. However, SVM suffers from two main problems: feature subset selection and parameter tuning. Most approaches to tuning SVM parameters discretize the continuous values of the parameters, which has a negative effect on classification performance. This study presents four algorithms for tuning the SVM parameters and selecting the feature subset, which improve SVM classification accuracy with a smaller feature subset. This is achieved by performing SVM parameter tuning and feature subset selection simultaneously. Hybrid algorithms combining ACO and SVM techniques are proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters, while the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from the University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. The proposed algorithms give better results than other approaches in terms of classification accuracy and feature subset size. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average feature subset size is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed variables.
Feature Selection Using Hybrid Binary Grey Wolf Optimizer for Arabic Text Classification
Feature selection in Arabic text is a challenging task due to the complex and rich nature of Arabic. Feature selection requires solution quality, stability, convergence speed, and the ability to find the global optimum. This study proposes a feature selection method using the Hybrid Binary Grey Wolf Optimizer (HBGWO) for Arabic text classification. The HBGWO method combines the exploratory (local search) capabilities of BGWO with the exploitative capabilities of PSO, which searches around the best solutions. The HBGWO method also incorporates SCA's capability for finding global solutions. The data set consists of Arabic text from islambook.com, comprising five Hadith books. Five classes were selected from the books: Tauhid, Prayer, Zakat, Fasting, and Hajj. The results showed that the BGWO-PSO-SCA feature selection method, with the fitness function search and SVM as the classification method, could perform better on Arabic text classification problems. BGWO-PSO with the fitness function and SVM classification (C=1.0) gives an accuracy of 76.37%, higher than without feature selection. The BGWO-PSO-SCA feature selection method provides an accuracy of 88.08%. This accuracy is higher than that of BGWO-PSO feature selection and other feature selection methods.
Modeling, forecasting and trading the EUR exchange rates with hybrid rolling genetic algorithms: support vector regression forecast combinations
The motivation of this paper is to introduce a hybrid Rolling Genetic Algorithm-Support Vector Regression (RG-SVR) model for optimal parameter selection and feature subset combination. The algorithm is applied to the task of forecasting and trading the EUR/USD, EUR/GBP and EUR/JPY exchange rates. The proposed methodology genetically searches over a feature space (pool of individual forecasts) and then combines the optimal feature subsets (SVR forecast combinations) for each exchange rate. This is achieved by applying a fitness function specialized for financial purposes and adopting a sliding window approach. The individual forecasts are derived from several linear and non-linear models. RG-SVR is benchmarked against the genetically and non-genetically optimized SVR and SVM models that dominate the relevant literature, along with the robust ARBF-PSO neural network. The statistical and trading performance of all models is investigated over the period 1999–2012. RG-SVR presents the best performance in terms of statistical accuracy and trading efficiency for all the exchange rates under study. This superiority confirms the success of the implemented fitness function and training procedure, and validates the benefits of the proposed algorithm.
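The sliding window approach mentioned in this abstract can be illustrated with a small generator of rolling train/test index ranges. This is only a sketch: the function name, window sizes, and step are hypothetical, since the abstract does not state the paper's actual settings.

```python
def rolling_windows(n_obs, train_size, test_size, step=None):
    """Yield (train, test) index ranges that slide forward through a series.

    Each window trains on `train_size` consecutive observations and tests
    on the `test_size` observations that immediately follow, so the test
    data is always later in time than the training data.
    """
    step = step or test_size  # by default, test blocks do not overlap
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step

# Example with assumed sizes: 10 observations, train on 4, test on the next 2.
windows = list(rolling_windows(10, train_size=4, test_size=2))
```

In a rolling scheme of this kind, the model (and any parameter or feature search) would be refit on each training range before forecasting the test range that follows it.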