3,357 research outputs found

    Opinion Mining of Movie Review using Hybrid Method of Support Vector Machine and Particle Swarm Optimization

    Get PDF
    Nowadays, online social media is online discourse where people contribute to create content, share it, bookmark it, and network at an impressive rate. The faster message and ease of use in social media today is Twitter. The messages on Twitter include reviews and opinions on certain topics such as movie, book, product, politic, and so on. Based on this condition, this research attempts to use the messages of twitter to review a movie by using opinion mining or sentiment analysis. Opinion mining refers to the application of natural language processing, computational linguistics, and text mining to identify or classify whether the movie is good or not based on message opinion. Support Vector Machine (SVM) is supervised learning methods that analyze data and recognize the patterns that are used for classification. This research concerns on binary classification which is classified into two classes. Those classes are positive and negative. The positive class shows good message opinion; otherwise the negative class shows the bad message opinion of certain movies. This justification is based on the accuracy level of SVM with the validation process uses 10-Fold cross validation and confusion matrix. The hybrid Partical Swarm Optimization (PSO) is used to improve the election of best parameter in order to solve the dual optimization problem. The result shows the improvement of accuracy level from 71.87% to 77%

    New techniques for Arabic document classification

    Get PDF
    Text classification (TC) concerns automatically assigning a class (category) label to a text document, and has increasingly many applications, particularly in the domain of organizing, for browsing in large document collections. It is typically achieved via machine learning, where a model is built on the basis of a typically large collection of document features. Feature selection is critical in this process, since there are typically several thousand potential features (distinct words or terms). In text classification, feature selection aims to improve the computational e ciency and classification accuracy by removing irrelevant and redundant terms (features), while retaining features (words) that contain su cient information that help with the classification task. This thesis proposes binary particle swarm optimization (BPSO) hybridized with either K Nearest Neighbour (KNN) or Support Vector Machines (SVM) for feature selection in Arabic text classi cation tasks. Comparison between feature selection approaches is done on the basis of using the selected features in conjunction with SVM, Decision Trees (C4.5), and Naive Bayes (NB), to classify a hold out test set. Using publically available Arabic datasets, results show that BPSO/KNN and BPSO/SVM techniques are promising in this domain. The sets of selected features (words) are also analyzed to consider the di erences between the types of features that BPSO/KNN and BPSO/SVM tend to choose. This leads to speculation concerning the appropriate feature selection strategy, based on the relationship between the classes in the document categorization task at hand. The thesis also investigates the use of statistically extracted phrases of length two as terms in Arabic text classi cation. In comparison with Bag of Words text representation, results show that using phrases alone as terms in Arabic TC task decreases the classification accuracy of Arabic TC classifiers significantly while combining bag of words and phrase based representations may increase the classification accuracy of the SVM classifier slightly

    Hybrid ACO and SVM algorithm for pattern classification

    Get PDF
    Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach originated from statistical approaches. However, SVM suffers two main problems which include feature subset selection and parameter tuning. Most approaches related to tuning SVM parameters discretize the continuous value of the parameters which will give a negative effect on the classification performance. This study presents four algorithms for tuning the SVM parameters and selecting feature subset which improved SVM classification accuracy with smaller size of feature subset. This is achieved by performing the SVM parameters’ tuning and feature subset selection processes simultaneously. Hybridization algorithms between ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters while the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results obtained from the proposed algorithms are better when compared with other approaches in terms of classification accuracy and size of the feature subset. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average size of feature subset is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed-variable ACO

    Feature Selection Using Hybrid Binary Grey Wolf Optimizer for Arabic Text Classification

    Get PDF
    Feature selection in Arabic text is a challenging task due to the complex and rich nature of Arabic. The feature selection requires solution quality, stability, conver- gence speed, and the ability to find the global optimal. This study proposes a feature selection method using the Hybrid Binary Gray Wolf Optimizer (HBGWO) for Ara- bic text classification. The HBGWO method combines the local search capabilities or exploratory of the BGWO and the search capabilities around the best solutions or exploits of the PSO. HBGWO method also combines SCA’s capabilities in finding global solutions. The data set used Arabic text from islambook.com, which consists of five Hadith books. The books selected five classes: Tauhid, Prayer, Zakat, Fasting, and Hajj. The results showed that the BGWO-PSO-SCA feature selection method with the fitness function search and classification method using SVM could per- form better on Arabic text classification problems. BGWO-PSO with fitness function and the classification method using SVM (C=1.0) gives a high accuracy value of 76.37% compared to without feature selection. The BGWO-PSO-SCA feature selec- tion method provides an accuracy value of 88.08%. This accuracy value is higher than the BGWO-PSO feature selection and other feature selection methods

    Modeling, forecasting and trading the EUR exchange rates with hybrid rolling genetic algorithms: support vector regression forecast combinations

    Get PDF
    The motivation of this paper is to introduce a hybrid Rolling Genetic Algorithm-Support Vector Regression (RG-SVR) model for optimal parameter selection and feature subset combination. The algorithm is applied to the task of forecasting and trading the EUR/USD, EUR/GBP and EUR/JPY exchange rates. The proposed methodology genetically searches over a feature space (pool of individual forecasts) and then combines the optimal feature subsets (SVR forecast combinations) for each exchange rate. This is achieved by applying a fitness function specialized for financial purposes and adopting a sliding window approach. The individual forecasts are derived from several linear and non-linear models. RG-SVR is benchmarked against genetically and non-genetically optimized SVRs and SVMs models that are dominating the relevant literature, along with the robust ARBF-PSO neural network. The statistical and trading performance of all models is investigated during the period of 1999–2012. As it turns out, RG-SVR presents the best performance in terms of statistical accuracy and trading efficiency for all the exchange rates under study. This superiority confirms the success of the implemented fitness function and training procedure, while it validates the benefits of the proposed algorithm
    • …
    corecore