4,022 research outputs found
Tackling Ant Colony Optimization Meta-Heuristic as Search Method in Feature Subset Selection Based on Correlation or Consistency Measures
This paper introduces the use of an ant colony optimization
(ACO) algorithm, called Ant System, as a search method in two well-known
feature subset selection methods based on correlation or consistency
measures such as CFS (Correlation-based Feature Selection) and
CNS (Consistency-based Feature Selection). ACO guides the search using
a heuristic evaluator. Empirical results on twelve real-world classification
problems are reported. Statistical tests have revealed that InfoGain is a
very suitable heuristic for CFS or CNS feature subset selection methods
with ACO acting as the search method. InfoGain is shown to be
a significantly better heuristic across a range of classifiers. On most of
the problems, ACO-based feature subset selection with a suitable
heuristic evaluator gives better results than CFS or CNS combined with
Best First search.
Funding: MICYT TIN2007-68084-C02-02; MICYT TIN2011-28956-C02-02; Junta de Andalucía P11-TIC-752
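As a rough illustration of the idea in the abstract above, the following sketch runs a simplified Ant System loop over feature subsets and scores each subset with the standard CFS merit, k * mean(r_cf) / sqrt(k + k(k-1) * mean(r_ff)). It is not the paper's implementation: the function names, the fixed subset size, the parameter defaults, and the toy correlation values are all assumptions, and the per-feature `heuristic` list merely stands in for a measure such as InfoGain.

```python
import math
import random

def cfs_merit(subset, r_cf, r_ff):
    """CFS merit of a feature subset (higher is better)."""
    k = len(subset)
    if k == 0:
        return 0.0
    mean_cf = sum(r_cf[i] for i in subset) / k
    if k == 1:
        return mean_cf
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    mean_ff = sum(r_ff[i][j] for i, j in pairs) / len(pairs)
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)

def aco_select(r_cf, r_ff, heuristic, subset_size=2, n_ants=10, n_iter=30,
               alpha=1.0, beta=1.0, rho=0.1, seed=0):
    """Simplified Ant System search; heuristic values must be positive."""
    rng = random.Random(seed)
    n = len(r_cf)
    tau = [1.0] * n                      # one pheromone value per feature
    best, best_merit = [], -1.0
    for _ in range(n_iter):
        solutions = []
        for _ in range(n_ants):
            remaining = list(range(n))
            subset = []
            while len(subset) < subset_size:
                # Selection probability ~ pheromone^alpha * heuristic^beta
                weights = [tau[f] ** alpha * heuristic[f] ** beta
                           for f in remaining]
                f = rng.choices(remaining, weights=weights, k=1)[0]
                subset.append(f)
                remaining.remove(f)
            subset.sort()
            merit = cfs_merit(subset, r_cf, r_ff)
            solutions.append((subset, merit))
            if merit > best_merit:
                best, best_merit = subset, merit
        # Evaporate, then let every ant deposit pheromone on its features
        tau = [(1 - rho) * t for t in tau]
        for subset, merit in solutions:
            for f in subset:
                tau[f] += merit
    return best, best_merit

# Made-up toy values: features 0 and 1 are predictive but redundant.
r_cf = [0.8, 0.75, 0.1, 0.7]             # feature-class correlations
r_ff = [[1.0, 0.95, 0.3, 0.3],           # feature-feature correlations
        [0.95, 1.0, 0.3, 0.3],
        [0.3, 0.3, 1.0, 0.3],
        [0.3, 0.3, 0.3, 1.0]]
best, merit = aco_select(r_cf, r_ff, heuristic=r_cf)
print(best, round(merit, 4))
```

In this toy setting the merit penalises the redundant pair, so the search should prefer one of the two correlated features together with an independent one.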
Formulating new enhanced pattern classification algorithms based on ACO-SVM
This paper presents two algorithms that integrate two new Ant Colony Optimization (ACO) variants, Incremental Continuous Ant Colony Optimization (IACOR) and Incremental Mixed Variable Ant Colony Optimization (IACOMV), with Support Vector Machine (SVM) to enhance the performance of SVM. The first algorithm addresses the SVM model selection problem. ACO originally deals with discrete optimization problems. Applying ACO to SVM model selection, whose parameters are continuous, requires discretizing the continuous values. This discretization loses some information and hence affects the classification accuracy and search time. In this algorithm we propose to solve the SVM model selection problem using IACOR, without the need to discretize the continuous values for SVM. The second algorithm aims to solve the SVM model selection problem and select a small number of features simultaneously. SVM model selection and the selection of a suitable, small feature subset must occur simultaneously because errors produced in the feature subset selection phase affect the values of SVM model selection and result in low classification accuracy. In this second algorithm we propose the use of IACOMV to solve SVM model selection and feature subset selection simultaneously. Ten benchmark datasets were used to evaluate the proposed algorithms. Results showed that the proposed algorithms can enhance classification accuracy with a small feature subset.
Feature Selection for Document Classification : Case Study of Meta-heuristic Intelligence and Traditional Approaches
Doctor of Philosophy (Computer Engineering), 2020
Nowadays, the way people access news around the world has shifted from paper to electronic formats, and the rate at which newspapers and magazines publish on websites has increased dramatically. Meanwhile, text feature selection for automatic document classification (ADC) is becoming a big challenge because of the unstructured nature of text features, known as the “multi-dimension feature problem”. Although various powerful schemes for text feature selection are continuously being developed, a research gap remains for the “optimization of feature selection problem (OFSP)”, which looks for the globally optimal features. Meta-heuristic intelligence for the knowledge discovery process (KDP) also plays a critical role in overcoming the NP-hard OFSP by providing effective performance and efficient computation time. Therefore, this research proposes a meta-heuristic based approach to feature selection optimization that searches for the globally optimal features for ADC.
In this thesis, a case study of meta-heuristic intelligence and traditional approaches for feature selection optimization in document classification is presented. It includes eleven meta-heuristic algorithms (Ant Colony search, Artificial Bee Colony search, Bat search, Cuckoo search, Evolutionary search, Elephant search, Firefly search, Flower search, Genetic search, Rhinoceros search, and Wolf search) for searching for the optimal feature subset for document classification. The results of the proposed model are then compared with three traditional search algorithms: Best First search (BFS), Greedy Stepwise (GS), and Ranker search (RS). In addition, a data mining framework is applied. It involves data preprocessing, feature engineering, building the learning model, and evaluating the performance of the proposed meta-heuristic intelligence-based feature selection using various performance and computational complexity evaluation schemes. In data preprocessing, tokenization, stop-word handling, stemming and lemmatizing, and normalization are applied. In the feature engineering process, n-gram TF-IDF feature extraction is used to build the feature vector, and both filter and wrapper approaches are applied to observe different cases. In addition, three classifiers (J48, Naïve Bayes, and Support Vector Machine) are used to build the document classification model. According to the results, the proposed system can dramatically reduce the number of selected features that can deteriorate learning model performance. In addition, the selected global feature subset can yield better performance than traditional search according to the single objective function of the proposed model.
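To make the n-gram TF-IDF step mentioned in the abstract above concrete, here is a minimal pure-Python sketch. It assumes whitespace tokenization, unigrams plus bigrams, and the common weighting tf * log(N / df); the thesis may use a different variant, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=2):
    """Map each document to a sparse {term: tf * log(N / df)} vector."""
    term_lists = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: in how many documents each term occurs
    df = Counter(t for terms in term_lists for t in set(terms))
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        vectors.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in counts.items()})
    return vectors

# Hypothetical sample documents
docs = ["ant colony search", "bee colony search", "ant colony optimization"]
vecs = tfidf(docs)
for term, weight in sorted(vecs[0].items()):
    print(f"{term}: {weight:.4f}")
```

A term that occurs in every document, such as "colony" here, receives weight zero under this variant, which is one reason feature selection is usually applied after weighting.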
Learning Multi-Tree Classification Models with Ant Colony Optimization
Ant Colony Optimization (ACO) is a meta-heuristic for solving combinatorial optimization problems, inspired by the behaviour of biological ant colonies. One of the successful applications of ACO is learning classification models (classifiers). A classifier encodes the relationships between the input attribute values and the values of a class attribute in a given set of labelled cases, and it can be used to predict the class value of new unlabelled cases. Decision trees have been widely used as a type of classification model that represents comprehensible knowledge to the user. In this paper, we propose the use of ACO-based algorithms for learning an extended multi-tree classification model, which consists of multiple decision trees, one for each class value. Each class-based decision tree is responsible for discriminating between its class value and all other values available in the class domain. Our proposed algorithms are empirically evaluated against well-known decision tree induction algorithms, as well as the ACO-based Ant-Tree-Miner algorithm. The results show an overall improvement in predictive accuracy over 32 benchmark datasets. We also discuss how the new multi-tree models can provide the user with more understanding and knowledge-interpretability in a given domain.
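The multi-tree structure described above, one decision tree per class value, can be sketched as follows. The abstract does not say how the per-class trees are combined at prediction time, so picking the class whose tree reports the highest confidence is an assumption here, as are the tuple-based tree encoding, the attribute names, and the leaf values.

```python
def tree_score(node, x):
    """Walk a tiny decision tree; leaves hold the confidence for the tree's class.

    Internal nodes are (attribute, threshold, left, right) tuples:
    go left when x[attribute] <= threshold, otherwise right.
    """
    while isinstance(node, tuple):
        attr, threshold, left, right = node
        node = left if x[attr] <= threshold else right
    return node

def predict_multi_tree(trees, x):
    """Score x with every class-specific tree and return the best class."""
    scores = {label: tree_score(tree, x) for label, tree in trees.items()}
    return max(scores, key=scores.get), scores

# Hypothetical hand-written trees, one per class value
trees = {
    "A": ("x1", 2.5, 0.98, 0.01),
    "B": ("x1", 2.5, 0.02, ("x2", 1.7, 0.90, 0.05)),
    "C": ("x2", 1.7, 0.05, 0.92),
}
label, scores = predict_multi_tree(trees, {"x1": 4.5, "x2": 1.3})
print(label, scores)
```

Because each tree only has to separate its own class from the rest, a user can read the tree for a single class in isolation, which is the interpretability benefit the abstract points to.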