
    Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

    The vast majority of Ant Colony Optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create rules one at a time. An improved search strategy has been proposed in the cAnt-MinerPB algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules); that is, the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-MinerPB algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules, to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines and the cAnt-MinerPB producing ordered rules are also presented.
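The difference between ordered and unordered rules in the abstract above can be illustrated with a minimal Python sketch. It is not the cAnt-MinerPB implementation: the `Rule` layout, the example rules, and in particular the conflict-resolution strategy for the unordered case (the highest-quality matching rule wins) are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Example = Dict[str, str]
# Hypothetical rule layout: (antecedent test, predicted class, rule quality).
Rule = Tuple[Callable[[Example], bool], str, float]


def predict_ordered(rules: List[Rule], x: Example, default: str) -> str:
    """Rule list: the first rule that covers x decides the class."""
    for covers, label, _quality in rules:
        if covers(x):
            return label
    return default


def predict_unordered(rules: List[Rule], x: Example, default: str) -> str:
    """Rule set: all covering rules are considered.

    Assumed conflict resolution: the covering rule with the highest quality wins.
    """
    matching = [(quality, label) for covers, label, quality in rules if covers(x)]
    if not matching:
        return default
    return max(matching, key=lambda m: m[0])[1]


# Two illustrative rules that both cover the same example.
rules: List[Rule] = [
    (lambda x: x["outlook"] == "sunny", "no", 0.6),
    (lambda x: x["humidity"] == "normal", "yes", 0.9),
]
example = {"outlook": "sunny", "humidity": "normal"}

ordered_prediction = predict_ordered(rules, example, default="yes")
unordered_prediction = predict_unordered(rules, example, default="yes")
```

In the ordered case a rule can only be read correctly together with all rules before it, since those must have failed to match. In the unordered case each rule can be read on its own, but a prediction may depend on how conflicts between matching rules are resolved, which is why model size alone says little about interpretability.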

    Extensions to the ant-miner classification rule discovery algorithm

    Ant-Miner is an application of ACO in data mining. It was introduced by Parpinelli et al. in 2002 as an ant-based algorithm for the discovery of classification rules. Ant-Miner has proved to be a very promising technique for classification rule discovery. It generates fewer rules, with fewer terms per rule, and performs competitively in terms of efficiency compared to the C4.5 algorithm (see experimental results in [20]). Hence, it has been a focus area of research, and many modifications have been made to it in order to increase its quality in terms of classification accuracy and the comprehensibility of the output rules (reducing the size of the rule set). The thesis proposes five extensions to Ant-Miner. 1) The thesis proposes the use of a logical negation operator in the antecedents of constructed rules, so that terms in the rule antecedents can take a negated form. This tends to generate rules with higher coverage and to reduce the size of the generated rule set. 2) The thesis proposes the use of stubborn ants, an ACO variation in which an ant is allowed to take its own personal past history into consideration. Stubborn ants tend to generate rules with higher classification accuracy in fewer trials per iteration. 3) The thesis proposes the use of multiple types of pheromone, one for each permitted rule class; that is, an ant first selects the rule class and then deposits the corresponding type of pheromone. The multi-pheromone system improves the quality of the output in terms of classification accuracy as well as its comprehensibility. 4) Along with the multi-pheromone system, the thesis proposes a new pheromone update strategy called the quality contrast intensifier. This strategy rewards rules with high confidence by depositing more pheromone and penalizes rules with low confidence by removing pheromone. 5) The thesis proposes that each ant have its own values of the α and β parameters, which in a sense means that each ant has its own individual personality. To verify the efficiency of these modifications, several cross-validation experiments were run on each of the eight datasets used in the study. Average results were recorded, and a test of statistical significance was applied to determine whether the improvements are significant. Empirical results show improvements in the algorithm's performance in terms of the simplicity of the generated rule set, the number of trials, and the predictive accuracy.
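The role of per-ant α and β values can be sketched with the usual ACO selection rule, in which the probability of choosing a term is proportional to pheromone raised to α times heuristic value raised to β. This is a hedged illustration of that general rule, not the thesis's code; the term names, pheromone values and heuristic values below are made up.

```python
import random
from typing import Dict


def term_probabilities(
    pheromone: Dict[str, float],
    heuristic: Dict[str, float],
    alpha: float,
    beta: float,
) -> Dict[str, float]:
    """Standard ACO rule: p(term) is proportional to pheromone**alpha * heuristic**beta."""
    weights = {t: (pheromone[t] ** alpha) * (heuristic[t] ** beta) for t in pheromone}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def select_term(probabilities: Dict[str, float], rng: random.Random) -> str:
    """Roulette-wheel selection of one term according to its probability."""
    terms = list(probabilities)
    return rng.choices(terms, weights=[probabilities[t] for t in terms])[0]


# Hypothetical candidate terms with equal pheromone but different heuristic values.
pheromone = {"outlook=sunny": 0.5, "humidity=high": 0.5}
heuristic = {"outlook=sunny": 0.8, "humidity=high": 0.2}

# Two ants with different "personalities": one trusts the heuristic, one ignores it.
heuristic_driven = term_probabilities(pheromone, heuristic, alpha=1.0, beta=1.0)
pheromone_driven = term_probabilities(pheromone, heuristic, alpha=1.0, beta=0.0)

chosen = select_term(heuristic_driven, random.Random(0))
```

With β = 0 the heuristic is ignored and both terms become equally likely, while β = 1 makes the ant favor the term with the higher heuristic value. Giving each ant its own pair of values therefore mixes more exploratory and more greedy ants in one colony.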

    A Hybrid of Ant Colony Optimization Algorithm and Simulated Annealing for Classification Rules

    Ant colony optimization (ACO) is a metaheuristic approach inspired by the behaviour of natural ants and can be used to solve a variety of combinatorial optimization problems. Classification rule induction is one of the problems solved by the Ant-miner algorithm, a variant of ACO, which was initiated by Parpinelli in 2001. Previous studies have shown that ACO is a promising machine learning technique for generating classification rules. However, the Ant-miner is less class focused, since the rule's class is assigned after the rule has been constructed. There are also cases where the Ant-miner cannot find an optimal solution for some data sets. This thesis therefore proposes two variants of a hybrid of ACO with the simulated annealing (SA) algorithm for solving the problem of classification rule induction. In the first proposed algorithm, SA is used to optimize the rule discovery activity of an ant. Benchmark data sets from various fields were used to test the proposed algorithms. Experimental results obtained from this first algorithm are comparable to those of the Ant-miner and other well-known rule induction algorithms in terms of rule accuracy, but are better in terms of rule simplicity. The second proposed algorithm uses SA to optimize term selection while constructing a rule. Because this algorithm fixes the class before each rule's construction, a much simpler heuristic and fitness function is proposed. In the experiments, the second algorithm achieves much higher predictive accuracy than the algorithms it is compared with. The successful hybridization of the ACO and SA algorithms has improved the learning ability of ACO for classification. Thus, a classification model with higher predictive power could be generated for various fields.
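For readers unfamiliar with simulated annealing, the core of SA is its acceptance test: a better candidate is always accepted, and a worse one is accepted with a probability that shrinks as the temperature falls. The sketch below shows the standard Metropolis criterion applied to a rule-quality score that is being maximized. How the thesis couples this test to ant-built rules is not specified in the abstract, so the function name and the quality values are illustrative assumptions.

```python
import math
import random


def accept(
    current_quality: float,
    candidate_quality: float,
    temperature: float,
    rng: random.Random,
) -> bool:
    """Metropolis acceptance test for a maximization problem."""
    if candidate_quality >= current_quality:
        return True  # improvements are always accepted
    # A worse candidate is accepted with probability exp(delta / T), delta < 0.
    delta = candidate_quality - current_quality
    return rng.random() < math.exp(delta / temperature)


rng = random.Random(42)
better_accepted = accept(0.5, 0.7, temperature=1.0, rng=rng)
# At a near-zero temperature, a worse candidate is effectively never accepted.
worse_when_cold = accept(0.7, 0.5, temperature=1e-9, rng=rng)
```

In a full SA loop the temperature would be lowered step by step (for example by multiplying it by a constant below 1), so the search accepts many worse rules early on and becomes nearly greedy at the end.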

    Adaptive Parameter Control Strategy for Ant-Miner Classification Algorithm

    Pruning is a popular approach for preventing overfitting to noisy data. This paper presents a new hybrid of the Ant-Miner classification algorithm and the ant colony system (ACS), called ACS-AntMiner. A key aspect of this algorithm is the selection of an appropriate number of terms to be included in the classification rule. ACS-AntMiner introduces a new parameter called the importance rate (IR), a pre-pruning criterion based on the probability (heuristic and pheromone) amount. This criterion is responsible for adding only the important terms to each rule, thus discarding noisy data. The ACS algorithm is designed to optimize the IR parameter during the learning process of the Ant-Miner algorithm. The performance of the proposed classifier is compared with related ant-mining classifiers, namely Ant-Miner, CAnt-Miner, TACO-Miner, and Ant-Miner with a hybrid pruner, across several datasets. Experimental results show that the proposed classifier significantly outperforms the other ant-mining classifiers.

    New Archive-Based Ant Colony Optimization Algorithms for Learning Predictive Rules from Data

    Data mining is the process of extracting knowledge and patterns from data. Classification and regression are among the major data mining tasks, where the goal is to predict the value of an attribute of interest for each data instance, given the values of a set of predictive attributes. Most classification and regression problems involve continuous, ordinal and categorical attributes. Current Ant Colony Optimization (ACO) algorithms have focused on directly handling categorical attributes only; continuous attributes are transformed using a discretisation procedure, either in a preprocessing stage or dynamically during rule creation. The use of a discretisation procedure has several limitations: (i) it increases the computational runtime, since several candidate values need to be evaluated; (ii) it requires access to the entire attribute domain, which is not available in some applications; (iii) the values used to create discrete intervals are not optimised in combination with the values of other attributes. This thesis investigates the use of a solution archive pheromone model, based on the Ant Colony Optimization for mixed-variable (ACOMV) algorithm, to cope directly with all attribute types. Firstly, an archive-based ACO classification algorithm is presented, followed by an automatic design framework to generate new configurations of ACO algorithms. Then, we address the challenging problem of mining data streams, presenting a new ACO algorithm in combination with a hybrid pheromone model. Finally, the archive-based approach is extended to cope with regression problems. All algorithms presented are compared against well-known algorithms from the literature using publicly available data sets. Our results show an improvement in computational time while maintaining competitive predictive performance.
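To give a feel for how a solution archive can replace discretisation, the sketch below follows the continuous sampling scheme commonly described for archive-based ACO: archive entries are weighted by rank, one entry is picked, and a new value is drawn from a Gaussian centred on it. It is an assumption-laden illustration rather than the thesis's algorithm; the parameter names `q` and `xi`, the function names and the archive values are chosen here for the example.

```python
import math
import random
from typing import List


def rank_weights(k: int, q: float) -> List[float]:
    """Normalized Gaussian weights over ranks 1..k; rank 1 is the best solution."""
    raw = [
        math.exp(-((rank - 1) ** 2) / (2 * (q * k) ** 2)) / (q * k * math.sqrt(2 * math.pi))
        for rank in range(1, k + 1)
    ]
    total = sum(raw)
    return [w / total for w in raw]


def spread(values: List[float], chosen: int, xi: float) -> float:
    """Standard deviation: xi times the mean distance from the chosen value to the others."""
    distances = [abs(v - values[chosen]) for i, v in enumerate(values) if i != chosen]
    return xi * sum(distances) / (len(values) - 1)


def sample_value(values: List[float], q: float, xi: float, rng: random.Random) -> float:
    """Draw a new continuous value (e.g. an attribute threshold) from the archive."""
    weights = rank_weights(len(values), q)
    chosen = rng.choices(range(len(values)), weights=weights)[0]
    return rng.gauss(values[chosen], spread(values, chosen, xi))


# Hypothetical archive of thresholds for one continuous attribute, best first.
archive = [0.2, 0.4, 0.9]
weights = rank_weights(len(archive), q=0.5)
sigma_for_best = spread(archive, chosen=0, xi=0.5)
new_threshold = sample_value(archive, q=0.5, xi=0.5, rng=random.Random(7))
```

Because new values are sampled around good archived values instead of being picked from a precomputed set of intervals, no candidate split points need to be enumerated for the attribute.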

    Ant colony optimization for rule induction with simulated annealing for terms selection

    This paper proposes a sequential-covering-based algorithm that uses an ant colony optimization algorithm to extract classification rules directly from the data set. The proposed algorithm uses a Simulated Annealing algorithm to optimize term selection while growing a rule. By optimizing term selection during rule construction, it reduces the problem of an ant in the colony discovering a low-quality rule, that is, a rule that is not the best-quality rule available. Seventeen data sets from the UCI repository, consisting of discrete and continuous data, are used to evaluate the performance of the proposed algorithm. Promising results are obtained when compared to the Ant-Miner algorithm and the PART algorithm in terms of the average predictive accuracy of the discovered classification rules.

    Ant colony optimization algorithm for rule based classification: Issues and potential

    Classification rule discovery using ant colony optimization (ACO) imitates the foraging behavior of real ant colonies. It is considered one of the successful swarm intelligence metaheuristics for data classification. ACO has gained importance because of its stochastic nature and its iterative adaptation procedure based on positive feedback, both of which allow for the exploration of a large area of the search space. Nevertheless, ACO also has several drawbacks that may degrade the classification accuracy and the computational time of the algorithm. This paper presents a review of related work on ACO rule classification which emphasizes the types of ACO algorithms and their issues. Potential solutions that may be considered to improve the performance of ACO algorithms in the classification domain are also presented. Furthermore, this review can serve as a reference for other researchers developing new ACO algorithms for rule classification.

    Improving the hierarchical classification of protein functions With swarm intelligence

    This thesis investigates methods to improve the performance of hierarchical classification. In the context of this thesis, hierarchical classification is a form of supervised learning where the classes in a data set are arranged in a tree structure. As a base for our new methods we use the TDDC (top-down divide-and-conquer) approach for hierarchical classification, where each classifier is built only to discriminate between sibling classes. Firstly, we propose a swarm intelligence technique which varies the types of classifiers used at each divide within the TDDC tree. Our technique, PSO/ACO-CS (Particle Swarm Optimisation/Ant Colony Optimisation Classifier Selection), finds combinations of classifiers to be used in the TDDC tree using the global search ability of PSO/ACO. Secondly, we propose a technique that attempts to mitigate a major drawback of the TDDC approach: if an example is misclassified at any point in the TDDC tree, it can never be correctly classified further down the tree. Our approach, PSO/ACO-RO (PSO/ACO-Recovery Optimisation), decides whether to redirect examples at a given classifier node using, again, the global search ability of PSO/ACO. Thirdly, we propose an ensemble-based technique, HEHRS (Hierarchical Ensembles of Hierarchical Rule Sets), which attempts to boost the accuracy at each classifier node in the TDDC tree by using information from classifiers (rule sets) in the rest of that tree. We use Particle Swarm Optimisation to weight the individual rules within each ensemble. We evaluate these three new methods on hierarchical bioinformatics data sets that we have created for this research. These data sets represent the real-world problem of protein function prediction. We find through extensive experimentation that the three proposed methods improve upon the baseline TDDC method to varying degrees. Overall the HEHRS and PSO/ACO-CS-RO approaches are most effective, although they are associated with a higher computational cost.
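The top-down prediction process, and the drawback the abstract describes, can be sketched in a few lines of Python. This is a hedged illustration only: the `Node` structure, the toy classifiers and the class names are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One class in the hierarchy; internal nodes own a classifier over their children."""
    label: str
    classify: Optional[Callable[[dict], str]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


def predict_path(root: Node, x: dict) -> List[str]:
    """Walk from the root to a leaf, asking each node's classifier to pick a child."""
    path: List[str] = []
    node = root
    while node.children:
        chosen = node.classify(x)
        path.append(chosen)
        node = node.children[chosen]
    return path


# Hypothetical two-level hierarchy with hand-written stand-in classifiers.
enzyme = Node(
    "enzyme",
    classify=lambda x: "hydrolase" if x["motif_a"] else "ligase",
    children={"hydrolase": Node("hydrolase"), "ligase": Node("ligase")},
)
transporter = Node("transporter")
root = Node(
    "root",
    classify=lambda x: "enzyme" if x["catalytic"] else "transporter",
    children={"enzyme": enzyme, "transporter": transporter},
)

path = predict_path(root, {"catalytic": True, "motif_a": True})
# A wrong decision at the root can never reach "hydrolase" further down.
misrouted = predict_path(root, {"catalytic": False, "motif_a": True})
```

Each classifier only separates sibling classes, which keeps the individual problems small, but the second call shows the cost: once the root sends an example to the wrong branch, no classifier below it can recover the correct class.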

    Analysis of New Student Selection via the Non-Written Admission Track Using the Ant-Miner Algorithm

    The non-written admission track is one of the university's entry routes with a high number of applicants. This final project discusses how to select new students in the non-written track using rules obtained from a classification method. The resulting classification rules are expected to help evaluate new student admissions through the non-written track. The selection data for this track can be classified with the Ant-Miner algorithm. Ant-Miner (ant-colony-based data miner) is an algorithm for extracting classification rules from data, and it has given fairly satisfactory results on several kinds of complex data on which it has been tested. This final project also aims to produce an analysis of applying the Ant-Miner algorithm to the non-written-track selection data. The results show that the Ant-Miner algorithm achieves fairly good training and testing accuracy, does not overfit, and produces rules with good accuracy, recall and precision, so the rules can be used to evaluate new student admissions through the non-written track.