
    Attribute Selection Methods in Rough Set Theory

    Attribute selection for rough sets is an NP-hard problem, so fast heuristic algorithms are needed to find reducts. In this project, two reduct-computation methods for rough sets were implemented: particle swarm optimization and Johnson’s method. Both algorithms were evaluated on five benchmarks from the KEEL repository, and the results were compared with those obtained by the ROSETTA software on the same benchmarks. The comparison shows that the implementations achieve better correct-classification rates than ROSETTA.
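    Johnson’s method is, at its core, a greedy set-cover heuristic over the discernibility structure of a decision table: for every pair of objects with different decisions, record which attributes tell them apart, then repeatedly pick the attribute that discerns the most remaining pairs. A minimal Python sketch of this idea (illustrative only; the project’s actual data structures and tie-breaking rules are not given in the abstract):

```python
from collections import Counter

def johnson_reduct(objects, decisions):
    """Greedy Johnson heuristic: approximate a minimal reduct by
    repeatedly choosing the attribute that discerns the largest
    number of object pairs with different decision values."""
    n_attrs = len(objects[0])
    # Discernibility sets: for each pair with different decisions,
    # the set of attributes on which the two objects differ.
    sets = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if decisions[i] != decisions[j]:
                diff = {a for a in range(n_attrs)
                        if objects[i][a] != objects[j][a]}
                if diff:
                    sets.append(diff)
    reduct = set()
    while sets:
        counts = Counter(a for s in sets for a in s)
        best = max(counts, key=counts.get)  # most frequently discerning attribute
        reduct.add(best)
        sets = [s for s in sets if best not in s]  # drop covered pairs
    return reduct
```

    Because ties may be broken either way, the heuristic returns *a* short reduct, not necessarily a unique minimal one.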

    Feature Selection via Chaotic Antlion Optimization

    Selecting a subset of relevant properties from a large set of features that describe a dataset is a challenging machine learning task. In biology, for instance, advances in the available technologies enable the generation of a very large number of biomarkers that describe the data. Choosing the most informative markers while performing high-accuracy classification over the data can be a daunting task, particularly if the data are high dimensional. An often-adopted approach is to formulate feature selection as a biobjective optimization problem, with the aim of maximizing the performance of the data analysis model (the quality of the fit to the training data) while minimizing the number of features used. This work was partially supported by the IPROCOM Marie Curie initial training network, funded through the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme FP7/2007-2013/ under REA grant agreement No. 316555, and by the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number PN-II-PT-PCCA-2011-3.2-0917. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
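    The biobjective formulation described above compares candidate feature subsets on two criteria at once: classification error and subset size. A minimal sketch of the Pareto-dominance relation this induces, and of filtering candidates down to the non-dominated front (illustrative only; the objective encoding as `(error, n_features)` pairs, both minimized, is an assumption):

```python
def dominates(a, b):
    """Pareto dominance for (classification_error, n_features) pairs:
    a dominates b if it is no worse on both objectives and strictly
    better on at least one (both objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep only the non-dominated candidate feature subsets."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

    For example, a candidate with the same error as another but more features is dominated and drops out of the front, which is exactly the trade-off the biobjective formulation is meant to expose.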

    New Fitness Functions in Binary Particle Swarm Optimisation for Feature Selection

    Feature selection is an important data preprocessing technique in classification problems. This paper proposes two new fitness functions in binary particle swarm optimisation (BPSO) for feature selection, aiming to choose a small number of features while achieving high classification accuracy. In the first fitness function, the relative importance of classification performance and the number of features is balanced by a linearly increasing weight during the evolutionary process. The second is a two-stage fitness function, in which classification performance is optimised in the first stage and the number of features is taken into account in the second stage. K-nearest neighbour (KNN) is employed to evaluate classification performance in experiments on ten datasets. The results show that, using either of the two proposed fitness functions in the training process, BPSO selects a smaller number of features and achieves higher classification accuracy on the test sets in almost all cases than when overall classification performance alone is used as the fitness function. Both also outperform two conventional feature selection methods in almost all cases. In most cases, BPSO with the second fitness function achieves better performance than with the first in terms of both classification accuracy and the number of features.
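    The two fitness functions could be sketched roughly as below. The exact weight schedule, its bounds, and the stage-2 feature penalty `alpha` are illustrative assumptions, since the paper’s precise formulas are not reproduced in this abstract; only the structure (a time-varying weighted sum, and a two-stage objective) follows the description.

```python
def fitness_weighted(error, n_sel, n_total, t, t_max,
                     w_start=0.8, w_end=1.0):
    """First fitness function (sketch): a weight that grows linearly
    over the run shifts the balance between classification error and
    the fraction of features selected. Schedule and bounds are
    assumptions, not taken from the paper."""
    w = w_start + (w_end - w_start) * t / t_max
    return w * error + (1 - w) * (n_sel / n_total)

def fitness_two_stage(error, n_sel, n_total, stage, alpha=0.01):
    """Second fitness function (sketch): stage 1 optimises
    classification error alone; stage 2 adds a small term for the
    number of features. The alpha weight is illustrative."""
    if stage == 1:
        return error
    return error + alpha * (n_sel / n_total)
```

    In both cases lower fitness is better: `error` is the KNN classification error rate and `n_sel / n_total` the fraction of features the particle has switched on.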

    A Survey on Evolutionary Computation Approaches to Feature Selection

    Feature selection is an important task in data mining and machine learning to reduce the dimensionality of the data and increase the performance of an algorithm, such as a classification algorithm. However, feature selection is a challenging task due mainly to the large search space. A variety of methods have been applied to solve feature selection problems, where evolutionary computation (EC) techniques have recently gained much attention and shown some success. However, there are no comprehensive guidelines on the strengths and weaknesses of alternative approaches. This leads to a disjointed and fragmented field with ultimately lost opportunities for improving performance and successful applications. This paper presents a comprehensive survey of the state-of-the-art work on EC for feature selection, which identifies the contributions of these different algorithms. In addition, current issues and challenges are also discussed to identify promising areas for future research.

    Integrated bio-search approaches with multi-objective algorithms for optimization and classification problem

    Optimal selection of features is difficult yet crucial, particularly for the task of classification. Traditional methods select features independently and can generate collections of irrelevant features, which degrades classification accuracy. The goal of this paper is to leverage bio-inspired search algorithms, together with a wrapper approach, to drive the multi-objective algorithms ENORA and NSGA-II towards an optimal set of features. The main steps are to combine ENORA and NSGA-II with suitable bio-inspired search algorithms in which multiple subset generation is implemented, and then to validate the resulting feature set through subset evaluation. Eight (8) comparison datasets of various sizes were deliberately selected for the evaluation. Results show that combining the multi-objective algorithms ENORA and NSGA-II with the selected bio-inspired search algorithms is promising for achieving a better optimal solution (i.e. the best features with higher classification accuracy) on the selected datasets. This finding implies that bio-inspired wrapper/filter search algorithms can boost the efficiency of ENORA and NSGA-II for the task of selecting and classifying features.