    Advances in metaheuristics for gene selection and classification of microarray data

    Gene selection aims to identify a small subset of informative genes from the initial data in order to obtain high predictive accuracy for classification. Gene selection can be considered a combinatorial search problem and can thus be conveniently handled with optimization methods. In this article, we summarize some recent developments in using metaheuristic-based methods within an embedded approach for gene selection. In particular, we emphasize the importance and usefulness of integrating problem-specific knowledge into the search operators of such a method. To illustrate the point, we explain how the ranking coefficients of a linear classifier such as a support vector machine (SVM) can be profitably used to reinforce the search efficiency of Local Search and Evolutionary Search metaheuristic algorithms for gene selection and classification.
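One way to picture the use of classifier coefficients inside a search operator is the minimal sketch below. It assumes a weight vector `weights` has already been obtained from a trained linear classifier (for example a linear SVM); the function names `rank_genes_by_weight` and `drop_least_informative` are hypothetical and are not taken from the article.

```python
def rank_genes_by_weight(weights):
    """Return gene indices ordered from most to least informative.

    Assumption: a larger absolute coefficient in the linear classifier
    means the gene contributes more to the decision function.
    """
    return sorted(range(len(weights)), key=lambda i: -abs(weights[i]))


def drop_least_informative(subset, weights):
    """Local-search move: remove the gene in `subset` with the smallest |weight|."""
    if not subset:
        return set()
    worst = min(subset, key=lambda i: abs(weights[i]))
    return set(subset) - {worst}


# Hypothetical coefficients for four genes.
weights = [0.1, -2.0, 0.5, 0.0]
ranking = rank_genes_by_weight(weights)
smaller_subset = drop_least_informative({0, 1, 2}, weights)
```

A move guided this way replaces a blind random removal with one informed by the classifier, which is the kind of problem-specific knowledge the abstract refers to.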

    OBKA-FS: an oppositional-based binary kidney-inspired search algorithm for feature selection

    Feature selection is a key step when building an automatic classification system. Numerous evolutionary algorithms have been applied to remove irrelevant features so that the classifier performs more accurately. The kidney-inspired search algorithm (KA) is a recent evolutionary algorithm. The original version of KA performed more effectively than other evolutionary algorithms. However, KA was proposed for continuous search spaces, whereas feature subset selection and many optimization problems such as classification require a binary discrete space. Moreover, the movement operator of solutions is notably affected by the best solution found so far, denoted as Sbest. This may be inadequate if Sbest is located near a local optimum, as it will direct the search process to a suboptimal solution. In this study, a three-fold improvement to the existing KA is proposed. First, a binary version of the kidney-inspired algorithm (BKA-FS) for feature subset selection is introduced to improve classification accuracy in multi-class classification problems. Second, the proposed BKA-FS is combined with an oppositional-based initialization method in order to start with good initial solutions; this improved algorithm is denoted as OBKA-FS. Third, a novel movement strategy based on the calculation of mutual information (MI) is proposed, which gives OBKA-FS the ability to work in a discrete binary environment. For evaluation, an experiment was conducted using ten UCI machine learning benchmark instances. Results show that OBKA-FS outperforms the existing state-of-the-art evolutionary algorithms for feature selection. In particular, OBKA-FS obtained better accuracy with the same or fewer features, and higher dependency with less redundancy. Thus, the results confirm the high performance of the improved kidney-inspired algorithm in solving optimization problems such as feature selection.
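The oppositional-based initialization step can be illustrated with a generic opposition-based learning sketch for binary feature masks. This is an assumption about the general idea, not the paper's exact procedure: the opposite of a binary solution is taken to be its bitwise complement, and the `fitness` callable and the function names are hypothetical.

```python
import random


def opposite(solution):
    """Opposite of a binary feature mask: flip every bit."""
    return [1 - bit for bit in solution]


def opposition_based_init(pop_size, n_features, fitness, rng=None):
    """Build an initial population from random masks and their opposites.

    Each random mask is paired with its complement, and the `pop_size`
    fittest candidates among all 2 * pop_size are kept (higher is better).
    """
    rng = rng or random.Random()
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    candidates = population + [opposite(s) for s in population]
    candidates.sort(key=fitness, reverse=True)
    return candidates[:pop_size]


# Toy fitness for illustration only: count of selected features.
initial = opposition_based_init(4, 6, fitness=sum, rng=random.Random(0))
```

In a real feature-selection setting the toy fitness would be replaced by a classifier-based score, at the cost of evaluating twice as many candidates during initialization.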

    Comparative Analysis of Multi-Objective Feature Subset Selection using Meta-Heuristic Techniques

    This paper presents a comparison of an evolutionary-algorithm-based technique and a swarm-based technique for solving the multi-objective feature subset selection problem. The data used for classification contain a large number of features, also called attributes. Some of these attributes are not significant and need to be removed. In the process of classification, each feature affects the accuracy, cost, and learning time of the classifier. Therefore, before building a classifier there is a strong need to choose a subset of the attributes (features). This research treats feature subset selection as a multi-objective optimization problem. Recent multi-objective techniques have been used for the comparison of evolutionary and swarm-based algorithms. These techniques are the Non-dominated Sorting Genetic Algorithm (NSGA-II) and Multiobjective Particle Swarm Optimization (MOPSO). MOPSO has also been converted into Binary MOPSO (BMOPSO) in order to deal with feature subset selection. The fitness value of a particular feature subset is measured using ID3: the resulting testing accuracy is used as the fitness value. The techniques are tested on several datasets taken from the UCI Machine Learning Repository. The experiments demonstrate the feasibility of treating feature subset selection as a multi-objective problem. NSGA-II proved to be a better option than BMOPSO for solving the feature subset selection problem.
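Both NSGA-II and MOPSO rely on Pareto dominance to compare candidate feature subsets. The sketch below shows that comparison for two objectives, assuming (this pairing is an assumption, as the abstract does not list the objectives explicitly) that accuracy is maximized and the number of selected features is minimized. The function names are hypothetical.

```python
def dominates(a, b):
    """True if candidate `a` Pareto-dominates candidate `b`.

    Each candidate is a tuple (accuracy, n_features); higher accuracy
    and fewer features are both considered better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(points):
    """Return the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (accuracy, n_features) pairs for four feature subsets.
candidates = [(0.90, 10), (0.85, 5), (0.80, 8), (0.90, 12)]
front = non_dominated(candidates)
```

The result is a set of trade-off solutions rather than a single winner, which is what distinguishes the multi-objective formulation from a single fitness score.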

    Evolutionary Computation in Action: Feature Selection for Deep Embedding Spaces of Gigapixel Pathology Images

    One of the main obstacles to adopting digital pathology is the challenge of efficiently processing hyperdimensional digitized biopsy samples, called whole slide images (WSIs). Exploiting deep learning and introducing compact WSI representations are urgently needed to accelerate image analysis and facilitate the visualization and interpretability of pathology results in a postpandemic world. In this paper, we introduce a new evolutionary approach for WSI representation based on large-scale multi-objective optimization (LSMOP) of deep embeddings. We start with patch-based sampling to feed KimiaNet, a histopathology-specialized deep network, and to extract a multitude of feature vectors. Coarse multi-objective feature selection uses the reduced search space strategy guided by the classification accuracy and the number of features. In the second stage, the frequent features histogram (FFH), a novel WSI representation, is constructed from multiple runs of coarse LSMOP. Fine evolutionary feature selection is then applied to find a compact (short-length) feature vector based on the FFH, contributing to a more robust deep-learning approach to digital pathology supported by the stochastic power of evolutionary algorithms. We validate the proposed schemes using The Cancer Genome Atlas (TCGA) images in terms of WSI representation, classification accuracy, and feature quality. Furthermore, a novel decision space for multicriteria decision making in the LSMOP field is introduced. Finally, a patch-level visualization approach is proposed to increase the interpretability of deep features. The proposed evolutionary algorithm finds a very compact feature vector to represent a WSI (almost 14,000 times smaller than the original feature vectors) with 8% higher accuracy compared to the codes provided by the state-of-the-art methods.
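The abstract does not define the frequent features histogram in detail. A plausible reading, sketched here purely as an assumption, is a count of how often each feature index is selected across repeated runs of the coarse selection stage. The function name and the toy run outputs are hypothetical.

```python
from collections import Counter


def frequent_features_histogram(runs):
    """Count how many runs selected each feature index.

    `runs` is a list of iterables, one per run, each holding the indices
    of the features that run selected. Duplicates inside a single run
    are counted once.
    """
    histogram = Counter()
    for selected in runs:
        histogram.update(set(selected))
    return histogram


# Hypothetical outputs of three coarse selection runs.
runs = [[0, 2, 5], [2, 5], [5, 7]]
ffh = frequent_features_histogram(runs)
most_frequent = [idx for idx, _ in ffh.most_common(2)]
```

Features that recur across independent stochastic runs are natural candidates for the later fine selection stage.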

    Evolutionary Multiobjective Optimization Driven by Generative Adversarial Networks (GANs)

    Recently, a growing number of works have proposed driving evolutionary algorithms with machine learning models. Usually, the performance of such model-based evolutionary algorithms is highly dependent on the training quality of the adopted models. Since model training usually requires a certain amount of data (i.e., the candidate solutions generated by the algorithms), the performance deteriorates rapidly as the problem scale increases, due to the curse of dimensionality. To address this issue, we propose a multi-objective evolutionary algorithm driven by generative adversarial networks (GANs). At each generation of the proposed algorithm, the parent solutions are first classified into real and fake samples to train the GANs; then the offspring solutions are sampled by the trained GANs. Thanks to the powerful generative ability of the GANs, our proposed algorithm is capable of generating promising offspring solutions in a high-dimensional decision space with limited training data. The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables. Experimental results on these test problems demonstrate the effectiveness of the proposed algorithm.