Advances in metaheuristics for gene selection and classification of microarray data
Gene selection aims at identifying a (small) subset of informative genes from the initial data in order to obtain high predictive accuracy for classification. Gene selection can be considered a combinatorial search problem and thus be conveniently handled with optimization methods. In this article, we summarize some recent developments in the use of metaheuristic-based methods within an embedded approach for gene selection. In particular, we highlight the importance and usefulness of integrating problem-specific knowledge into the search operators of such methods. To illustrate the point, we explain how the ranking coefficients of a linear classifier such as a support vector machine (SVM) can be profitably used to reinforce the search efficiency of local search and evolutionary search metaheuristics for gene selection and classification.
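The embedded idea above can be sketched in a few lines: a local search over gene subsets whose moves are biased by per-gene ranking scores. This is a minimal toy sketch, not the authors' method; the `scores` array stands in for the |w_i| coefficients of a trained linear SVM, and all names and parameters are illustrative.

```python
import numpy as np

def ranking_guided_local_search(scores, k, n_iters=50, seed=0):
    """Greedy local search over gene subsets, biased by ranking scores.

    scores : per-gene relevance scores (a stand-in for the |w_i| of a
             trained linear SVM, as described in the abstract).
    k      : target subset size.
    """
    rng = np.random.default_rng(seed)
    n = len(scores)
    # start from the k top-ranked genes
    selected = set(int(g) for g in np.argsort(scores)[::-1][:k])
    for _ in range(n_iters):
        # propose swapping the weakest selected gene for a random outsider
        weakest = min(selected, key=lambda g: scores[g])
        outsiders = [g for g in range(n) if g not in selected]
        newcomer = int(rng.choice(outsiders))
        if scores[newcomer] > scores[weakest]:
            selected.remove(weakest)
            selected.add(newcomer)
    return sorted(selected)

scores = np.array([0.9, 0.1, 0.5, 0.8, 0.05, 0.7])
print(ranking_guided_local_search(scores, k=3))  # → [0, 3, 5]
```

In a real embedded setting the scores would be re-estimated from the classifier after each accepted move rather than held fixed.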
OBKA-FS: an oppositional-based binary kidney-inspired search algorithm for feature selection
Feature selection is a key step when building an automatic classification system. Numerous evolutionary algorithms have been applied to remove irrelevant features and thereby make the classifier more accurate. The kidney-inspired search algorithm (KA) is a recent evolutionary algorithm whose original version performs more effectively than other evolutionary algorithms. However, KA was proposed for continuous search spaces, whereas feature subset selection and many optimization problems such as classification require a binary discrete space. Moreover, the movement operator of solutions is strongly influenced by the best solution found so far, denoted Sbest. This may be inadequate if Sbest is located near a local optimum, as it will direct the search process to a suboptimal solution. In this study, a three-fold improvement of the existing KA is proposed. First, a binary version of the kidney-inspired algorithm (BKA-FS) for feature subset selection is introduced to improve classification accuracy in multi-class classification problems. Second, the proposed BKA-FS is combined with an oppositional-based initialization method in order to start with good initial solutions; this improved algorithm is denoted OBKA-FS. Third, a novel movement strategy based on mutual information (MI) is proposed, which gives OBKA-FS the ability to work in a discrete binary environment. For evaluation, experiments were conducted on ten UCI machine learning benchmark instances. Results show that OBKA-FS outperforms existing state-of-the-art evolutionary algorithms for feature selection. In particular, OBKA-FS obtained better accuracy with the same or fewer features, and higher dependency with less redundancy. The results thus confirm the high performance of the improved kidney-inspired algorithm in solving optimization problems such as feature selection.
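The oppositional-based initialization step can be illustrated compactly: for a binary feature mask x, its opposite is the bitwise complement 1 - x, and the fitter of each pair seeds the population. This is a generic sketch of opposition-based initialization, not the OBKA-FS implementation; the `fitness` scorer and all names here are illustrative.

```python
import random

def opposition_init(pop_size, n_features, fitness, seed=0):
    """Oppositional-based initialization for binary feature masks.

    For each random mask x, also evaluate its opposite 1 - x and keep
    the fitter of the pair, so the search starts from better solutions.
    """
    random.seed(seed)
    population = []
    for _ in range(pop_size):
        x = [random.randint(0, 1) for _ in range(n_features)]
        x_opp = [1 - b for b in x]          # the oppositional candidate
        population.append(max(x, x_opp, key=fitness))
    return population

# toy fitness: prefer masks that select the first half of the features
fit = lambda m: sum(m[: len(m) // 2]) - sum(m[len(m) // 2:])
pop = opposition_init(pop_size=4, n_features=6, fitness=fit)
print(pop)
```

The rationale is that a random mask and its complement cover opposite corners of the binary hypercube, so at least one of them tends to lie closer to a good region.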
Comparative Analysis of Multi-Objective Feature Subset Selection using Meta-Heuristic Techniques
ABSTRACT This paper presents a comparison of an evolutionary-algorithm-based technique and a swarm-based technique for solving the multi-objective feature subset selection problem. Data used for classification often contains a large number of features, called attributes, some of which are not significant and need to be removed. In the process of classification, a feature affects the accuracy, cost, and learning time of the classifier, so before building a classifier there is a strong need to choose a subset of the attributes (features). This research treats feature subset selection as a multi-objective optimization problem. Two recent multi-objective techniques were used for the comparison of evolutionary and swarm-based algorithms: the Non-dominated Sorting Genetic Algorithm (NSGA-II) and Multi-objective Particle Swarm Optimization (MOPSO). MOPSO was also converted into Binary MOPSO (BMOPSO) to handle feature subset selection. The fitness value of a particular feature subset is measured using ID3, and the resulting testing accuracy is assigned as the fitness value. The techniques were tested on several datasets taken from the UCI Machine Learning Repository. The experiments demonstrate the feasibility of treating feature subset selection as a multi-objective problem, and NSGA-II proved to be a better option for solving it than BMOPSO.
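The multi-objective framing above rests on Pareto dominance between (accuracy, subset size) pairs, which is also the core of NSGA-II's non-dominated sorting. The sketch below shows only that dominance test and first-front extraction, with made-up example subsets; it is not the full NSGA-II or BMOPSO procedure.

```python
def dominates(a, b):
    """True if a dominates b for (maximize accuracy, minimize n_features)."""
    acc_a, nf_a = a
    acc_b, nf_b = b
    return (acc_a >= acc_b and nf_a <= nf_b) and (acc_a > acc_b or nf_a < nf_b)

def nondominated_front(points):
    """First Pareto front: points dominated by no other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# hypothetical feature subsets as (test accuracy, number of features)
subsets = [(0.90, 12), (0.88, 5), (0.90, 7), (0.85, 20), (0.92, 15)]
print(nondominated_front(subsets))  # → [(0.88, 5), (0.9, 7), (0.92, 15)]
```

The front contains the genuine trade-offs: (0.90, 12) is dominated by (0.90, 7), which reaches the same accuracy with fewer features.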
Evolutionary Computation in Action: Feature Selection for Deep Embedding Spaces of Gigapixel Pathology Images
One of the main obstacles to adopting digital pathology is the challenge of efficiently processing hyperdimensional digitized biopsy samples, called whole slide images (WSIs). Exploiting deep learning and introducing compact WSI representations are urgently needed to accelerate image analysis and facilitate the visualization and interpretability of pathology results in a postpandemic world. In this paper, we introduce a new evolutionary approach for WSI representation based on large-scale multi-objective optimization (LSMOP) of deep embeddings. We start with patch-based sampling to feed KimiaNet, a histopathology-specialized deep network, and to extract a multitude of feature vectors. Coarse multi-objective feature selection uses a reduced-search-space strategy guided by the classification accuracy and the number of features. In the second stage, the frequent features histogram (FFH), a novel WSI representation, is constructed from multiple runs of coarse LSMOP. Fine evolutionary feature selection is then applied to find a compact (short-length) feature vector based on the FFH, contributing to a more robust deep-learning approach to digital pathology supported by the stochastic power of evolutionary algorithms. We validate the proposed schemes using The Cancer Genome Atlas (TCGA) images in terms of WSI representation, classification accuracy, and feature quality. Furthermore, a novel decision space for multicriteria decision making in the LSMOP field is introduced. Finally, a patch-level visualization approach is proposed to increase the interpretability of deep features. The proposed evolutionary algorithm finds a very compact feature vector to represent a WSI (almost 14,000 times smaller than the original feature vectors) with 8% higher accuracy compared to the codes provided by state-of-the-art methods.
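The frequent features histogram (FFH) idea, counting how often each feature index survives repeated coarse selection runs and keeping the frequent ones, can be sketched as follows. The run sets, threshold, and function names here are hypothetical stand-ins, not the paper's actual data or implementation.

```python
from collections import Counter

def frequent_features_histogram(runs, threshold):
    """Count how often each feature index is selected across multiple
    coarse selection runs, then keep those selected at least
    `threshold` times."""
    counts = Counter(f for selected in runs for f in selected)
    histogram = dict(counts)
    frequent = sorted(f for f, c in counts.items() if c >= threshold)
    return histogram, frequent

# three hypothetical coarse-selection runs over a 6-feature embedding
runs = [{0, 2, 5}, {0, 2, 3}, {2, 4, 5}]
hist, keep = frequent_features_histogram(runs, threshold=2)
print(keep)  # → [0, 2, 5]
```

Aggregating over runs filters out features that one stochastic run picked by chance, leaving a stable candidate pool for the fine selection stage.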
Evolutionary Multiobjective Optimization Driven by Generative Adversarial Networks (GANs)
Recently, a growing number of works have proposed driving evolutionary algorithms with machine learning models. The performance of such model-based evolutionary algorithms is usually highly dependent on the training quality of the adopted models. Since model training typically requires a certain amount of data (i.e., the candidate solutions generated by the algorithm), performance deteriorates rapidly as the problem scale increases, due to the curse of dimensionality. To address this issue, we propose a multi-objective evolutionary algorithm driven by generative adversarial networks (GANs). At each generation of the proposed algorithm, the parent solutions are first classified into real and fake samples to train the GANs; the offspring solutions are then sampled by the trained GANs. Thanks to the powerful generative ability of the GANs, the proposed algorithm is capable of generating promising offspring solutions in high-dimensional decision spaces with limited training data. The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables, and the experimental results demonstrate the effectiveness of the proposed algorithm.
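The per-generation loop described above, label parents as real/fake training samples, then sample offspring from the learned generator, can be sketched with a deliberately simplified stand-in: an independent Gaussian fitted to the "real" parents replaces the trained GAN generator, and the scalar fitness split replaces the paper's actual classification scheme. Every name and parameter below is illustrative.

```python
import random
import statistics

def label_parents(parents, fitness):
    """Split the parent population into 'real' (better half) and 'fake'
    (worse half) samples, the generative model's training data."""
    ranked = sorted(parents, key=fitness, reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

def sample_offspring(real, n_offspring, seed=0):
    """Stand-in for the trained generator: fit an independent Gaussian
    per decision variable to the 'real' samples and draw new solutions."""
    random.seed(seed)
    dims = list(zip(*real))                       # per-variable columns
    mus = [statistics.mean(d) for d in dims]
    sigmas = [statistics.pstdev(d) + 1e-6 for d in dims]
    return [[random.gauss(m, s) for m, s in zip(mus, sigmas)]
            for _ in range(n_offspring)]

# toy 3-variable problem: maximize -sum(x_i^2)
parents = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
real, fake = label_parents(parents, fitness=lambda x: -sum(v * v for v in x))
offspring = sample_offspring(real, n_offspring=8)
print(len(offspring), len(offspring[0]))  # → 8 3
```

The point of the structure, generate from a model of the good solutions rather than by recombining pairs, is what lets the approach scale to high-dimensional decision spaces with few evaluated solutions.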