
    Improved Reptile Search Optimization Algorithm using Chaotic Map and Simulated Annealing for Feature Selection in the Medical Field

    The increased volume of medical datasets has produced high-dimensional feature spaces that negatively affect machine learning (ML) classifiers. In ML, the feature selection process is fundamental for selecting the most relevant features and discarding redundant and irrelevant ones. Optimization algorithms have demonstrated their capability to solve feature selection problems. The Reptile Search Algorithm (RSA) is a recent nature-inspired optimization algorithm that simulates the encircling and hunting behavior of crocodiles. Its distinctive search mechanism obtains promising results compared with other optimization algorithms. However, when applied to high-dimensional feature selection problems, RSA suffers from limited population diversity and entrapment in local optima. An improved metaheuristic optimizer, the Improved Reptile Search Algorithm (IRSA), is proposed to overcome these limitations and adapt RSA to the feature selection problem. Two main improvements add value to the standard RSA: the first applies a chaotic map in the initialization phase of RSA to enhance its exploration of the search space, and the second combines the Simulated Annealing (SA) algorithm with the exploitation search to avoid the local optima problem. The performance of IRSA was evaluated on 20 medical benchmark datasets from the UCI machine learning repository and compared with the standard RSA and state-of-the-art optimization algorithms, including Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), the Grasshopper Optimization Algorithm (GOA), and Slime Mould Optimization (SMO). The evaluation metrics include the number of selected features, classification accuracy, fitness value, the Wilcoxon statistical test (p-value), and the convergence curve. Based on the results obtained, IRSA confirmed its superiority over the original RSA and the other optimization algorithms on the majority of the medical datasets.
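    The two modifications lend themselves to a short sketch. The following illustration assumes a logistic map for the chaotic initialization and a standard Boltzmann acceptance rule for the SA step; the function names, threshold, and parameter values are illustrative, not taken from the paper.

```python
# Sketch of the two IRSA ingredients named above (assumed forms):
# logistic-map initialization and an SA acceptance test.
import numpy as np

def chaotic_init(pop_size, dim, mu=4.0, seed=0.7):
    """Initialize a binary population from a logistic map x <- mu*x*(1-x)."""
    pop = np.empty((pop_size, dim))
    x = seed
    for i in range(pop_size):
        for j in range(dim):
            x = mu * x * (1.0 - x)        # chaotic sequence in (0, 1)
            pop[i, j] = x
    return (pop > 0.5).astype(int)        # threshold to feature masks

def sa_accept(curr_fit, cand_fit, temperature, rng):
    """Accept a worse candidate with Boltzmann probability (minimization)."""
    if cand_fit <= curr_fit:              # improvements are always kept
        return True
    return rng.random() < np.exp((curr_fit - cand_fit) / temperature)
```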

    An Improved Binary Grey-Wolf Optimizer with Simulated Annealing for Feature Selection

    This paper proposes improvements to the binary grey-wolf optimizer (BGWO) to solve the feature selection (FS) problem caused by high data dimensionality and irrelevant, noisy, and redundant data, allowing machine learning algorithms to attain better classification/clustering accuracy in less training time. We propose three variants of BGWO in addition to the standard variant, applying different transfer functions to tackle the FS problem. Because BGWO generates continuous values and FS needs discrete values, a number of V-shaped, S-shaped, and U-shaped transfer functions were investigated for incorporation with BGWO to convert its continuous values to binary. We observe that the performance of BGWO depends on the choice of transfer function. In the first variant, abbreviated IBGWO, we reduce the local-minima problem by adding an exploration capability that, with a certain probability, updates the position of a grey wolf randomly within the search space. Next, a novel mutation strategy is proposed that selects a number of the worst grey wolves in the population and updates each of them either toward the best solution or randomly within the search space, with a certain probability determining which of the two updates is applied. The number of worst wolves selected by this strategy increases linearly with the iterations. This strategy is combined with IBGWO to produce the second variant, abbreviated LIBGWO. In the last variant, simulated annealing (SA) was integrated with LIBGWO to search around the best-so-far solution at the end of each iteration in order to identify better solutions. The performance of the proposed variants was validated on 32 datasets taken from the UCI repository and compared with six wrapper feature selection methods. The experiments show the superiority of the proposed variants in producing better classification accuracy than the other selected wrapper feature selection algorithms.
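    To illustrate the binarization step investigated in the paper, the sketch below shows one common member of each transfer-function family together with the probabilistic bit assignment it drives; the paper's exact function forms may differ, and V-shaped TFs are often paired with a bit-complement rule rather than the simplification used here.

```python
# Assumed representatives of the S-, V-, and U-shaped transfer functions.
import numpy as np

def s_shaped(x):
    return 1.0 / (1.0 + np.exp(-x))       # classic sigmoid

def v_shaped(x):
    return np.abs(np.tanh(x))             # a common V-shaped choice

def u_shaped(x, alpha=1.0, beta=2.0):
    return alpha * np.abs(x) ** beta      # one U-shaped form from the literature

def binarize(position, tf, rng):
    """Set each bit with probability given by the transfer function."""
    prob = np.clip(tf(position), 0.0, 1.0)
    return (rng.random(position.shape) < prob).astype(int)
```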

    A Novel Chaos Quasi-Oppositional based Flamingo Search Algorithm with Simulated Annealing for Feature Selection

    Feature selection is currently one of the most vital tasks in machine learning: reducing the feature set helps to increase the accuracy of the classifier. Because datasets contain large amounts of information, selecting the necessary features is a demanding process. To solve this problem, a novel Chaos Quasi-Oppositional-based Flamingo Search Algorithm with Simulated Annealing (CQOFSA-SA) is proposed for feature selection; it selects the optimal feature set from a dataset and thus shrinks its dimensionality. The FSA approach chooses the optimal feature subset from the dataset, and in each iteration the best solution of FSA is refined by Simulated Annealing (SA). The Chaos Quasi-Oppositional-based learning (CQOBL) included in the initialization of FSA improves the convergence rate and increases the searching capability of FSA in choosing the optimal feature set. The experimental outcomes show that the proposed CQOFSA-SA outperforms other feature selection approaches in terms of accuracy, size of the reduced feature set, convergence speed, and fitness value.
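    The quasi-oppositional step can be sketched as follows, assuming the standard definition in which the quasi-opposite point is drawn uniformly between the search-space centre and the opposite point; this is an illustration, not the paper's code.

```python
# Assumed form of quasi-oppositional learning on a box [lb, ub].
import numpy as np

def quasi_opposite(x, lb, ub, rng):
    centre = (lb + ub) / 2.0
    opposite = lb + ub - x                 # opposition-based point
    low = np.minimum(centre, opposite)
    high = np.maximum(centre, opposite)
    return rng.uniform(low, high)          # quasi-opposite lies between them
```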

    Binary Multi-Verse Optimization (BMVO) Approaches for Feature Selection

    Multi-Verse Optimization (MVO) is one of the newest metaheuristic optimization algorithms; it imitates the multiverse theory in physics and models the interaction among universes. In problem domains such as feature selection, solutions are constrained to the binary values 0 and 1. Accordingly, this paper proposes binary versions of the MVO algorithm with two prime aims: first, to remove redundant and irrelevant features from the dataset, and second, to achieve better classification accuracy. The proposed binary versions use transformation functions to map the continuous MVO algorithm to its binary counterparts. In the experiments, 21 diverse datasets were used to compare the Binary MVO (BMVO) with binary versions of existing metaheuristic algorithms. The proposed BMVO approaches outperformed them in terms of the number of selected features and the accuracy of the classification process.
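    Alongside the transformation functions, binary metaheuristics such as BMVO typically score candidate feature masks with a wrapper fitness that trades classification error against subset size. The sketch below shows one such fitness; the KNN classifier, 5-fold cross-validation, and the alpha weight are assumptions rather than details from the paper.

```python
# A typical wrapper fitness for binary feature selection (assumed setup).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(mask, X, y, alpha=0.99):
    """Lower is better: weighted error plus a subset-size penalty."""
    if mask.sum() == 0:                    # reject empty feature subsets
        return 1.0
    acc = cross_val_score(KNeighborsClassifier(),
                          X[:, mask.astype(bool)], y, cv=5).mean()
    return alpha * (1.0 - acc) + (1.0 - alpha) * mask.sum() / mask.size

# Example: evaluate one random mask on a built-in dataset.
X, y = load_breast_cancer(return_X_y=True)
mask = np.random.default_rng(0).integers(0, 2, X.shape[1])
print(fitness(mask, X, y))
```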

    Improved feature selection using a hybrid side-blotched lizard algorithm and genetic algorithm approach

    Feature selection entails choosing, from a wide collection of original features, the significant ones that are essential for predicting test data using a classifier. It is commonly used in applications such as bioinformatics, data mining, and the analysis of written texts, where the dataset contains tens or hundreds of thousands of features, making such a large feature set difficult to analyze. Removing irrelevant features improves predictor performance, making it more accurate and cost-effective. In this research, a novel hybrid technique is presented for feature selection that aims to enhance classification accuracy. A hybrid binary version of the side-blotched lizard algorithm (SBLA) and the genetic algorithm (GA), named SBLAGA, is proposed to combine the strengths of both algorithms. A sigmoid function adapts the continuous variable values into binary ones, and the proposed algorithm is evaluated on twenty-three standard benchmark datasets. Average classification accuracy, average number of selected features, and average fitness value were the evaluation criteria. According to the experimental results, SBLAGA demonstrated superior performance compared to SBLA and GA with regard to these criteria. We further compare SBLAGA with four wrapper feature selection methods that are widely used in the literature and find it to be more efficient.
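    The hybrid's two named ingredients, sigmoid binarization and GA operators, might look like the sketch below; the single-point crossover and the mutation rate are illustrative assumptions.

```python
# Assumed forms of the SBLAGA building blocks described above.
import numpy as np

def sigmoid_binarize(position, rng):
    prob = 1.0 / (1.0 + np.exp(-position))  # map reals to (0, 1)
    return (rng.random(position.shape) < prob).astype(int)

def crossover_mutate(parent_a, parent_b, rng, pm=0.02):
    point = rng.integers(1, parent_a.size)  # single-point crossover
    child = np.concatenate([parent_a[:point], parent_b[point:]])
    flips = rng.random(child.size) < pm     # bit-flip mutation
    child[flips] ^= 1
    return child
```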

    Unsupervised Text Feature Selection Using Memetic Dichotomous Differential Evolution

    Feature Selection (FS) methods have been studied extensively in the literature, and they are a crucial component of machine learning techniques. However, unsupervised text feature selection has not been well studied for document clustering problems. Feature selection can be modelled as an optimization problem because of the large number of possible valid solutions. In this paper, a memetic method that combines Differential Evolution (DE) with Simulated Annealing (SA) for unsupervised FS is proposed. Because only two values are needed to indicate the presence or absence of a feature, a binary version of differential evolution is used; a dichotomous DE serves as this binary version, and the proposed method is named Dichotomous Differential Evolution Simulated Annealing (DDESA). This method replaces the standard DE mutation with dichotomous mutation, which is more effective for binary problems. The Mean Absolute Distance (MAD) filter was used as the internal evaluation measure for feature subsets. The proposed method was compared with other state-of-the-art methods, including the standard DE combined with SA (named DESA in this paper), using five benchmark datasets. The micro-F and macro-F scores, the Average Distance of Document to Cluster (ADDC), and the Reduction Rate (RR) were used as evaluation measures. Test results showed that the proposed DDESA outperformed the other tested methods in unsupervised text feature selection.
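    One common formulation of dichotomous mutation, which the sketch below assumes, copies the base vector's bit wherever two randomly chosen vectors agree and samples a random bit wherever they differ; the paper's exact operator may vary.

```python
# Assumed form of dichotomous mutation for binary DE.
import numpy as np

def dichotomous_mutation(base, r2, r3, rng):
    """base, r2, r3: 0/1 vectors of equal length drawn from the population."""
    agree = (r2 == r3)                      # positions where donors coincide
    random_bits = rng.integers(0, 2, size=base.size)
    return np.where(agree, base, random_bits)
```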

    A hybrid swarm intelligence feature selection approach based on a time-varying transition parameter

    Feature selection aims to reduce the dimensionality of a dataset by removing superfluous attributes. This paper proposes a hybrid approach to the feature selection problem that combines particle swarm optimization (PSO), grey wolf optimization (GWO), and a tournament selection (TS) mechanism. Particle swarm enhances diversification at the beginning of the search, grey wolf enhances intensification at the end, and tournament selection maintains diversification both at the beginning and at the end of the search to avoid local optima. A time-varying transition parameter and a random variable are used to select the particle swarm, grey wolf, or tournament selection technique during the search process. This paper proposes different variants of this approach based on S-shaped and V-shaped transfer functions (TFs) to convert continuous solutions to binary. These variants are named hybrid tournament grey wolf particle swarm (HTGWPS), followed by the letter S or V to indicate the TF type and by the TF's number. The variants were evaluated on nine high-dimensional datasets. The results revealed that HTGWPS-V1 outperformed the other V-shaped variants, PSO, and GWO on 78% of the datasets in terms of the maximum classification accuracy obtained with a minimal feature subset. HTGWPS-V1 also outperformed six well-known metaheuristics on 67% of the datasets.
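    The time-varying transition rule can be sketched as follows, assuming a linearly decaying parameter and a fixed probability that keeps tournament selection active throughout the run; the schedule and thresholds are illustrative, not the paper's values.

```python
# Assumed operator-selection rule for the hybrid search.
import numpy as np

def choose_operator(iteration, max_iter, rng, ts_prob=0.1):
    if rng.random() < ts_prob:               # TS stays active at all stages
        return "tournament_selection"
    transition = 1.0 - iteration / max_iter  # decays linearly from 1 to 0
    return "pso" if rng.random() < transition else "gwo"  # PSO early, GWO late
```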

    A Scalable Feature Selection and Opinion Miner Using Whale Optimization Algorithm

    Due to the fast-growing volume of text documents and reviews in recent years, current analysis techniques are not adequate to meet users' needs. Feature selection techniques not only support a better understanding of the data but also lead to higher speed and accuracy. In this article, the Whale Optimization Algorithm is applied to the search for the optimal subset of features. The F-measure, a metric based on precision and recall, is widely used to compare classifiers. For the evaluation and comparison of the experimental results, the PART, random tree, random forest, and RBF network classification algorithms were applied to different numbers of features. Experimental results show that random forest achieves the best accuracy with 500 features.
    Keywords: Feature selection, Whale Optimization algorithm, Selecting optimal, Classification algorithm
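    A simplified sketch of WOA's two core position updates (shrinking encirclement and the logarithmic spiral) follows; it omits the exploration branch that moves toward a random whale, as well as the binarization and fitness evaluation a feature-selection wrapper would add.

```python
# Simplified WOA position update; `a` decreases from 2 to 0 over iterations.
import numpy as np

def woa_step(position, best, a, rng, b=1.0):
    if rng.random() < 0.5:                   # shrinking encircling move
        A = 2.0 * a * rng.random(position.shape) - a
        C = 2.0 * rng.random(position.shape)
        return best - A * np.abs(C * best - position)
    l = rng.uniform(-1.0, 1.0)               # spiral update around the best
    dist = np.abs(best - position)
    return dist * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best
```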