
    An Improved Binary Grey-Wolf Optimizer with Simulated Annealing for Feature Selection

    This paper proposes improvements to the binary grey-wolf optimizer (BGWO) to solve the feature selection (FS) problem associated with high data dimensionality and irrelevant, noisy, and redundant data, thereby allowing machine learning algorithms to attain better classification/clustering accuracy in less training time. We propose three variants of BGWO in addition to the standard variant, applying different transfer functions to tackle the FS problem. Because GWO generates continuous values while FS needs discrete values, a number of V-shaped, S-shaped, and U-shaped transfer functions were investigated for incorporation with BGWO to convert its continuous values to binary, and we observe that the performance of BGWO is affected by the choice of transfer function. In the first variant, abbreviated IBGWO, we reduce the local-minima problem by integrating an exploration capability that, with a certain probability, updates the position of a grey wolf randomly within the search space. We then propose a novel mutation strategy that selects a number of the worst grey wolves in the population and, based on a certain probability, updates each of them either toward the best solution or randomly within the search space; the number of worst wolves selected by this strategy increases linearly with the iteration count. Combining this strategy with IBGWO produces the second variant, abbreviated LIBGWO. In the last variant, simulated annealing (SA) is integrated with LIBGWO to search around the best-so-far solution at the end of each iteration in order to identify better solutions. The performance of the proposed variants was validated on 32 datasets taken from the UCI repository and compared with six wrapper feature selection methods. The experiments show the superiority of the proposed variants in producing better classification accuracy than the other selected wrapper feature selection algorithms.
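
    Since the abstract does not specify the exact transfer functions used, the following Python sketch only illustrates the general binarization step it describes: an S-shaped (sigmoid) or V-shaped (|tanh|) transfer function maps a wolf's continuous position to selection probabilities, which are then thresholded into a binary feature mask. The function forms and the thresholding rule are assumptions for illustration.

    import numpy as np

    def s_shaped(x):
        # S-shaped (sigmoid) transfer function: maps reals to (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    def v_shaped(x):
        # V-shaped transfer function: |tanh| of the continuous position.
        return np.abs(np.tanh(x))

    def binarize(position, transfer=s_shaped, rng=None):
        # Map a continuous grey-wolf position to a binary feature mask.
        # A bit is set to 1 when a uniform draw falls below the transfer
        # value (V-shaped TFs are often used to flip the current bit
        # instead; simple thresholding keeps this sketch short).
        rng = np.random.default_rng() if rng is None else rng
        probs = transfer(np.asarray(position, dtype=float))
        return (rng.random(probs.shape) < probs).astype(int)

    # Example: a 5-dimensional continuous position -> binary feature subset
    mask = binarize(np.array([-2.0, 0.1, 1.5, -0.3, 2.2]))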

    Feature Selection Using Hybrid Binary Grey Wolf Optimizer for Arabic Text Classification

    Feature selection in Arabic text is a challenging task due to the complex and rich nature of the Arabic language. Feature selection requires solution quality, stability, convergence speed, and the ability to find the global optimum. This study proposes a feature selection method using a Hybrid Binary Grey Wolf Optimizer (HBGWO) for Arabic text classification. The HBGWO method combines the exploration capabilities of BGWO with the exploitation capabilities of PSO, which searches around the best solutions, and also incorporates SCA's capability for finding global solutions. The dataset consists of Arabic text from islambook.com, comprising five Hadith books, from which five classes were selected: Tauhid, Prayer, Zakat, Fasting, and Hajj. The results show that the BGWO-PSO-SCA feature selection method, with the proposed fitness function and SVM as the classifier, performs better on Arabic text classification problems. BGWO-PSO with the fitness function and SVM classification (C=1.0) gives an accuracy of 76.37%, higher than classification without feature selection, while the BGWO-PSO-SCA feature selection method achieves an accuracy of 88.08%, higher than BGWO-PSO and other feature selection methods.
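
    The abstract reports wrapper-style evaluation with an SVM (C=1.0), so a minimal sketch of such a fitness function is given below, assuming scikit-learn, 5-fold cross-validation, and a common accuracy-versus-subset-size weighting; the BGWO-PSO-SCA position updates themselves and the Arabic text preprocessing are omitted.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def fitness(mask, X, y, alpha=0.99):
        # Higher is better: cross-validated SVM accuracy on the selected
        # features, lightly penalised by the fraction of features kept.
        selected = np.flatnonzero(mask)
        if selected.size == 0:
            return 0.0
        acc = cross_val_score(SVC(C=1.0), X[:, selected], y, cv=5).mean()
        return alpha * acc + (1 - alpha) * (1 - selected.size / X.shape[1])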

    A hybrid swarm intelligence feature selection approach based on time-varying transition parameter

    Feature selection aims to reduce the dimensionality of a dataset by removing superfluous attributes. This paper proposes a hybrid approach to the feature selection problem that combines particle swarm optimization (PSO), grey wolf optimization (GWO), and a tournament selection (TS) mechanism. Particle swarm enhances diversification at the beginning of the search, grey wolf enhances intensification at the end of the search, while tournament selection maintains diversification at both the beginning and the end of the search process to achieve local-optima avoidance. A time-varying transition parameter and a random variable are used to select either the particle swarm, grey wolf, or tournament selection technique during the search process. This paper proposes different variants of this approach based on S-shaped and V-shaped transfer functions (TFs) to convert continuous solutions to binary ones. These variants are named hybrid tournament grey wolf particle swarm (HTGWPS), followed by the letter S or V to indicate the TF type and by the TF's number. The variants were evaluated on nine high-dimensional datasets. The results reveal that HTGWPS-V1 outperformed the other V variants, PSO, and GWO on 78% of the datasets in terms of the maximum classification accuracy obtained with a minimal feature subset. HTGWPS-V1 also outperformed six well-known metaheuristics on 67% of the datasets.
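
    The abstract only states that a time-varying transition parameter and a random variable decide which operator updates a solution; the exact schedule is not given. The sketch below assumes a linear schedule that favours the PSO update early (diversification) and the GWO update late (intensification), with tournament selection available throughout at a fixed probability.

    import random

    def choose_update(t, max_iter, ts_prob=0.1):
        # Pick which operator updates a solution at iteration t.
        if random.random() < ts_prob:
            return "tournament_selection"
        transition = t / max_iter  # grows linearly from 0 to 1
        return "gwo" if random.random() < transition else "pso"

    # Near the start the PSO update dominates; near the end GWO dominates.
    print([choose_update(5, 100) for _ in range(5)])
    print([choose_update(95, 100) for _ in range(5)])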

    Hybrid feature selection based on principal component analysis and grey wolf optimizer algorithm for Arabic news article classification

    The rapid growth of electronic documents has resulted from the expansion and development of internet technologies. Text-document classification is a key task in natural language processing that converts unstructured data into a structured form from which knowledge can then be extracted. This conversion generates high-dimensional data that require further analysis using data mining techniques such as feature extraction, feature selection, and classification to derive meaningful insights. Feature selection is a technique for reducing dimensionality in order to prune the feature space, thereby lowering the computational cost and enhancing classification accuracy. This work presents a hybrid filter-wrapper method that uses Principal Component Analysis (PCA) as a filter approach to select an appropriate and informative subset of features and the Grey Wolf Optimizer (GWO) as a wrapper approach (PCA-GWO) to select further informative features. Logistic Regression (LR) is used as an evaluator to test the classification accuracy of the candidate feature subsets produced by GWO. Three Arabic datasets, namely Alkhaleej, Akhbarona, and Arabiya, are used to assess the efficiency of the proposed method. The experimental results confirm that the proposed PCA-GWO method outperforms the baseline classifiers with/without feature selection and other feature selection approaches in terms of classification accuracy.
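
    A minimal sketch of the filter-wrapper structure described above, assuming scikit-learn: PCA first prunes the high-dimensional text features, then a wrapper score evaluates a GWO-proposed subset of the retained components with Logistic Regression. The GWO search loop, the component count, and the cross-validation setup are assumptions.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def pca_filter(X, n_components=100):
        # Filter stage: project the text features onto the leading
        # principal components to prune the feature space.
        return PCA(n_components=n_components).fit_transform(X)

    def wrapper_score(mask, X_reduced, y):
        # Wrapper stage: score one candidate subset of PCA components
        # with Logistic Regression as the evaluator.
        selected = np.flatnonzero(mask)
        if selected.size == 0:
            return 0.0
        clf = LogisticRegression(max_iter=1000)
        return cross_val_score(clf, X_reduced[:, selected], y, cv=5).mean()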

    Applications of Nature-Inspired Algorithms for Dimension Reduction: Enabling Efficient Data Analytics

    In [1], we explored the theoretical aspects of feature selection and evolutionary algorithms. In this chapter, we focus on optimization algorithms for enhancing the data analytics process, i.e., we explore applications of nature-inspired algorithms in data science. Feature selection optimization is a hybrid approach that leverages feature selection techniques together with evolutionary algorithms to optimize the selected features; prior works solve this problem iteratively to converge to an optimal feature subset. Feature selection optimization is a domain-agnostic approach. Data scientists mainly seek advanced ways to analyze data with high computational efficiency and low time complexity, leading to efficient data analytics. As the volume of generated/measured/sensed data from various sources increases, the analysis, manipulation, and illustration of the data grow exponentially. Due to these large-scale datasets, the curse of dimensionality (CoD) is one of the NP-hard problems in data science. Hence, several efforts have focused on leveraging evolutionary algorithms (EAs) to address the complex issues in large-scale data analytics problems. Dimension reduction, together with EAs, lends itself to solving CoD and to solving complex problems efficiently in terms of time complexity. In this chapter, we first provide a brief overview of previous studies that focused on solving CoD using a feature extraction optimization process. We then discuss practical examples of research studies that successfully tackled application domains such as image processing, sentiment analysis, network traffic/anomaly analysis, credit score analysis, and other benchmark functions/datasets.

    Improved feature selection using a hybrid side-blotched lizard algorithm and genetic algorithm approach

    Feature selection entails choosing the significant features, among a wide collection of original features, that are essential for predicting test data with a classifier. Feature selection is commonly used in applications such as bioinformatics, data mining, and the analysis of written texts, where the dataset contains tens or hundreds of thousands of features, making it difficult to analyze such a large feature set. Removing irrelevant features improves predictor performance, making it more accurate and cost-effective. In this research, a novel hybrid technique for feature selection is presented that aims to enhance classification accuracy. A hybrid binary version of the side-blotched lizard algorithm (SBLA) with the genetic algorithm (GA), namely SBLAGA, which combines the strengths of both algorithms, is proposed. We use a sigmoid function to map the continuous variable values into binary ones and evaluate the proposed algorithm on twenty-three standard benchmark datasets. Average classification accuracy, average number of selected features, and average fitness value were the evaluation criteria. According to the experimental results, SBLAGA demonstrated superior performance compared to SBLA and GA with respect to these criteria. We further compare SBLAGA with four wrapper feature selection methods that are widely used in the literature and find it to be more efficient.
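
    The abstract does not describe how SBLAGA interleaves the genetic operators with the lizard update, so the sketch below only illustrates the GA side of the hybrid on binary feature masks: uniform crossover and bit-flip mutation. The operator choices and rates are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def uniform_crossover(parent_a, parent_b):
        # Mix two binary masks gene-by-gene with equal probability.
        take_a = rng.random(parent_a.shape) < 0.5
        return np.where(take_a, parent_a, parent_b)

    def bit_flip_mutation(mask, rate=0.02):
        # Flip each bit with a small probability to preserve diversity.
        flips = rng.random(mask.shape) < rate
        return np.where(flips, 1 - mask, mask)

    # Example: produce one offspring mask from two parent masks.
    child = bit_flip_mutation(uniform_crossover(np.array([1, 0, 1, 1, 0]),
                                                np.array([0, 0, 1, 0, 1])))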