312 research outputs found

    Cooperative co-evolution for feature selection in big data with random feature grouping

    © 2020, The Author(s). A massive amount of data is generated with the evolution of modern technologies. This high-throughput data generation results in Big Data, which consists of many features (attributes). However, irrelevant features may degrade the classification performance of machine learning (ML) algorithms. Feature selection (FS) is a technique used to select a subset of relevant features that represent the dataset. Evolutionary algorithms (EAs) are widely used search strategies in this domain. A variant of EAs, called cooperative co-evolution (CC), which uses a divide-and-conquer approach, is a good choice for optimization problems. The existing solutions perform poorly because of limitations such as not considering feature interactions, dealing with only an even number of features, and decomposing the dataset statically. In this paper, a novel random feature grouping (RFG) is introduced with its three variants to dynamically decompose Big Data datasets and to ensure the probability of grouping interacting features into the same subcomponent. RFG can be used in CC-based FS processes, hence the name Cooperative Co-Evolutionary-Based Feature Selection with Random Feature Grouping (CCFSRFG). Experimental analysis was performed using six widely used ML classifiers on seven different datasets from the UCI ML repository and the Princeton University Genomics repository, with and without FS. The experimental results indicate that in most cases [i.e., with naïve Bayes (NB), support vector machine (SVM), k-Nearest Neighbor (k-NN), J48, and random forest (RF)] the proposed CCFSRFG-1 outperforms an existing solution (a CC-based FS method called CCEAFS), CCFSRFG-2, and the use of all features in terms of accuracy, sensitivity, and specificity.
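
The idea of randomly grouping features into subcomponents can be illustrated with a minimal sketch. It is not the paper's CCFSRFG algorithm or any of its three variants; the function name `random_feature_grouping` and its parameters are hypothetical. It only shows one way to split an arbitrary (odd or even) number of feature indices into random groups that could be regenerated on each cycle of a cooperative co-evolutionary search.

```python
import random


def random_feature_grouping(n_features, n_groups, seed=None):
    """Randomly partition feature indices into n_groups subcomponents.

    Hypothetical helper for illustration; not the published RFG variants.
    """
    if not 1 <= n_groups <= n_features:
        raise ValueError("n_groups must be between 1 and n_features")
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so group sizes differ by at most one.
    return [indices[g::n_groups] for g in range(n_groups)]


# 11 features (an odd count) split into 3 random subcomponents.
groups = random_feature_grouping(n_features=11, n_groups=3, seed=0)
```

Calling the function again with a different seed yields a different decomposition, which is what gives two interacting features a chance of landing in the same subcomponent over repeated cycles.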

    Decomposition for Large-scale Optimization Problems with Overlapping Components

    In this paper we use a divide-and-conquer approach to tackle large-scale optimization problems with overlapping components. Decomposition for an overlapping problem is challenging as its components depend on one another. The existing decomposition methods typically assign all the linked decision variables into one group, and thus cannot reduce the original problem size. To address this issue we modify the Recursive Differential Grouping (RDG) method to decompose overlapping problems, by breaking the linkage at variables shared by multiple components. To evaluate the efficacy of our method, we extend two existing overlapping benchmark problems considering various levels of overlap. Experimental results show that our method can greatly improve the search ability of an optimization algorithm via divide-and-conquer, and outperforms RDG, random decomposition, as well as other state-of-the-art methods. We further evaluate our method using the CEC'2013 benchmark problems and show that our method is very competitive when equipped with a component optimizer.
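
To make the notion of linked variables and overlap concrete, here is a minimal sketch of the pairwise finite-difference interaction check that differential grouping methods build on. It is not the paper's modified RDG; the names `interacts` and `overlapping` and the toy objective are assumptions chosen for illustration. Two variables are treated as interacting when the change in the objective caused by perturbing one depends on the value of the other.

```python
def interacts(f, x, i, j, delta=1.0, eps=1e-9):
    """Return True if variables i and j of f appear to interact at point x.

    Illustrative pairwise check only; RDG applies the same idea to sets of
    variables recursively.
    """
    base = list(x)
    xi = list(x)
    xi[i] += delta          # perturb i only
    xj = list(x)
    xj[j] += delta          # perturb j only
    xij = list(xi)
    xij[j] += delta         # perturb both i and j
    d1 = f(xi) - f(base)    # effect of i with j at its original value
    d2 = f(xij) - f(xj)     # effect of i with j perturbed
    return abs(d1 - d2) > eps


def overlapping(x):
    # Toy objective with two components that share the variable x[1].
    return (x[0] - x[1]) ** 2 + (x[1] - x[2]) ** 2


origin = [0.0, 0.0, 0.0]
links = {
    (0, 1): interacts(overlapping, origin, 0, 1),
    (1, 2): interacts(overlapping, origin, 1, 2),
    (0, 2): interacts(overlapping, origin, 0, 2),
}
```

In this toy case `x[0]` and `x[2]` are linked only through the shared variable `x[1]`. A method that merges all linked variables would place the three in one group, whereas breaking the linkage at `x[1]` leaves two smaller components.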

    A review of population-based metaheuristics for large-scale black-box global optimization: Part A

    Scalability of optimization algorithms is a major challenge in coping with the ever-growing size of optimization problems in a wide range of application areas from high-dimensional machine learning to complex large-scale engineering problems. The field of large-scale global optimization is concerned with improving the scalability of global optimization algorithms, particularly population-based metaheuristics. Such metaheuristics have been successfully applied to continuous, discrete, or combinatorial problems ranging from several thousand dimensions to billions of decision variables. In this two-part survey, we review recent studies in the field of large-scale black-box global optimization to help researchers and practitioners gain a bird’s-eye view of the field and learn about its major trends and state-of-the-art algorithms. Part A of the series covers two major algorithmic approaches to large-scale global optimization: problem decomposition and memetic algorithms. Part B of the series covers a range of other algorithmic approaches to large-scale global optimization, describes a wide range of problem areas, and finally touches upon the pitfalls and challenges of current research and identifies several potential areas for future research.