Search CORE

11,226 research outputs found

Cooperative co-evolution for feature selection in big data with random feature grouping

Author: Ahmed Mohiuddin
Haskell-Dowland Paul
Rashid A.N.M. Bazlur
Sikos Leslie F.
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2020
Field of study

© 2020, The Author(s). A massive amount of data is generated with the evolution of modern technologies. This high-throughput data generation results in Big Data, which consist of many features (attributes). However, irrelevant features may degrade the classification performance of machine learning (ML) algorithms. Feature selection (FS) is a technique used to select a subset of relevant features that represent the dataset. Evolutionary algorithms (EAs) are widely used search strategies in this domain. A variant of EAs, called cooperative co-evolution (CC), which uses a divide-and-conquer approach, is a good choice for optimization problems. The existing solutions have poor performance because of some limitations, such as not considering feature interactions, dealing with only an even number of features, and decomposing the dataset statically. In this paper, a novel random feature grouping (RFG) has been introduced with its three variants to dynamically decompose Big Data datasets and to ensure the probability of grouping interacting features into the same subcomponent. RFG can be used in CC-based FS processes, hence called Cooperative Co-Evolutionary-Based Feature Selection with Random Feature Grouping (CCFSRFG). Experiment analysis was performed using six widely used ML classifiers on seven different datasets from the UCI ML repository and Princeton University Genomics repository with and without FS. The experimental results indicate that in most cases [i.e., with naïve Bayes (NB), support vector machine (SVM), k-Nearest Neighbor (k-NN), J48, and random forest (RF)] the proposed CCFSRFG-1 outperforms an existing solution (a CC-based FS, called CCEAFS) and CCFSRFG-2, and also when using all features in terms of accuracy, sensitivity, and specificity

Research Online @ ECU

Decomposition for Large-scale Optimization Problems with Overlapping Components

Author: Ernst A
Li X
Omidvar MN
Sun Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2019
Field of study

In this paper we use a divide-and-conquer approach to tackle large-scale optimization problems with overlapping components. Decomposition for an overlapping problem is challenging as its components depend on one another. The existing decomposition methods typically assign all the linked decision variables into one group, thus cannot reduce the original problem size. To address this issue we modify the Recursive Differential Grouping (RDG) method to decompose overlapping problems, by breaking the linkage at variables shared by multiple components. To evaluate the efficacy of our method, we extend two existing overlapping benchmark problems considering various level of overlap. Experimental results show that our method can greatly improve the search ability of an optimization algorithm via divide-and-conquer, and outperforms RDG, random decomposition as well as other state-of-the-art methods. We further evaluate our method using the CEC'2013 benchmark problems and show that our method is very competitive when equipped with a component optimizer

Crossref

White Rose Research Online

Challenging High Dimensionality in Evolutionary Optimization using Cooperative Co-evolutionary Algorithms

Author: Blanchard Julien
Publication venue
Publication date: 29/06/2021
Field of study

Repository of the University of Namur

Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

Author: Bazlur Rashid A. N. M.
Choudhury Tonmoy
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2019
Field of study

The term big data characterizes the massive amounts of data generation by the advanced technologies in different domains using 4Vs volume, velocity, variety, and veracity-to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-Time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-Time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features, and many domains, including financial, lack the al analytic tools to mine the data for knowledge discovery because of the high-dimensionality. Feature selection is an optimization problem to find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-And-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-To-use distributed, scalable, and fault-Tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-The-Art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions

Research Online @ ECU

A review of population-based metaheuristics for large-scale black-box global optimization: Part A

Author: Li X
Omidvar MN
Yao X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/11/2021
Field of study

Scalability of optimization algorithms is a major challenge in coping with the ever growing size of optimization problems in a wide range of application areas from high-dimensional machine learning to complex large-scale engineering problems. The field of large-scale global optimization is concerned with improving the scalability of global optimization algorithms, particularly population-based metaheuristics. Such metaheuristics have been successfully applied to continuous, discrete, or combinatorial problems ranging from several thousand dimensions to billions of decision variables. In this two-part survey, we review recent studies in the field of large-scale black-box global optimization to help researchers and practitioners gain a bird’s-eye view of the field, learn about its major trends, and the state-of-the-art algorithms. Part of the series covers two major algorithmic approaches to large-scale global optimization: problem decomposition and memetic algorithms. Part of the series covers a range of other algorithmic approaches to large-scale global optimization, describes a wide range of problem areas, and finally touches upon the pitfalls and challenges of current research and identifies several potential areas for future research

White Rose Research Online

Cortical Dynamics of Navigation and Steering in Natural Scenes: Motion-Based Object Segmentation, Heading, and Obstacle Avoidance

Author: Browning Andrew N.
Grossberg Stephen
Mingolla Ennio
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/12/2008
Field of study

Visually guided navigation through a cluttered natural scene is a challenging problem that animals and humans accomplish with ease. The ViSTARS neural model proposes how primates use motion information to segment objects and determine heading for purposes of goal approach and obstacle avoidance in response to video inputs from real and virtual environments. The model produces trajectories similar to those of human navigators. It does so by predicting how computationally complementary processes in cortical areas MT-/MSTv and MT+/MSTd compute object motion for tracking and self-motion for navigation, respectively. The model retina responds to transients in the input stream. Model V1 generates a local speed and direction estimate. This local motion estimate is ambiguous due to the neural aperture problem. Model MT+ interacts with MSTd via an attentive feedback loop to compute accurate heading estimates in MSTd that quantitatively simulate properties of human heading estimation data. Model MT interacts with MSTv via an attentive feedback loop to compute accurate estimates of speed, direction and position of moving objects. This object information is combined with heading information to produce steering decisions wherein goals behave like attractors and obstacles behave like repellers. These steering decisions lead to navigational trajectories that closely match human performance.National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial Intelligence Agency (NMA201-01-1-2016

Boston University Institutional Repository (OpenBU)

DG2: A Faster and More Accurate Differential Grouping for Large-Scale Black-Box Optimization

Author: Li X
Mei Y
Omidvar MN
Yang M
Yao X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Identification of variable interaction is essential for an efficient implementation of a divide-and-conquer algorithm for large-scale black-box optimization. In this paper, we propose an improved variant of the differential grouping (DG) algorithm, which has a better efficiency and grouping accuracy. The proposed algorithm, DG2, finds a reliable threshold value by estimating the magnitude of roundoff errors. With respect to efficiency, DG2 reuses the sample points that are generated for detecting interactions and saves up to half of the computational resources on fully separable functions. We mathematically show that the new sampling technique achieves the lower bound with respect to the number of function evaluations. Unlike its predecessor, DG2 checks all possible pairs of variables for interactions and has the capacity to identify overlapping components of an objective function. On the accuracy aspect, DG2 outperforms the state-of-the-art decomposition methods on the latest large-scale continuous optimization benchmark suites. DG2 also performs reliably in the presence of imbalance among contribution of components in an objective function. Another major advantage of DG2 is the automatic calculation of its threshold parameter (

\epsilon

), which makes it parameter-free. Finally, the experimental results show that when DG2 is used within a cooperative co-evolutionary framework, it can generate competitive results as compared to several state-of-the-art algorithms

Crossref

University of Birmingham Research Portal

RMIT Research Repository

White Rose Research Online