158 research outputs found

    Attribute Equilibrium Dominance Reduction Accelerator (DCCAEDR) Based on Distributed Coevolutionary Cloud and Its Application in Medical Records

    Full text link
    © 2013 IEEE. Aimed at the tremendous challenge of attribute reduction for big data mining and knowledge discovery, we propose a new attribute equilibrium dominance reduction accelerator (DCCAEDR) based on the distributed coevolutionary cloud model. First, the framework of N-populations distributed coevolutionary MapReduce model is designed to divide the entire population into N subpopulations, sharing the reward of different subpopulations' solutions under a MapReduce cloud mechanism. Because the adaptive balancing between exploration and exploitation can be achieved in a better way, the reduction performance is guaranteed to be the same as those using the whole independent data set. Second, a novel Nash equilibrium dominance strategy of elitists under the N bounded rationality regions is adopted to assist the subpopulations necessary to attain the stable status of Nash equilibrium dominance. This further enhances the accelerator's robustness against complex noise on big data. Third, the approximation parallelism mechanism based on MapReduce is constructed to implement rule reduction by accelerating the computation of attribute equivalence classes. Consequently, the entire attribute reduction set with the equilibrium dominance solution can be achieved. Extensive simulation results have been used to illustrate the effectiveness and robustness of the proposed DCCAEDR accelerator for attribute reduction on big data. Furthermore, the DCCAEDR is applied to solve attribute reduction for traditional Chinese medical records and to segment cortical surfaces of the neonatal brain 3-D-MRI records, and the DCCAEDR shows the superior competitive results, when compared with the representative algorithms

    Multiple Relevant Feature Ensemble Selection Based on Multilayer Co-Evolutionary Consensus MapReduce

    Full text link
    IEEE Although feature selection for large data has been intensively investigated in data mining, machine learning, and pattern recognition, the challenges are not just to invent new algorithms to handle noisy and uncertain large data in applications, but rather to link the multiple relevant feature sources, structured, or unstructured, to develop an effective feature reduction method. In this paper, we propose a multiple relevant feature ensemble selection (MRFES) algorithm based on multilayer co-evolutionary consensus MapReduce (MCCM). We construct an effective MCCM model to handle feature ensemble selection of large-scale datasets with multiple relevant feature sources, and explore the unified consistency aggregation between the local solutions and global dominance solutions achieved by the co-evolutionary memeplexes, which participate in the cooperative feature ensemble selection process. This model attempts to reach a mutual decision agreement among co-evolutionary memeplexes, which calls for the need for mechanisms to detect some noncooperative co-evolutionary behaviors and achieve better Nash equilibrium resolutions. Extensive experimental comparative studies substantiate the effectiveness of MRFES to solve large-scale dataset problems with the complex noise and multiple relevant feature sources on some well-known benchmark datasets. The algorithm can greatly facilitate the selection of relevant feature subsets coming from the original feature space with better accuracy, efficiency, and interpretability. Moreover, we apply MRFES to human cerebral cortex-based classification prediction. Such successful applications are expected to significantly scale up classification prediction for large-scale and complex brain data in terms of efficiency and feasibility

    Multigranulation Super-Trust Model for Attribute Reduction

    Get PDF
    IEEE As big data often contains a significant amount of uncertain, unstructured and imprecise data that are structurally complex and incomplete, traditional attribute reduction methods are less effective when applied to large-scale incomplete information systems to extract knowledge. Multigranular computing provides a powerful tool for use in big data analysis conducted at different levels of information granularity. In this paper, we present a novel multigranulation super-trust fuzzy-rough set-based attribute reduction (MSFAR) algorithm to support the formation of hierarchies of information granules of higher types and higher orders, which addresses newly emerging data mining problems in big data analysis. First, a multigranulation super-trust model based on the valued tolerance relation is constructed to identify the fuzzy similarity of the changing knowledge granularity with multimodality attributes. Second, an ensemble consensus compensatory scheme is adopted to calculate the multigranular trust degree based on the reputation at different granularities to create reasonable subproblems with different granulation levels. Third, an equilibrium method of multigranular-coevolution is employed to ensure a wide range of balancing of exploration and exploitation and can classify super elitists’ preferences and detect noncooperative behaviors with a global convergence ability and high search accuracy. The experimental results demonstrate that the MSFAR algorithm achieves a high performance in addressing uncertain and fuzzy attribute reduction problems with a large number of multigranularity variables

    A review of population-based metaheuristics for large-scale black-box global optimization: Part B

    Get PDF
    This paper is the second part of a two-part survey series on large-scale global optimization. The first part covered two major algorithmic approaches to large-scale optimization, namely decomposition methods and hybridization methods such as memetic algorithms and local search. In this part we focus on sampling and variation operators, approximation and surrogate modeling, initialization methods, and parallelization. We also cover a range of problem areas in relation to large-scale global optimization, such as multi-objective optimization, constraint handling, overlapping components, the component imbalance issue, and benchmarks, and applications. The paper also includes a discussion on pitfalls and challenges of current research and identifies several potential areas of future research

    A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications

    Get PDF
    Particle swarm optimization (PSO) is a heuristic global optimization method, proposed originally by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presented a comprehensive investigation of PSO. On one hand, we provided advances with PSO, including its modifications (including quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topology (as fully connected, von Neumann, ring, star, random, etc.), hybridization (with genetic algorithm, simulated annealing, Tabu search, artificial immune system, ant colony algorithm, artificial bee colony, differential evolution, harmonic search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementation (in multicore, multiprocessor, GPU, and cloud computing forms). On the other hand, we offered a survey on applications of PSO to the following eight fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, chemistry, and biology. It is hoped that this survey would be beneficial for the researchers studying PSO algorithms

    Towards a more efficient use of computational budget in large-scale black-box optimization

    Get PDF
    Evolutionary algorithms are general purpose optimizers that have been shown effective in solving a variety of challenging optimization problems. In contrast to mathematical programming models, evolutionary algorithms do not require derivative information and are still effective when the algebraic formula of the given problem is unavailable. Nevertheless, the rapid advances in science and technology have witnessed the emergence of more complex optimization problems than ever, which pose significant challenges to traditional optimization methods. The dimensionality of the search space of an optimization problem when the available computational budget is limited is one of the main contributors to its difficulty and complexity. This so-called curse of dimensionality can significantly affect the efficiency and effectiveness of optimization methods including evolutionary algorithms. This research aims to study two topics related to a more efficient use of computational budget in evolutionary algorithms when solving large-scale black-box optimization problems. More specifically, we study the role of population initializers in saving the computational resource, and computational budget allocation in cooperative coevolutionary algorithms. Consequently, this dissertation consists of two major parts, each of which relates to one of these research directions. In the first part, we review several population initialization techniques that have been used in evolutionary algorithms. Then, we categorize them from different perspectives. The contribution of each category to improving evolutionary algorithms in solving large-scale problems is measured. We also study the mutual effect of population size and initialization technique on the performance of evolutionary techniques when dealing with large-scale problems. Finally, assuming uniformity of initial population as a key contributor in saving a significant part of the computational budget, we investigate whether achieving a high-level of uniformity in high-dimensional spaces is feasible given the practical restriction in computational resources. In the second part of the thesis, we study the large-scale imbalanced problems. In many real world applications, a large problem may consist of subproblems with different degrees of difficulty and importance. In addition, the solution to each subproblem may contribute differently to the overall objective value of the final solution. When the computational budget is restricted, which is the case in many practical problems, investing the same portion of resources in optimizing each of these imbalanced subproblems is not the most efficient strategy. Therefore, we examine several ways to learn the contribution of each subproblem, and then, dynamically allocate the limited computational resources in solving each of them according to its contribution to the overall objective value of the final solution. To demonstrate the effectiveness of the proposed framework, we design a new set of 40 large-scale imbalanced problems and study the performance of some possible instances of the framework

    Evolving machine learning and deep learning models using evolutionary algorithms

    Get PDF
    Despite the great success in data mining, machine learning and deep learning models are yet subject to material obstacles when tackling real-life challenges, such as feature selection, initialization sensitivity, as well as hyperparameter optimization. The prevalence of these obstacles has severely constrained conventional machine learning and deep learning methods from fulfilling their potentials. In this research, three evolving machine learning and one evolving deep learning models are proposed to eliminate above bottlenecks, i.e. improving model initialization, enhancing feature representation, as well as optimizing model configuration, respectively, through hybridization between the advanced evolutionary algorithms and the conventional ML and DL methods. Specifically, two Firefly Algorithm based evolutionary clustering models are proposed to optimize cluster centroids in K-means and overcome initialization sensitivity as well as local stagnation. Secondly, a Particle Swarm Optimization based evolving feature selection model is developed for automatic identification of the most effective feature subset and reduction of feature dimensionality for tackling classification problems. Lastly, a Grey Wolf Optimizer based evolving Convolutional Neural Network-Long Short-Term Memory method is devised for automatic generation of the optimal topological and learning configurations for Convolutional Neural Network-Long Short-Term Memory networks to undertake multivariate time series prediction problems. Moreover, a variety of tailored search strategies are proposed to eliminate the intrinsic limitations embedded in the search mechanisms of the three employed evolutionary algorithms, i.e. the dictation of the global best signal in Particle Swarm Optimization, the constraint of the diagonal movement in Firefly Algorithm, as well as the acute contraction of search territory in Grey Wolf Optimizer, respectively. The remedy strategies include the diversification of guiding signals, the adaptive nonlinear search parameters, the hybrid position updating mechanisms, as well as the enhancement of population leaders. As such, the enhanced Particle Swarm Optimization, Firefly Algorithm, and Grey Wolf Optimizer variants are more likely to attain global optimality on complex search landscapes embedded in data mining problems, owing to the elevated search diversity as well as the achievement of advanced trade-offs between exploration and exploitation
    • …
    corecore