636 research outputs found

    Cooperative Coevolution for Non-Separable Large-Scale Black-Box Optimization: Convergence Analyses and Distributed Accelerations

    Full text link
    Given the ubiquity of non-separable optimization problems in real worlds, in this paper we analyze and extend the large-scale version of the well-known cooperative coevolution (CC), a divide-and-conquer optimization framework, on non-separable functions. First, we reveal empirical reasons of why decomposition-based methods are preferred or not in practice on some non-separable large-scale problems, which have not been clearly pointed out in many previous CC papers. Then, we formalize CC to a continuous game model via simplification, but without losing its essential property. Different from previous evolutionary game theory for CC, our new model provides a much simpler but useful viewpoint to analyze its convergence, since only the pure Nash equilibrium concept is needed and more general fitness landscapes can be explicitly considered. Based on convergence analyses, we propose a hierarchical decomposition strategy for better generalization, as for any decomposition there is a risk of getting trapped into a suboptimal Nash equilibrium. Finally, we use powerful distributed computing to accelerate it under the multi-level learning framework, which combines the fine-tuning ability from decomposition with the invariance property of CMA-ES. Experiments on a set of high-dimensional functions validate both its search performance and scalability (w.r.t. CPU cores) on a clustering computing platform with 400 CPU cores

    A Parallel Divide-and-Conquer based Evolutionary Algorithm for Large-scale Optimization

    Full text link
    Large-scale optimization problems that involve thousands of decision variables have extensively arisen from various industrial areas. As a powerful optimization tool for many real-world applications, evolutionary algorithms (EAs) fail to solve the emerging large-scale problems both effectively and efficiently. In this paper, we propose a novel Divide-and-Conquer (DC) based EA that can not only produce high-quality solution by solving sub-problems separately, but also highly utilizes the power of parallel computing by solving the sub-problems simultaneously. Existing DC-based EAs that were deemed to enjoy the same advantages of the proposed algorithm, are shown to be practically incompatible with the parallel computing scheme, unless some trade-offs are made by compromising the solution quality.Comment: 12 pages, 0 figure

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    Get PDF
    The term big data characterizes the massive amounts of data generation by the advanced technologies in different domains using 4Vs volume, velocity, variety, and veracity-to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-Time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-Time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features, and many domains, including financial, lack the al analytic tools to mine the data for knowledge discovery because of the high-dimensionality. Feature selection is an optimization problem to find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-And-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-To-use distributed, scalable, and fault-Tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-The-Art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions

    PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization

    Full text link
    In this paper, we present a pure-Python open-source library, called PyPop7, for black-box optimization (BBO). It provides a unified and modular interface for more than 60 versions and variants of different black-box optimization algorithms, particularly population-based optimizers, which can be classified into 12 popular families: Evolution Strategies (ES), Natural Evolution Strategies (NES), Estimation of Distribution Algorithms (EDA), Cross-Entropy Method (CEM), Differential Evolution (DE), Particle Swarm Optimizer (PSO), Cooperative Coevolution (CC), Simulated Annealing (SA), Genetic Algorithms (GA), Evolutionary Programming (EP), Pattern Search (PS), and Random Search (RS). It also provides many examples, interesting tutorials, and full-fledged API documentations. Through this new library, we expect to provide a well-designed platform for benchmarking of optimizers and promote their real-world applications, especially for large-scale BBO. Its source code and documentations are available at https://github.com/Evolutionary-Intelligence/pypop and https://pypop.readthedocs.io/en/latest, respectively.Comment: 5 page

    Cooperative co-evolution for feature selection in big data with random feature grouping

    Get PDF
    © 2020, The Author(s). A massive amount of data is generated with the evolution of modern technologies. This high-throughput data generation results in Big Data, which consist of many features (attributes). However, irrelevant features may degrade the classification performance of machine learning (ML) algorithms. Feature selection (FS) is a technique used to select a subset of relevant features that represent the dataset. Evolutionary algorithms (EAs) are widely used search strategies in this domain. A variant of EAs, called cooperative co-evolution (CC), which uses a divide-and-conquer approach, is a good choice for optimization problems. The existing solutions have poor performance because of some limitations, such as not considering feature interactions, dealing with only an even number of features, and decomposing the dataset statically. In this paper, a novel random feature grouping (RFG) has been introduced with its three variants to dynamically decompose Big Data datasets and to ensure the probability of grouping interacting features into the same subcomponent. RFG can be used in CC-based FS processes, hence called Cooperative Co-Evolutionary-Based Feature Selection with Random Feature Grouping (CCFSRFG). Experiment analysis was performed using six widely used ML classifiers on seven different datasets from the UCI ML repository and Princeton University Genomics repository with and without FS. The experimental results indicate that in most cases [i.e., with naïve Bayes (NB), support vector machine (SVM), k-Nearest Neighbor (k-NN), J48, and random forest (RF)] the proposed CCFSRFG-1 outperforms an existing solution (a CC-based FS, called CCEAFS) and CCFSRFG-2, and also when using all features in terms of accuracy, sensitivity, and specificity

    Treasure hunt : a framework for cooperative, distributed parallel optimization

    Get PDF
    Orientador: Prof. Dr. Daniel WeingaertnerCoorientadora: Profa. Dra. Myriam Regattieri DelgadoTese (doutorado) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defesa : Curitiba, 27/05/2019Inclui referências: p. 18-20Área de concentração: Ciência da ComputaçãoResumo: Este trabalho propõe um framework multinível chamado Treasure Hunt, que é capaz de distribuir algoritmos de busca independentes para um grande número de nós de processamento. Com o objetivo de obter uma convergência conjunta entre os nós, este framework propõe um mecanismo de direcionamento que controla suavemente a cooperação entre múltiplas instâncias independentes do Treasure Hunt. A topologia em árvore proposta pelo Treasure Hunt garante a rápida propagação da informação pelos nós, ao mesmo tempo em que provê simutaneamente explorações (pelos nós-pai) e intensificações (pelos nós-filho), em vários níveis de granularidade, independentemente do número de nós na árvore. O Treasure Hunt tem boa tolerância à falhas e está parcialmente preparado para uma total tolerância à falhas. Como parte dos métodos desenvolvidos durante este trabalho, um método automatizado de Particionamento Iterativo foi proposto para controlar o balanceamento entre explorações e intensificações ao longo da busca. Uma Modelagem de Estabilização de Convergência para operar em modo Online também foi proposto, com o objetivo de encontrar pontos de parada com bom custo/benefício para os algoritmos de otimização que executam dentro das instâncias do Treasure Hunt. Experimentos em benchmarks clássicos, aleatórios e de competição, de vários tamanhos e complexidades, usando os algoritmos de busca PSO, DE e CCPSO2, mostram que o Treasure Hunt melhora as características inerentes destes algoritmos de busca. O Treasure Hunt faz com que os algoritmos de baixa performance se tornem comparáveis aos de boa performance, e os algoritmos de boa performance possam estender seus limites até problemas maiores. Experimentos distribuindo instâncias do Treasure Hunt, em uma rede cooperativa de até 160 processos, demonstram a escalabilidade robusta do framework, apresentando melhoras nos resultados mesmo quando o tempo de processamento é fixado (wall-clock) para todas as instâncias distribuídas do Treasure Hunt. Resultados demonstram que o mecanismo de amostragem fornecido pelo Treasure Hunt, aliado à maior cooperação entre as múltiplas populações em evolução, reduzem a necessidade de grandes populações e de algoritmos de busca complexos. Isto é especialmente importante em problemas de mundo real que possuem funções de fitness muito custosas. Palavras-chave: Inteligência artificial. Métodos de otimização. Algoritmos distribuídos. Modelagem de convergência. Alta dimensionalidade.Abstract: This work proposes a multilevel framework called Treasure Hunt, which is capable of distributing independent search algorithms to a large number of processing nodes. Aiming to obtain joint convergences between working nodes, Treasure Hunt proposes a driving mechanism that smoothly controls the cooperation between the multiple independent Treasure Hunt instances. The tree topology proposed by Treasure Hunt ensures quick propagation of information, while providing simultaneous explorations (by parents) and exploitations (by children), on several levels of granularity, regardless the number of nodes in the tree. Treasure Hunt has good fault tolerance and is partially prepared to full fault tolerance. As part of the methods developed during this work, an automated Iterative Partitioning method is proposed to control the balance between exploration and exploitation as the search progress. A Convergence Stabilization Modeling to operate in Online mode is also proposed, aiming to find good cost/benefit stopping points for the optimization algorithms running within the Treasure Hunt instances. Experiments on classic, random and competition benchmarks of various sizes and complexities, using the search algorithms PSO, DE and CCPSO2, show that Treasure Hunt boosts the inherent characteristics of these search algorithms. Treasure Hunt makes algorithms with poor performances to become comparable to good ones, and algorithms with good performances to be capable of extending their limits to larger problems. Experiments distributing Treasure Hunt instances in a cooperative network up to 160 processes show the robust scaling of the framework, presenting improved results even when fixing a wall-clock time for the instances. Results show that the sampling mechanism provided by Treasure Hunt, allied to the increased cooperation between multiple evolving populations, reduce the need for large population sizes and complex search algorithms. This is specially important on real-world problems with time-consuming fitness functions. Keywords: Artificial intelligence. Optimization methods. Distributed algorithms. Convergence modeling. High dimensionality
    corecore