563 research outputs found

    Limited Evaluation Cooperative Co-evolutionary Differential Evolution for Large-scale Neuroevolution

    Get PDF
    Many real-world control and classification tasks involve a large number of features. When artificial neural networks (ANNs) are used for modeling these tasks, the network architectures tend to be large. Neuroevolution is an effective approach for optimizing ANNs; however, there are two bottlenecks that make their application challenging in case of high-dimensional networks using direct encoding. First, classic evolutionary algorithms tend not to scale well for searching large parameter spaces; second, the network evaluation over a large number of training instances is in general time-consuming. In this work, we propose an approach called the Limited Evaluation Cooperative Co-evolutionary Differential Evolution algorithm (LECCDE) to optimize high-dimensional ANNs. The proposed method aims to optimize the pre-synaptic weights of each post-synaptic neuron in different subpopulations using a Cooperative Co-evolutionary Differential Evolution algorithm, and employs a limited evaluation scheme where fitness evaluation is performed on a relatively small number of training instances based on fitness inheritance. We test LECCDE on three datasets with various sizes, and our results show that cooperative co-evolution significantly improves the test error comparing to standard Differential Evolution, while the limited evaluation scheme facilitates a significant reduction in computing time

    Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

    Full text link
    While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. This paper proposes a solution by introducing a family of safe mutation (SM) operators that aim within the mutation operator itself to find a degree of change that does not alter network behavior too much, but still facilitates exploration. Importantly, these SM operators do not require any additional interactions with the environment. The most effective SM variant capitalizes on the intriguing opportunity to scale the degree of mutation of each individual weight according to the sensitivity of the network's outputs to that weight, which requires computing the gradient of outputs with respect to the weights (instead of the gradient of error, as in conventional deep learning). This safe mutation through gradients (SM-G) operator dramatically increases the ability of a simple genetic algorithm-based neuroevolution method to find solutions in high-dimensional domains that require deep and/or recurrent neural networks (which tend to be particularly brittle to mutation), including domains that require processing raw pixels. By improving our ability to evolve deep neural networks, this new safer approach to mutation expands the scope of domains amenable to neuroevolution

    Training binary neural networks without floating point precision

    Full text link
    The main goal of this work is to improve the efficiency of training binary neural networks, which are low latency and low energy networks. The main contribution of this work is the proposal of two solutions comprised of topology changes and strategy training that allow the network to achieve near the state-of-the-art performance and efficient training. The time required for training and the memory required in the process are two factors that contribute to efficient training.Comment: 74 pages, Master's thesi

    Correspondence between neuroevolution and gradient descent

    Full text link
    We show analytically that training a neural network by stochastic mutation or "neuroevolution" of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations. Our results provide a connection between two distinct types of neural-network training, and provide justification for the empirical success of neuroevolution
    • …
    corecore