A Genetic Programming Approach to Designing Convolutional Neural Network Architectures
The convolutional neural network (CNN), which is one of the deep learning
models, has seen much success in a variety of computer vision tasks. However,
designing CNN architectures still requires expert knowledge and a lot of trial
and error. In this paper, we attempt to automatically construct CNN
architectures for an image classification task based on Cartesian genetic
programming (CGP). In our method, we adopt highly functional modules, such as
convolutional blocks and tensor concatenation, as the node functions in CGP.
The CNN structure and connectivity represented by the CGP encoding method are
optimized to maximize the validation accuracy. To evaluate the proposed method,
we constructed a CNN architecture for the image classification task with the
CIFAR-10 dataset. The experimental result shows that the proposed method can
automatically find a CNN architecture that is competitive with
state-of-the-art models.
Comment: This is the revised version of the GECCO 2017 paper. The code of our
method is available at https://github.com/sg-nm/cgp-cn
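The encoding idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-entry function table, the gene layout (function name plus two input indices), and the helper names `random_genotype`, `active_nodes` and `mutate` are all assumptions made for illustration. The sketch only shows how a feed-forward CGP genotype can be decoded into the set of nodes that actually contribute to the output; building and training the CNN to obtain validation accuracy is left out.

```python
import random

# Hypothetical function table (name -> number of inputs used). The abstract
# names convolutional blocks and tensor concatenation; everything else about
# this table is an assumption.
FUNCTIONS = {"conv_block": 1, "concat": 2}


def random_genotype(n_nodes, rng):
    """Index 0 is the input image; node i (1-based) may only read from
    indices smaller than i, which keeps the graph feed-forward."""
    names = sorted(FUNCTIONS)
    nodes = [(rng.choice(names), rng.randrange(i), rng.randrange(i))
             for i in range(1, n_nodes + 1)]
    output = rng.randrange(1, n_nodes + 1)  # node that produces the output
    return nodes, output


def active_nodes(genotype):
    """Walk back from the output gene and collect the nodes it depends on.
    Nodes that are never reached are inactive and do not affect the network."""
    nodes, output = genotype
    active, stack = set(), [output]
    while stack:
        idx = stack.pop()
        if idx == 0 or idx in active:
            continue
        active.add(idx)
        name, a, b = nodes[idx - 1]
        stack.extend([a, b][:FUNCTIONS[name]])  # unary functions ignore b
    return sorted(active)


def mutate(genotype, rate, rng):
    """Point mutation: each gene is resampled with probability `rate`."""
    nodes, output = genotype
    names = sorted(FUNCTIONS)
    new_nodes = []
    for i, (name, a, b) in enumerate(nodes, start=1):
        if rng.random() < rate:
            name = rng.choice(names)
        if rng.random() < rate:
            a = rng.randrange(i)
        if rng.random() < rate:
            b = rng.randrange(i)
        new_nodes.append((name, a, b))
    if rng.random() < rate:
        output = rng.randrange(1, len(nodes) + 1)
    return new_nodes, output
```

In a full search loop, each mutated genotype would be decoded into a network, trained, and scored by validation accuracy; here the fitness evaluation is deliberately omitted.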
Deep Learning: Our Miraculous Year 1990-1991
In 2020, we will celebrate that many of the basic ideas behind the deep
learning revolution were published three decades ago within fewer than 12
months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich.
Back then, few people were interested, but a quarter century later, neural
networks based on these ideas were on over 3 billion devices such as
smartphones, and used many billions of times per day, consuming a significant
fraction of the world's compute.
Comment: 37 pages, 188 references, based on work of 4 Oct 201
Deep Neuroevolution of Recurrent and Discrete World Models
Neural architectures inspired by our own human cognitive system, such as the
recently introduced world models, have been shown to outperform traditional
deep reinforcement learning (RL) methods in a variety of different domains.
Instead of the relatively simple architectures employed in most RL experiments,
world models rely on multiple different neural components that are responsible
for visual information processing, memory, and decision-making. However, so far
the components of these models have had to be trained separately and through a
variety of specialized training methods. This paper demonstrates the surprising
finding that models with precisely the same components can instead be trained
efficiently end-to-end through a genetic algorithm (GA), reaching performance
comparable to the original world model on a challenging car racing
task. An analysis of the evolved visual and memory systems indicates that they
contain a representation that is similarly effective to that of the system
trained through gradient descent. Additionally, in contrast to gradient descent methods that
struggle with discrete variables, GAs also work directly with such
representations, opening up opportunities for classical planning in latent
space. This paper adds further evidence for the effectiveness of deep
neuroevolution for tasks that require the intricate orchestration of multiple
components in complex heterogeneous architectures.
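As a rough illustration of training all parameters with a GA rather than by gradient descent, the sketch below evolves a flat parameter vector. The abstract does not specify the GA variant, so truncation selection, Gaussian mutation, single-elite carry-over, and every name and default value here (`evolve`, `toy_fitness`, `sigma`, `n_parents`) are assumptions; the toy fitness function stands in for the reward a full world model would collect in the car racing task.

```python
import random


def evolve(fitness, n_params, pop_size=20, n_parents=5, sigma=0.1,
           generations=30, seed=0):
    """Mutation-only GA with truncation selection and one elite.
    Returns the best parameter vector and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 1.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[:n_parents]  # truncation selection
        # Each child is a copy of a random parent plus Gaussian noise.
        children = [[w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
                    for _ in range(pop_size - 1)]
        population = [ranked[0]] + children  # the elite survives unchanged
    best = max(population, key=fitness)
    return best, history


def toy_fitness(params):
    """Placeholder objective (assumption): closeness to the point 0.5."""
    return -sum((w - 0.5) ** 2 for w in params)
```

Calling `evolve(toy_fitness, n_params=4)` returns the best vector found and the per-generation best fitness. Because the elite is carried over unchanged and the fitness is deterministic, that history never decreases.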
Large Scale Evolution of Convolutional Neural Networks Using Volunteer Computing
This work presents a new algorithm called evolutionary exploration of
augmenting convolutional topologies (EXACT), which is capable of evolving the
structure of convolutional neural networks (CNNs). EXACT is in part modeled
after the neuroevolution of augmenting topologies (NEAT) algorithm, with
notable exceptions that allow it to scale to large-scale distributed computing
environments and evolve networks with convolutional filters. In addition to
multithreaded and MPI versions, EXACT has been implemented as part of a BOINC
volunteer computing project, allowing large-scale evolution. During a period of
two months, over 4,500 volunteered computers on the Citizen Science Grid
trained over 120,000 CNNs and evolved networks reaching 98.32% test data
accuracy on the MNIST handwritten digits dataset. These results are even
stronger as the backpropagation strategy used to train the CNNs was fairly
rudimentary (ReLU units, L2 regularization and Nesterov momentum) and these
were initial test runs done without refinement of the backpropagation
hyperparameters. Further, the EXACT evolutionary strategy is independent of the
method used to train the CNNs, so they could be further improved by advanced
techniques like elastic distortions, pretraining and dropout. The evolved
networks are also quite interesting, showing "organic" structures and
significant differences from standard human designed architectures.
Comment: 17 pages, 13 figures. Submitted to the 2017 Genetic and Evolutionary
Computation Conference (GECCO 2017