Search CORE

3 research outputs found

A Genetic Programming Approach to Designing Convolutional Neural Network Architectures

Author: Bergstra James
Bergstra James
Goodfellow Ian J.
Ioffe Sergey
Julian
Kingma Diederik
Lin Min
Mnih Volodymyr
Nair Vinod
Schaffer J. David
Snoek Jasper
Tokui Seiya
Zhang Richard
Publication venue
Publication date: 11/08/2017
Field of study

The convolutional neural network (CNN), which is one of the deep learning models, has seen much success in a variety of computer vision tasks. However, designing CNN architectures still requires expert knowledge and a lot of trial and error. In this paper, we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP). In our method, we adopt highly functional modules, such as convolutional blocks and tensor concatenation, as the node functions in CGP. The CNN structure and connectivity represented by the CGP encoding method are optimized to maximize the validation accuracy. To evaluate the proposed method, we constructed a CNN architecture for the image classification task with the CIFAR-10 dataset. The experimental result shows that the proposed method can be used to automatically find the competitive CNN architecture compared with state-of-the-art models.Comment: This is the revised version of the GECCO 2017 paper. The code of our method is available at https://github.com/sg-nm/cgp-cn

arXiv.org e-Print Archive

Crossref

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis

Author: Ben-Nun Tal
Hoefler Torsten
Publication venue
Publication date: 15/09/2018
Field of study

Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search. Based on those approaches, we extrapolate potential directions for parallelism in deep learning

arXiv.org e-Print Archive

Repository for Publications and Research Data

Efficient Evolution of Neural Networks

Author: Pagliuca Paolo
Publication venue: 'University of Plymouth'
Publication date: 01/01/2019
Field of study

This thesis addresses the study of evolutionary methods for the synthesis of neural network controllers. Chapter 1 introduces the research area, reviews the state of the art, discusses promising research directions, and presents the two major scientific objectives of the thesis. The first objective, which is covered in Chapter 2, is to verify the efficacy of some of the most promising neuro-evolutionary methods proposed in the literature, including two new methods that I elaborated. This has been made by designing extended version of the double-pole balancing problem, which can be used to more properly benchmark alternative algorithms, by studying the effect of critical parameters, and by conducting several series of comparative experiments. The obtained results indicate that some methods perform better with respect to all the considered criteria, i.e. performance, robustness to environmental variations and capability to scale-up to more complex problems. The second objective, which is targeted in Chapter 3, consists in the design of a new hybrid algorithm that combines evolution and learning by demonstration. The combination of these two processes is appealing since it potentially allows the adaptive agent to exploit a richer training feedback constituted by both a scalar performance objective (reinforcement signal or fitness measure) and a detailed description of a suitable behaviour (demonstration). The proposed method has been successfully evaluated on two qualitatively different robotic problems. Chapter 4 summarizes the results obtained and describes the major contributions of the thesis

Plymouth Electronic Archive and Research Library