Assessing hyper parameter optimization and speedup for convolutional neural networks
The increased processing power of graphics processing units (GPUs) and the availability of large image datasets have fostered a renewed interest in extracting semantic information from images. Promising results for complex image categorization problems have been achieved using deep learning, with neural networks composed of many layers. Convolutional neural networks (CNNs) are one such architecture, and they open up further opportunities for image classification. Advances in CNNs enable models to be trained on large labelled image datasets, but the hyper parameters must be specified, which is challenging and complex due to the large number of parameters. A substantial amount of computational power and processing time is required to determine the hyper parameters that define a model yielding good results. This article provides a survey of hyper parameter search and optimization methods for CNN architectures.
The Kalai-Smorodinsky solution for many-objective Bayesian optimization
An ongoing aim of research in multiobjective Bayesian optimization is to
extend its applicability to a large number of objectives. While coping with a
limited budget of evaluations, recovering the set of optimal compromise
solutions generally requires numerous observations and is less interpretable
since this set tends to grow larger with the number of objectives. We thus
propose to focus on a specific solution originating from game theory, the
Kalai-Smorodinsky solution, which possesses attractive properties. In
particular, it ensures equal marginal gains over all objectives. We further
make it insensitive to a monotonic transformation of the objectives by
considering the objectives in the copula space. A novel tailored algorithm is
proposed to search for the solution, in the form of a Bayesian optimization
algorithm: sequential sampling decisions are made based on acquisition
functions that derive from an instrumental Gaussian process prior. Our approach
is tested on four problems with respectively four, six, eight, and nine
objectives. The method is available in the R package GPGame on CRAN at
https://cran.r-project.org/package=GPGame
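
As a rough illustration of the solution concept named in this abstract, the sketch below picks a Kalai-Smorodinsky-style point from a finite set of already evaluated objective vectors. It is not the GPGame algorithm and involves no Gaussian process or copula transformation. It assumes that all objectives are minimized, that the disagreement point is the nadir of the non-dominated set, and that the selected point is the one maximizing the smallest normalized gain across objectives. The function names are hypothetical.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimized."""
    def dominates(a, b):
        no_worse = all(x <= y for x, y in zip(a, b))
        strictly_better = any(x < y for x, y in zip(a, b))
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]


def kalai_smorodinsky(points):
    """Pick the front point whose smallest normalized gain is largest.

    Assumption: the disagreement point is the nadir of the Pareto front and
    gains are measured from the nadir toward the ideal point.
    """
    front = pareto_front(points)
    n_obj = len(front[0])
    ideal = [min(p[i] for p in front) for i in range(n_obj)]
    nadir = [max(p[i] for p in front) for i in range(n_obj)]

    def worst_gain(p):
        ratios = []
        for i in range(n_obj):
            span = nadir[i] - ideal[i]
            # A degenerate objective (zero span) cannot limit the gain.
            ratios.append((nadir[i] - p[i]) / span if span > 0 else 1.0)
        return min(ratios)

    return max(front, key=worst_gain)


# Hypothetical objective vectors for a two-objective minimization problem.
candidates = [(0.0, 1.0), (1.0, 0.0), (0.4, 0.4), (0.9, 0.9), (0.2, 0.8)]
print(kalai_smorodinsky(candidates))
```

Maximizing the smallest normalized gain is a common discrete stand-in for intersecting the Pareto front with the line from the disagreement point to the ideal point; the continuous definition may select a point that lies between the candidates listed here.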
Deep Neuroevolution of Recurrent and Discrete World Models
Neural architectures inspired by our own human cognitive system, such as the
recently introduced world models, have been shown to outperform traditional
deep reinforcement learning (RL) methods in a variety of different domains.
Instead of the relatively simple architectures employed in most RL experiments,
world models rely on multiple different neural components that are responsible
for visual information processing, memory, and decision-making. However, so far
the components of these models have to be trained separately and through a
variety of specialized training methods. This paper demonstrates the surprising
finding that models with precisely the same parts can instead be efficiently
trained end-to-end through a genetic algorithm (GA), reaching performance
comparable to the original world model by solving a challenging car racing
task. An analysis of the evolved visual and memory system indicates that they
include a similar effective representation to the system trained through
gradient descent. Additionally, in contrast to gradient descent methods that
struggle with discrete variables, GAs also work directly with such
representations, opening up opportunities for classical planning in latent
space. This paper adds further evidence for the effectiveness of deep
neuroevolution for tasks that require the intricate orchestration of multiple
components in complex heterogeneous architectures.
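
To make the idea of gradient-free training concrete, here is a minimal sketch of a mutation-only genetic algorithm over a flat parameter vector. It is not the paper's implementation: the truncation selection, the single preserved elite, the Gaussian mutation, and every hyperparameter value are assumptions, and `toy_fitness` is a hypothetical stand-in for the episode reward that the car racing task would provide.

```python
import random


def evolve(fitness, n_params, pop_size=32, n_parents=8, sigma=0.1,
           generations=50, seed=0):
    """Evolve a parameter vector with truncation selection and mutation.

    Returns the best individual and the best fitness seen per generation.
    """
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 1.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[:n_parents]
        # Each child is a copy of a random parent with Gaussian noise added
        # to every parameter. No gradients are computed at any point.
        children = [[w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
                    for _ in range(pop_size - 1)]
        # Elitism: the current best individual survives unchanged.
        population = [ranked[0]] + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(params):
    """Hypothetical objective: higher is better, maximum 0 at all ones."""
    return -sum((w - 1.0) ** 2 for w in params)


best, history = evolve(toy_fitness, n_params=4)
print(history[0], history[-1])
```

Because the loop only ever calls `fitness` on whole individuals, nothing in it requires the parameters or the model's internal variables to be differentiable, which is the property the abstract points to when it contrasts GAs with gradient descent on discrete representations.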