Assessing hyper parameter optimization and speedup for convolutional neural networks

Author: A. Krizhevsky
D. L. Tutorial
E. Bochinski
E. Real
J. Bergstra
J. Deng
K. He
L. Xie
N. Srivastava
S. Ioffe
T. Domhan
W. Y. Lee
Z. Zhong
Publication venue: 'IGI Global'
Publication date: 01/01/2020

The increased processing power of graphics processing units (GPUs) and the availability of large image datasets have fostered a renewed interest in extracting semantic information from images. Promising results for complex image categorization problems have been achieved using deep learning, with neural networks composed of many layers. Convolutional neural networks (CNNs) are one such architecture, and they open up further opportunities for image classification. Advances in CNNs enable models to be trained on large labelled image datasets, but the hyper parameters must be specified in advance, which is challenging because there are so many of them. Determining the hyper parameters that yield a well-performing model requires substantial computational power and processing time. This article provides a survey of hyper parameter search and optimization methods for CNN architectures.
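As a rough illustration of the kind of method such a survey covers, the following is a minimal sketch of random hyper parameter search. It is not taken from the article: the search space, the parameter names, and the `toy_validation_error` function are all hypothetical, and the toy function stands in for the expensive step of training a CNN and measuring its validation error.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a hypothetical CNN search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform in [1e-5, 1e-1]
        "batch_size": rng.choice([32, 64, 128, 256]),
        "dropout": rng.uniform(0.0, 0.7),
        "num_conv_layers": rng.randint(2, 8),
    }


def toy_validation_error(config):
    """Synthetic stand-in for 'train a CNN, return validation error'.

    Assumption: a real search would train and evaluate a model here.
    """
    lr_term = (math.log10(config["learning_rate"]) + 3) ** 2
    dropout_term = (config["dropout"] - 0.3) ** 2
    depth_term = 0.01 * (config["num_conv_layers"] - 5) ** 2
    return 0.05 + 0.02 * lr_term + dropout_term + depth_term


def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    config, score = random_search(toy_validation_error, n_trials=50)
    print(config, score)
```

Each trial is independent, so in practice the loop could be run in parallel. The cost is dominated by the objective, which is why the abstract stresses computational power and processing time.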

LSBU Research Open

Crossref

ResearchOnline@GCU

Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation

Author: Abdulsamad Hany
Peters Jan
Publication date: 01/01/2020

The control of nonlinear dynamical systems remains a major challenge for autonomous agents. Current trends in reinforcement learning (RL) focus on complex representations of dynamics and policies, which have yielded impressive results in solving a variety of hard control tasks. However, this new sophistication and the use of extremely over-parameterized models have come at the cost of an overall reduction in our ability to interpret the resulting policies. In this paper, we take inspiration from the control community and apply the principles of hybrid switching systems in order to break down complex dynamics into simpler components. We exploit the rich representational power of probabilistic graphical models and derive an expectation-maximization (EM) algorithm for learning a sequence model that captures the temporal structure of the data and automatically decomposes nonlinear dynamics into stochastic switching linear dynamical systems. Moreover, we show how this framework of switching models enables extracting hierarchies of Markovian and auto-regressive locally linear controllers from nonlinear experts in an imitation learning scenario. Comment: 2nd Annual Conference on Learning for Dynamics and Contro
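To make the term "stochastic switching linear dynamical system" concrete, here is a minimal generative sketch. It is not the paper's model or its EM algorithm: it only simulates a scalar state whose linear dynamics are selected by a discrete Markov mode. The two modes, their coefficients, and the transition matrix are hypothetical values chosen for illustration.

```python
import random

# Hypothetical modes: in mode z the state evolves as
#   x[t+1] = a[z] * x[t] + b[z] + Gaussian noise with std sigma[z]
MODES = [
    {"a": 0.9, "b": 0.5, "sigma": 0.05},
    {"a": 0.5, "b": -1.0, "sigma": 0.05},
]

# Hypothetical Markov transition matrix: TRANSITIONS[i][j] = P(next mode j | mode i)
TRANSITIONS = [
    [0.95, 0.05],
    [0.10, 0.90],
]


def simulate(n_steps, x0=0.0, z0=0, seed=0):
    """Simulate states xs and discrete modes zs for n_steps transitions."""
    rng = random.Random(seed)
    xs, zs = [x0], [z0]
    for _ in range(n_steps):
        # Sample the next mode from the row of the current mode.
        z = rng.choices(range(len(MODES)), weights=TRANSITIONS[zs[-1]])[0]
        mode = MODES[z]
        # Apply that mode's linear dynamics plus Gaussian noise.
        x = mode["a"] * xs[-1] + mode["b"] + rng.gauss(0.0, mode["sigma"])
        xs.append(x)
        zs.append(z)
    return xs, zs


if __name__ == "__main__":
    xs, zs = simulate(200)
    print(xs[-5:], zs[-5:])
```

A learning procedure such as the EM algorithm mentioned in the abstract would work in the opposite direction, inferring the hidden modes and the per-mode parameters from observed trajectories.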

arXiv.org e-Print Archive

MPG.PuRe