Search CORE

157 research outputs found

Recurrent Models of Visual Attention

Author: Alex Graves
Google Deepmind
Koray Kavukcuoglu
Nicolas Heess
Volodymyr Mnih
Publication venue
Publication date: 24/06/2014
Field of study

Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built-in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so

arXiv.org e-Print Archive

CiteSeerX

Learning generative texture models with extended Fields-of-Experts

Author: Heess Nicolas
Hinton Geoffrey E.
Williams Christopher K. I.
Publication venue: 'British Machine Vision Association and Society for Pattern Recognition'
Publication date: 01/01/2009
Field of study

We evaluate the ability of the popular Field-of-Experts (FoE) to model structure in images. As a test case we focus on modeling synthetic and natural textures. We find that even for modeling single textures, the FoE provides insufficient flexibility to learn good generative models – it does not perform any better than the much simpler Gaussian FoE. We propose an extended version of the FoE (allowing for bimodal potentials) and demonstrate that this novel formulation, when trained with a better approximation of the likelihood gradient, gives rise to a more powerful generative model of specific visual structure that produces significantly better results for the texture task

CiteSeerX

Crossref

Edinburgh Research Explorer

Learning and Transfer of Modulated Locomotor Controllers

Author: Heess Nicolas
Lillicrap Timothy
Riedmiller Martin
Silver David
Tassa Yuval
Wayne Greg
Publication venue
Publication date: 17/10/2016
Field of study

We study a novel architecture and training procedure for locomotion tasks. A high-frequency, low-level "spinal" network with access to proprioceptive sensors learns sensorimotor primitives by training on simple tasks. This pre-trained module is fixed and connected to a low-frequency, high-level "cortical" network, with access to all sensors, which drives behavior by modulating the inputs to the spinal network. Where a monolithic end-to-end architecture fails completely, learning with a pre-trained spinal module succeeds at multiple high-level tasks, and enables the effective exploration required to learn from sparse rewards. We test our proposed architecture on three simulated bodies: a 16-dimensional swimming snake, a 20-dimensional quadruped, and a 54-dimensional humanoid. Our results are illustrated in the accompanying video at https://youtu.be/sboPYvhpraQComment: Supplemental video available at https://youtu.be/sboPYvhpra

arXiv.org e-Print Archive

UCL Discovery