Goal-Directed Behavior under Variational Predictive Coding: Dynamic Organization of Visual Attention and Working Memory
Mental simulation is a critical cognitive function for goal-directed behavior because it is essential for assessing actions and their consequences. When a self-generated or externally specified goal is given, the sequence of actions most likely to attain that goal is selected among candidates via mental simulation; better mental simulation therefore leads to better goal-directed action planning. However, developing a mental simulation model is challenging because it requires knowledge of both the self and the environment. The current paper studies how adequate goal-directed action plans can be generated mentally by robots through the dynamic organization of top-down visual attention and visual working memory. For this purpose, we propose a neural network model based on variational Bayes predictive coding, in which goal-directed action planning is formulated as Bayesian inference over a latent intentional space. Our experimental results showed that cognitively meaningful competencies emerged, such as autonomous top-down attention to the robot's end effector (its hand) and dynamic organization of occlusion-free visual working memory. Furthermore, our analysis of comparative experiments indicated that introducing visual working memory and the inference mechanism of variational Bayes predictive coding significantly improves performance in planning adequate goal-directed actions.
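To make the formulation concrete, the toy sketch below illustrates planning as gradient-based inference of a latent intention under a learned predictive model. Every name here (decoder, goal, latent_dim, the loss) is a hypothetical stand-in, not the paper's actual architecture or objective:

```python
# Minimal sketch of goal-directed planning as inference in a latent
# "intentional" space, in the spirit of variational predictive coding.
# The decoder is a toy stand-in for a trained generative model mapping
# a latent intention to a predicted visuo-proprioceptive sequence.
import torch

latent_dim, seq_len, obs_dim = 8, 20, 64

decoder = torch.nn.Sequential(
    torch.nn.Linear(latent_dim, 128),
    torch.nn.Tanh(),
    torch.nn.Linear(128, seq_len * obs_dim),
)

goal = torch.randn(obs_dim)                      # desired final observation (toy data)
z = torch.zeros(latent_dim, requires_grad=True)  # latent intention to be inferred
opt = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    opt.zero_grad()
    pred = decoder(z).view(seq_len, obs_dim)
    # Prediction error on the goal state plus a Gaussian prior on z,
    # a crude analogue of minimizing variational free energy.
    loss = (pred[-1] - goal).pow(2).sum() + 0.01 * z.pow(2).sum()
    loss.backward()
    opt.step()

plan = decoder(z).view(seq_len, obs_dim).detach()  # the mentally simulated plan
```

In this reading, "mental simulation" is the rollout of the generative model, and "planning" is the search for the latent state whose rollout reaches the goal.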
Adaptive detrending to accelerate convolutional gated recurrent unit training for contextual video recognition
Video recognition has been studied extensively, with rapid progress in recent years. However, most methods focus on short-term rather than long-term (contextual) video recognition. Convolutional recurrent neural networks (ConvRNNs) provide robust spatio-temporal information processing for contextual video recognition, but require extensive computation that slows down training. Inspired by normalization and detrending methods, in this paper we propose "adaptive detrending" (AD), a form of temporal normalization that accelerates the training of ConvRNNs, especially of the convolutional gated recurrent unit (ConvGRU). For each neuron in a recurrent neural network (RNN), AD identifies the trending change within a sequence and subtracts it, removing the internal covariate shift. In experiments on contextual video recognition with ConvGRU, the results show that (1) ConvGRU clearly outperforms feed-forward neural networks, (2) AD consistently and significantly accelerates training and improves generalization, (3) performance improves further when AD is coupled with other normalization methods, and, most importantly, (4) the more long-term contextual information a task requires, the more AD outperforms existing methods.
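The per-neuron trend-subtraction idea can be sketched as follows. The abstract does not specify the trend estimator, so this sketch uses an exponential moving average purely for illustration; the paper's actual estimator and its placement inside the ConvGRU may differ:

```python
# Hedged sketch of adaptive detrending (AD): estimate a slowly varying
# per-neuron trend over the sequence and subtract it from each activation.
import numpy as np

def adaptive_detrend(activations, momentum=0.9):
    """activations: array of shape (time, neurons); returns a detrended copy."""
    trend = activations[0].copy()        # initialize the trend at the first step
    out = np.empty_like(activations)
    for t, a in enumerate(activations):
        trend = momentum * trend + (1.0 - momentum) * a  # track the slow drift
        out[t] = a - trend               # remove the trend component
    return out

# Toy usage: a linearly drifting signal becomes roughly zero-centered after AD.
T, N = 100, 4
x = np.linspace(0, 1, T)[:, None] * np.ones((T, N)) + 0.1 * np.random.randn(T, N)
print(adaptive_detrend(x).mean(axis=0))
```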
Generating Goal-directed Visuomotor Plans with Supervised Learning using a Predictive Coding Deep Visuomotor Recurrent Neural Network
The ability to plan and visualize object manipulation in advance is vital for both humans and robots to smoothly reach a desired goal state. In this work, we demonstrate how our predictive-coding-based deep visuomotor recurrent neural network (PDVMRNN) can generate plans for a robot to manipulate objects based on a visual goal. A Tokyo Robotics Torobo Arm robot and a basic USB camera were used to record visuo-proprioceptive sequences of object manipulation. Although limitations in resolution resulted in lower success rates when plans were executed on the robot, our model is able to generate long predictions from novel start and goal states based on the learned patterns.
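A plan of this kind can be pictured as a closed-loop rollout of a recurrent model between a start and a goal state. The sketch below is only an illustrative assumption: the real PDVMRNN predicts joint angles and images via predictive coding, whereas here a plain GRU cell conditioned on the goal stands in for it:

```python
# Toy closed-loop rollout producing a state sequence from start toward goal.
# Dimensions, the GRU cell, and goal conditioning are illustrative guesses.
import torch
import torch.nn as nn

state_dim, hidden = 16, 64
cell = nn.GRUCell(state_dim * 2, hidden)   # input: current state + goal
readout = nn.Linear(hidden, state_dim)     # predicted next state

start, goal = torch.randn(1, state_dim), torch.randn(1, state_dim)
h = torch.zeros(1, hidden)
state, plan = start, [start]
for _ in range(30):                        # closed-loop: feed predictions back in
    h = cell(torch.cat([state, goal], dim=-1), h)
    state = readout(h)
    plan.append(state)
```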
Achieving Synergy in Cognitive Behavior of Humanoids via Deep Learning of Dynamic Visuo-Motor-Attentional Coordination
The current study examines how adequate coordination among different cognitive processes, including visual recognition, attention switching, and action preparation and generation, can be developed through robot learning, by introducing a novel model, the Visuo-Motor Deep Dynamic Neural Network (VMDNN). The proposed model is built on the coupling of a dynamic vision network, a motor generation network, and a higher-level network on top of these two. Simulation experiments were conducted with the iCub simulator on cognitive tasks, including visual object manipulation in response to human gestures. The results showed that synergetic coordination can develop through iterative learning across the whole network when a spatio-temporal hierarchy self-organizes in the visual pathway and a temporal hierarchy self-organizes in the motor pathway, such that the higher level can manipulate both with abstraction.
Comment: Submitted to the 2015 IEEE-RAS International Conference on Humanoid Robots
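The coupling described above can be sketched structurally as three interacting modules. Layer sizes and wiring below are illustrative guesses, not the paper's VMDNN specification:

```python
# Rough structural sketch: a vision pathway, a motor pathway, and a
# higher-level recurrent layer that couples the two.
import torch
import torch.nn as nn

class CoupledVisuoMotorNet(nn.Module):
    def __init__(self, motor_dim=10, hidden=64):
        super().__init__()
        # Vision pathway: spatial features extracted from the camera image.
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.Tanh(),
            nn.Conv2d(16, 32, 5, stride=2), nn.Tanh(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Motor pathway: recurrent dynamics over proprioceptive input.
        self.motor = nn.GRUCell(motor_dim, hidden)
        # Higher level: couples both pathways with its own slower dynamics.
        self.top = nn.GRUCell(32 + hidden, hidden)
        self.readout = nn.Linear(hidden, motor_dim)  # next motor command

    def forward(self, image, proprio, h_motor, h_top):
        v = self.vision(image)                       # (batch, 32) visual features
        h_motor = self.motor(proprio, h_motor)
        h_top = self.top(torch.cat([v, h_motor], dim=-1), h_top)
        return self.readout(h_top), h_motor, h_top
```

The design point the abstract emphasizes is that coordination is not hand-wired: it is expected to emerge when all three modules are trained jointly, with the top layer learning an abstraction over both pathways.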