
    Goal-Directed Behavior under Variational Predictive Coding: Dynamic Organization of Visual Attention and Working Memory

    Mental simulation is a critical cognitive function for goal-directed behavior because it is essential for assessing actions and their consequences. When a self-generated or externally specified goal is given, the sequence of actions most likely to attain that goal is selected from among candidates via mental simulation; better mental simulation therefore leads to better goal-directed action planning. However, developing a mental simulation model is challenging because it requires knowledge of both the self and the environment. The current paper studies how adequate goal-directed action plans can be mentally generated by robots through the dynamic organization of top-down visual attention and visual working memory. For this purpose, we propose a neural network model based on variational Bayes predictive coding, in which goal-directed action planning is formulated as Bayesian inference over a latent intentional space. Our experimental results showed that cognitively meaningful competencies emerged, such as autonomous top-down attention to the robot's end effector (its hand) and dynamic organization of occlusion-free visual working memory. Furthermore, our analysis of comparative experiments indicated that introducing visual working memory and the inference mechanism based on variational Bayes predictive coding significantly improves performance in planning adequate goal-directed actions.
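
    To make the planning-as-inference formulation above concrete, here is a minimal sketch in which a latent "intention" vector is inferred by gradient descent on the prediction error between a frozen generative model's output and a goal observation. The linear-tanh decoder, dimensions, and learning rate are illustrative placeholders, not the paper's network.

```python
# Minimal sketch of goal-directed planning as inference in a latent
# "intentional" space: latent z is optimized so that a (frozen, learned)
# generative model's predicted observation matches the goal.
# All names, dimensions, and the toy decoder are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, OBS_DIM, STEPS, LR = 8, 16, 200, 0.1

# Stand-in for a trained generative model p(observation | z).
W = rng.normal(size=(OBS_DIM, Z_DIM)) / np.sqrt(Z_DIM)

def decode(z):
    return np.tanh(W @ z)                  # predicted observation for plan z

goal = np.tanh(W @ rng.normal(size=Z_DIM)) # a reachable toy goal state
z = np.zeros(Z_DIM)                        # start from the prior mean

for _ in range(STEPS):
    pred = decode(z)
    err = pred - goal                      # prediction error w.r.t. the goal
    # Gradient of 0.5*||err||^2 + 0.5*||z||^2 (error term + Gaussian prior).
    grad = W.T @ (err * (1 - pred**2)) + z
    z -= LR * grad                         # infer z by error minimization

print("residual goal error:", np.linalg.norm(decode(z) - goal))
```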

    Generating Goal-directed Visuomotor Plans with Supervised Learning using a Predictive Coding Deep Visuomotor Recurrent Neural Network

    The ability to plan and visualize object manipulation in advance is vital for both humans and robots to smoothly reach a desired goal state. In this work, we demonstrate how our predictive-coding-based deep visuomotor recurrent neural network (PDVMRNN) can generate plans for a robot to manipulate objects based on a visual goal. A Tokyo Robotics Torobo Arm robot and a basic USB camera were used to record visuo-proprioceptive sequences of object manipulation. Although limitations in camera resolution resulted in lower success rates when plans were executed on the robot, our model was able to generate long predictions from novel start and goal states based on the learned patterns.
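
    A hedged sketch of the closed-loop generation the abstract describes: a recurrent predictor is seeded with a start observation and feeds its own one-step predictions back as input to roll out a long plan. The weights below are random stand-ins for a trained PDVMRNN; all shapes and names are assumptions.

```python
# Illustrative closed-loop rollout of a recurrent predictor: given a start
# observation, the network feeds its own one-step predictions back as input
# to produce a long visuo-proprioceptive plan.
import numpy as np

rng = np.random.default_rng(1)
OBS, HID, HORIZON = 12, 32, 50             # placeholder dimensions

W_in  = rng.normal(size=(HID, OBS)) * 0.1  # random stand-ins for trained weights
W_rec = rng.normal(size=(HID, HID)) * 0.1
W_out = rng.normal(size=(OBS, HID)) * 0.1

def step(h, x):
    h = np.tanh(W_in @ x + W_rec @ h)      # recurrent state update
    return h, np.tanh(W_out @ h)           # one-step prediction

h, x = np.zeros(HID), rng.normal(size=OBS) # start state (e.g., first frame)
plan = []
for _ in range(HORIZON):
    h, x = step(h, x)                      # closed loop: prediction becomes input
    plan.append(x)

print("generated plan of", len(plan), "predicted steps")
```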

    Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network

    It is crucial to ask how agents can achieve goals by generating action plans using only partial models of the world acquired through habituated sensory-motor experiences. Although many existing robotics studies use a forward-model framework, they face generalization issues under high degrees of freedom. The current study shows that the predictive coding (PC) and active inference (AIF) frameworks, which employ a generative model, can achieve better generalization by learning a prior distribution in a low-dimensional latent state space representing probabilistic structures extracted from well-habituated sensory-motor trajectories. In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights to maximize the evidence lower bound, while goal-directed planning is accomplished by inferring latent variables that maximize the estimated lower bound. Our proposed model was evaluated on both simple and complex robotic tasks in simulation, demonstrating sufficient generalization in learning with limited training data when an intermediate value was set for a regularization coefficient. Furthermore, comparative simulation results show that the proposed model outperforms a conventional forward model in goal-directed planning because the learned prior confines the search for motor plans within the range of habituated trajectories.
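
    The objective referred to above can be written, in generic notation (not necessarily the paper's exact formulation), as an evidence lower bound whose KL term is scaled by the regularization coefficient w; learning infers both weights and latent variables to maximize it, while planning freezes the weights and infers only the latents:

```latex
\mathcal{L}(\theta, \phi) =
\mathbb{E}_{q_\phi(\mathbf{z}\mid\mathbf{x})}
  \bigl[\log p_\theta(\mathbf{x}\mid\mathbf{z})\bigr]
- w \, D_{\mathrm{KL}}\!\bigl(q_\phi(\mathbf{z}\mid\mathbf{x})
  \,\|\, p_\theta(\mathbf{z})\bigr)
```

    An intermediate w balances reconstruction accuracy against adherence to the learned prior, which is what confines the search for motor plans to the habituated trajectories.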

    Emergence of Content-Agnostic Information Processing by a Robot Using Active Inference, Visual Attention, Working Memory, and Planning

    Generalization by learning is an essential cognitive competency for humans. For example, we can manipulate even unfamiliar objects and can generate mental images of a plan before enacting it. How is this possible? Our study investigated this problem by revisiting our previous study (Jung, Matsumoto, & Tani, 2019), which examined the problem of vision-based, goal-directed planning by robots performing a block-stacking task. Extending that study, our work introduces a large network comprising dynamically interacting submodules, including visual working memories (VWMs), a visual attention module, and an executive network. The executive network predicts motor signals, visual images, and various controls for attention, as well as masking of visual information. The most significant difference from the previous study is that our current model contains an additional VWM. The entire network is trained using predictive coding, and an optimal visuomotor plan to achieve a given goal state is inferred using active inference. Results indicate that our current model performs significantly better than that of Jung et al. (2019), especially when manipulating blocks with unlearned colors and textures. Simulation results revealed that the observed generalization was achieved because content-agnostic information processing developed through synergistic interaction between the second VWM and other modules during the course of learning, in which memorizing image contents and transforming them are dissociated. This letter verifies this claim through both qualitative and quantitative analysis of the simulation results.
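
    A schematic, purely illustrative dataflow for the module interaction described above: an executive network emits a motor command plus control signals that gate top-down attention and the writing of two visual working memories. Every shape, weight, and gating choice below is an assumption, not the authors' architecture.

```python
# Schematic sketch: executive RNN -> control signals -> attention and two VWMs.
import numpy as np

rng = np.random.default_rng(2)
IMG, HID = 64, 32                              # placeholder sizes

class VWM:
    """Visual working memory: one gated, image-sized slot."""
    def __init__(self):
        self.content = np.zeros(IMG)
    def write(self, image, gate):              # gate in [0, 1]
        self.content = gate * image + (1 - gate) * self.content
    def read(self):
        return self.content

# Placeholder linear maps standing in for the trained executive network.
W_v = rng.normal(size=(HID, IMG)) * 0.05       # visual input weights
W_h = rng.normal(size=(HID, HID)) * 0.1        # recurrent weights
W_o = rng.normal(size=(4, HID)) * 0.5          # motor + 2 write gates + attention

h = np.zeros(HID)
vwm1, vwm2 = VWM(), VWM()
frame = rng.random(IMG)                        # current camera image

h = np.tanh(W_v @ frame + W_h @ h)             # executive recurrent update
motor, g1, g2, attn = 1 / (1 + np.exp(-(W_o @ h)))  # sigmoid control signals
vwm1.write(frame * attn, g1)                   # attended content into VWM 1
vwm2.write(vwm1.read(), g2)                    # second VWM re-stores VWM 1's content
print("motor command:", float(motor))
```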

    ToyArchitecture: Unsupervised Learning of Interpretable Models of the World

    Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks that are usually uncomputable, incompatible with theories of biological intelligence, or lacking practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big-picture view while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, while also forming the core of a framework containing all the basic mechanisms, into which additional required functionality can more easily be integrated. In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture that combines multiple mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one's own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.
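
    One way to picture the dual symbolic/sub-symbolic reading described above, as a hedged sketch: each level of a hierarchy quantizes its continuous input against a codebook, so the same representation is available as a vector (sub-symbolic) or as a cluster index (symbolic), with higher levels using fewer, coarser symbols. Codebooks and sizes are illustrative assumptions, not the ToyArchitecture implementation.

```python
# Sketch of a hierarchy whose representations read both ways:
# continuous code vector (sub-symbolic) and nearest-codeword id (symbolic).
import numpy as np

rng = np.random.default_rng(3)

class Level:
    def __init__(self, in_dim, n_symbols):
        self.codebook = rng.normal(size=(n_symbols, in_dim))
    def encode(self, x):
        sym = int(np.argmin(np.linalg.norm(self.codebook - x, axis=1)))
        return sym, self.codebook[sym]     # (symbolic id, sub-symbolic code)

hierarchy = [Level(8, 16), Level(8, 4)]    # higher level: fewer, coarser symbols
x = rng.normal(size=8)
for depth, level in enumerate(hierarchy):
    sym, x = level.encode(x)               # abstraction increases with depth
    print(f"level {depth}: symbol {sym}")
```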

    A Probabilistic Approach to the Construction of a Multimodal Affect Space

    Understanding affective signals from others is crucial for both human-human and human-agent interaction. The automatic analysis of emotion is by and large addressed as a pattern recognition problem grounded in early psychological theories of emotion: suitable features are first extracted and then used as input to classification (discrete emotion recognition) or regression (continuous affect detection). In this thesis, differently from many computational models in the literature, we draw on a simulationist approach to the analysis of facially displayed emotions, e.g., in the course of a face-to-face interaction between an expresser and an observer. At the heart of this perspective lies the enactment of the perceived emotion in the observer. We propose a probabilistic framework based on a deep latent representation of a continuous affect space, which can be exploited for both the estimation and the enactment of affective states in a multimodal space. Namely, we consider the observed facial expression together with physiological activations driven by internal autonomic activity. The rationale behind the approach lies in the large body of evidence from affective neuroscience showing that when we observe emotional facial expressions, we react with congruent facial mimicry. Further, in more complex situations, affect understanding is likely to rely on a comprehensive representation grounding the reconstruction of the state of the body associated with the displayed emotion. We show that our approach can address such problems from a unified and principled perspective, thus avoiding ad hoc heuristics while minimizing learning effort. Moreover, our model improves the inferred belief through an inner loop of measurements and predictions within the central affect state space, which realizes the dynamics of affect enactment. Results so far have been obtained on two publicly available multimodal corpora.
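
    The inner loop of measurements and predictions mentioned above can be illustrated with a linear-Gaussian stand-in for the deep latent model: a 2-D affect belief (e.g., valence and arousal) is alternately propagated through assumed dynamics and corrected by simulated multimodal observations. All matrices and noise levels below are placeholders, not the thesis model.

```python
# Toy predict/measure loop over a continuous 2-D affect state, using a
# standard Kalman filter as a linear-Gaussian stand-in for the latent model.
import numpy as np

A = np.array([[0.95, 0.05], [-0.05, 0.95]])   # assumed affect dynamics
Q, R = 0.01 * np.eye(2), 0.1 * np.eye(2)      # process / measurement noise
mu, P = np.zeros(2), np.eye(2)                # belief mean and covariance

rng = np.random.default_rng(4)
for _ in range(20):
    # Predict: propagate the belief through the affect dynamics.
    mu, P = A @ mu, A @ P @ A.T + Q
    # Measure: noisy multimodal observation (face + physiology), here simulated.
    y = rng.normal(size=2) * 0.3
    # Update: standard Kalman correction of the belief.
    K = P @ np.linalg.inv(P + R)
    mu, P = mu + K @ (y - mu), (np.eye(2) - K) @ P

print("final belief:", mu)
```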

    Coordinating with the Future: The Anticipatory Nature of Representation


    Development of a Cerebellar Mean Field Model: The Theoretical Framework, the Implementation, and the First Application

    Brain modeling constantly evolves to improve the accuracy of simulated brain dynamics, with the ambitious aim of building a digital twin of the brain. Models tuned to the specific features of individual brain regions empower brain simulations by introducing bottom-up physiological properties into data-driven simulators. Although the cerebellum contains 80% of the brain's neurons and is deeply involved in a wide range of functions, from sensorimotor to cognitive ones, a specific cerebellar model is still missing. Furthermore, its quasi-crystalline multi-layer circuitry differs deeply from that of the cerebral cortex, so it is hard to imagine a single general model suitable for realistic simulation of both. The present thesis tackles the challenge of developing a specific model for the cerebellum. Specifically, a multi-neuron, multi-layer mean field (MF) model of the cerebellar network, including granule cells, Golgi cells, molecular layer interneurons, and Purkinje cells, was implemented and validated against experimental data and the corresponding spiking neural network microcircuit model. The cerebellar MF model was built using a system of interdependent equations, where the single neuronal populations and topological parameters were captured by neuron-specific interdependent transfer functions. The model's time resolution was optimized using local field potentials recorded experimentally with a high-density multielectrode array from acute mouse cerebellar slices. The present MF model satisfactorily captured the average discharge of different microcircuit neuronal populations in response to various input patterns and was able to predict the changes in Purkinje cell firing patterns occurring in specific behavioral conditions: cortical plasticity mapping, which drives learning in associative tasks, and molecular layer interneuron feed-forward inhibition, which controls Purkinje cell activity patterns. The cerebellar multi-layer MF model thus provides a computationally efficient tool for investigating the causal relationship between microscopic neuronal properties and ensemble brain activity in health and pathology. Furthermore, preliminary attempts to simulate a pathological cerebellum were made with a view to introducing our multi-layer cerebellar MF model into whole-brain simulators to realize patient-specific treatments, moving toward personalized medicine. Two preliminary works assessed the relevant impact of the cerebellum on whole-brain dynamics and its role in modulating complex responses in causally connected cerebral regions, confirming that a specific model is required to further investigate the cerebellum-on-cerebrum influence. The framework presented in this thesis allows the development of a multi-layer MF model capturing the features of a specific brain region (e.g., cerebellum, basal ganglia), and thereby defines a general strategy for building a pool of biologically grounded MF models for computationally feasible simulations. Interconnected bottom-up MF models integrated into large-scale simulators would capture specific features of different brain regions, while the applications of a virtual brain would have a substantial real-world impact, ranging from the characterization of neurobiological processes to subject-specific preoperative planning and the development of neuro-prosthetic devices.
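
    A toy illustration of the multi-population mean-field idea described above: each population's firing rate relaxes toward a transfer function of its weighted inputs, with connectivity signs following textbook cerebellar wiring (Golgi inhibition of granule cells, molecular layer interneuron feed-forward inhibition of Purkinje cells). The sigmoid transfer function and every number below are placeholder assumptions, not the thesis's fitted transfer functions.

```python
# Toy mean-field rate dynamics for four cerebellar populations:
# granule (GrC), Golgi (GoC), molecular layer interneurons (MLI), Purkinje (PC).
import numpy as np

def F(x):                      # generic population transfer function (placeholder)
    return 1 / (1 + np.exp(-x))

DT, TAU, T_END = 1e-3, 20e-3, 0.5
mossy = 1.0                    # constant mossy-fiber drive (arbitrary units)
grc = goc = mli = pc = 0.0     # initial population rates

for _ in range(int(T_END / DT)):
    in_grc = 2.0 * mossy - 1.5 * goc     # GoC inhibits granule cells
    in_goc = 1.0 * grc + 0.5 * mossy
    in_mli = 1.2 * grc                   # parallel-fiber excitation
    in_pc  = 1.5 * grc - 1.0 * mli       # MLI feed-forward inhibition of PC
    grc += DT / TAU * (F(in_grc) - grc)  # each rate relaxes toward F(input)
    goc += DT / TAU * (F(in_goc) - goc)
    mli += DT / TAU * (F(in_mli) - mli)
    pc  += DT / TAU * (F(in_pc)  - pc)

print(f"steady-state rates: GrC={grc:.2f} GoC={goc:.2f} MLI={mli:.2f} PC={pc:.2f}")
```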