1,476 research outputs found

    Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks

    Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., solving memory-dependent tasks and meta-learning. However, little effort has been spent on improving RNN architectures and on understanding the underlying neural mechanisms for performance gain. In this paper, we propose a novel, multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals than when starting from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics.

    Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment

    It is generally thought that skilled behavior in human beings results from a functional hierarchy of the motor control system, within which reusable motor primitives are flexibly integrated into various sensori-motor sequence patterns. The underlying neural mechanisms governing the way in which continuous sensori-motor flows are segmented into primitives and the way in which series of primitives are integrated into various behavior sequences have, however, not yet been clarified. In earlier studies, this functional hierarchy has been realized through the use of explicit hierarchical structure, with local modules representing motor primitives in the lower level and a higher module representing sequences of primitives switched via additional mechanisms such as gate-selecting. When sequences contain similarities and overlap, however, a conflict arises in such earlier models between generalization and segmentation, induced by this separated modular structure. To address this issue, we propose a different type of neural network model. The current model neither makes use of separate local modules to represent primitives nor introduces explicit hierarchical structure. Rather than forcing architectural hierarchy onto the system, functional hierarchy emerges through a form of self-organization that is based on two distinct types of neurons, each with different time properties (“multiple timescales”). Through the introduction of multiple timescales, continuous sequences of behavior are segmented into reusable primitives, and the primitives, in turn, are flexibly integrated into novel sequences. In experiments, the proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, also successfully situated itself within a physical environment. 
    Our results suggest that it is not only the spatial connections between neurons but also the timescales of neural activity that act as important mechanisms leading to functional hierarchy in neural systems.
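    The abstract above does not state the update rule, so the following is only a minimal sketch of the multiple-timescale idea, assuming a discrete-time leaky-integrator unit whose time constant sets how quickly its state can change. The function names, the unit counts, the time constants (2 and 50), and the random untrained weights are all hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np


def mtrnn_step(u, w, tau):
    """One leaky-integrator update (assumed form, not quoted from the paper).

    u   : internal states of all units, shape (n,)
    w   : recurrent weight matrix, shape (n, n)
    tau : per-unit time constants, shape (n,); larger tau means slower change
    """
    x = np.tanh(u)  # unit activations
    # Each unit keeps a fraction (1 - 1/tau) of its old state and moves
    # a fraction 1/tau of the way toward its current synaptic input.
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * (w @ x)


def simulate(steps=200, n_fast=4, n_slow=2, tau_fast=2.0, tau_slow=50.0, seed=0):
    """Run an untrained network with two groups of units (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    n = n_fast + n_slow
    w = rng.normal(0.0, 1.5, size=(n, n))  # random weights, no learning
    tau = np.concatenate([np.full(n_fast, tau_fast), np.full(n_slow, tau_slow)])
    u = rng.normal(0.0, 1.0, size=n)
    history = np.empty((steps, n))
    for t in range(steps):
        u = mtrnn_step(u, w, tau)
        history[t] = u
    return history, tau


if __name__ == "__main__":
    history, tau = simulate()
    # Mean absolute step-to-step change per unit, printed next to its tau.
    change = np.abs(np.diff(history, axis=0)).mean(axis=0)
    for tau_i, c in zip(tau, change):
        print(f"tau={tau_i:5.1f}  mean |delta u| = {c:.4f}")
```

    All units here share one fully connected weight matrix; the two groups differ only in their time constants, which mirrors the abstract's point that no explicit architectural hierarchy is imposed. The sketch involves no training and no robot data.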

    The Timing of Vision – How Neural Processing Links to Different Temporal Dynamics

    In this review, we describe our recent attempts to model the neural correlates of visual perception with biologically inspired networks of spiking neurons, emphasizing the dynamical aspects. Experimental evidence suggests distinct processing modes depending on the type of task the visual system is engaged in. A first mode, crucial for object recognition, deals with rapidly extracting the glimpse of a visual scene in the first 100 ms after its presentation. The promptness of this process points to mainly feedforward processing, which relies on latency coding, and may be shaped by spike timing-dependent plasticity (STDP). Our simulations confirm the plausibility and efficiency of such a scheme. A second mode can be engaged whenever one needs to perform finer perceptual discrimination through evidence accumulation on the order of 400 ms and above. Here, our simulations, together with theoretical considerations, show how predominantly local recurrent connections and long neural time-constants enable the integration and build-up of firing rates on this timescale. In particular, we review how a non-linear model with attractor states induced by strong recurrent connectivity provides straightforward explanations for several recent experimental observations. A third mode, involving additional top-down attentional signals, is relevant for more complex visual scene processing. In the model, as in the brain, these top-down attentional signals shape visual processing by biasing the competition between different pools of neurons. The winning pools may not only have a higher firing rate, but also more synchronous oscillatory activity. This fourth mode, oscillatory activity, leads to faster reaction times and enhanced information transfers in the model. This has indeed been observed experimentally. Moreover, oscillatory activity can format spike times and encode information in the spike phases with respect to the oscillatory cycle. 
    This phenomenon is referred to as “phase-of-firing coding,” and experimental evidence for it is accumulating in the visual system. Simulations show that this code can again be efficiently decoded by STDP. Future work should focus on continuous natural vision, bio-inspired hardware vision systems, and novel experimental paradigms to further distinguish current modeling approaches.

    Predictive Coding for Dynamic Visual Processing: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model

    The current paper proposes a novel predictive coding type neural network model, the predictive multiple spatio-temporal scales recurrent neural network (P-MSTRNN). The P-MSTRNN learns to predict visually perceived human whole-body cyclic movement patterns by exploiting multiscale spatio-temporal constraints imposed on network dynamics by using differently sized receptive fields as well as different time constant values for each layer. After learning, the network becomes able to proactively imitate target movement patterns by inferring or recognizing corresponding intentions by means of the regression of prediction error. Results show that the network can develop a functional hierarchy by developing a different type of dynamic structure at each layer. The paper examines how model performance during pattern generation as well as predictive imitation varies depending on the stage of learning. The number of limit cycle attractors corresponding to target movement patterns increases as learning proceeds. Moreover, transient dynamics developing early in the learning process successfully perform pattern generation and predictive imitation tasks. The paper concludes that exploitation of transient dynamics facilitates successful task performance during early learning periods. Comment: Accepted in Neural Computation (MIT Press).