
    Predictive Coding for Dynamic Visual Processing: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model

    The current paper proposes a novel predictive-coding-type neural network model, the predictive multiple spatio-temporal scales recurrent neural network (P-MSTRNN). The P-MSTRNN learns to predict visually perceived human whole-body cyclic movement patterns by exploiting multiscale spatio-temporal constraints imposed on network dynamics through differently sized receptive fields as well as different time constant values for each layer. After learning, the network becomes able to proactively imitate target movement patterns by inferring or recognizing the corresponding intentions via regression of the prediction error. Results show that the network develops a functional hierarchy by forming a different type of dynamic structure at each layer. The paper examines how model performance during pattern generation as well as predictive imitation varies depending on the stage of learning. The number of limit cycle attractors corresponding to target movement patterns increases as learning proceeds, and transient dynamics that develop early in the learning process successfully perform pattern generation and predictive imitation tasks. The paper concludes that exploiting transient dynamics facilitates successful task performance during early learning periods.
    Comment: Accepted in Neural Computation (MIT Press).
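    The multiple-timescale idea underlying such models can be sketched with a minimal two-layer leaky-integrator network, where each layer's time constant controls how quickly its internal state can change. This is only an illustrative sketch, not the authors' P-MSTRNN: layer sizes, weights, and time constants below are arbitrary placeholder values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two layers: fast (small time constant) and slow (large time constant).
tau_fast, tau_slow = 2.0, 16.0
n_fast, n_slow = 20, 10

W_ff = rng.standard_normal((n_fast, n_fast)) * 0.1  # fast recurrent weights
W_ss = rng.standard_normal((n_slow, n_slow)) * 0.1  # slow recurrent weights
W_sf = rng.standard_normal((n_slow, n_fast)) * 0.1  # fast -> slow (bottom-up)
W_fs = rng.standard_normal((n_fast, n_slow)) * 0.1  # slow -> fast (top-down)

u_fast = np.zeros(n_fast)
u_slow = np.zeros(n_slow)

def step(u_fast, u_slow, x):
    """Leaky-integrator update: a larger time constant makes the
    layer's state evolve more slowly, yielding a temporal hierarchy."""
    in_fast = W_ff @ np.tanh(u_fast) + W_fs @ np.tanh(u_slow) + x
    in_slow = W_ss @ np.tanh(u_slow) + W_sf @ np.tanh(u_fast)
    u_fast = (1 - 1 / tau_fast) * u_fast + (1 / tau_fast) * in_fast
    u_slow = (1 - 1 / tau_slow) * u_slow + (1 / tau_slow) * in_slow
    return u_fast, u_slow

for t in range(100):
    x = np.sin(0.2 * t) * np.ones(n_fast)  # cyclic external input
    u_fast, u_slow = step(u_fast, u_slow, x)
```

    In the paper's setting the layers additionally differ in receptive-field size; here only the timescale separation is shown.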

    Achieving Synergy in Cognitive Behavior of Humanoids via Deep Learning of Dynamic Visuo-Motor-Attentional Coordination

    The current study examines how adequate coordination among different cognitive processes, including visual recognition, attention switching, and action preparation and generation, can be developed through robot learning by introducing a novel model, the Visuo-Motor Deep Dynamic Neural Network (VMDNN). The proposed model is built on the coupling of a dynamic vision network, a motor generation network, and a higher-level network allocated on top of these two. Simulation experiments using the iCub simulator were conducted for cognitive tasks including visual object manipulation in response to human gestures. The results showed that synergetic coordination can be developed via iterative learning through the whole network when a spatio-temporal hierarchy and a temporal hierarchy can self-organize in the visual pathway and in the motor pathway, respectively, such that the higher level can manipulate both with abstraction.
    Comment: Submitted to the 2015 IEEE-RAS International Conference on Humanoid Robots.
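    The coupling scheme described above, two lower pathways joined through a shared higher level that both reads from and modulates them, can be illustrated with a toy forward pass. This is a hedged sketch, not the VMDNN itself: the layer sizes, weights, and update rule are hypothetical placeholders that only show the bidirectional coupling pattern.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for the vision, motor, and higher-level modules.
n_vis, n_mot, n_top = 16, 8, 4
Wv = rng.standard_normal((n_top, n_vis)) * 0.1  # vision -> higher level
Wm = rng.standard_normal((n_top, n_mot)) * 0.1  # motor  -> higher level
Uv = rng.standard_normal((n_vis, n_top)) * 0.1  # higher level -> vision
Um = rng.standard_normal((n_mot, n_top)) * 0.1  # higher level -> motor

def coupled_step(v, m, h):
    """Both pathways drive the top layer; the top layer's abstract
    state feeds back to modulate both pathways."""
    h_new = np.tanh(Wv @ v + Wm @ m)
    v_new = np.tanh(Uv @ h_new + v)  # top-down modulation of vision
    m_new = np.tanh(Um @ h_new + m)  # top-down modulation of motor
    return v_new, m_new, h_new

v, m, h = np.zeros(n_vis), np.ones(n_mot), np.zeros(n_top)
for _ in range(10):
    v, m, h = coupled_step(v, m, h)
```

    The design point is that neither pathway connects to the other directly: all cross-modal coordination must pass through the compact higher-level state.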

    Accounting for the Minimal Self and the Narrative Self: Robotics Experiments Using Predictive Coding

    This paper proposes that the mind comprises emergent phenomena that appear via intricate and often conflicting interactions between top-down intentional processes involved in proactively acting on the external world, and bottom-up recognition processes involved in inferring possible causes of the resultant perceptual reality. This view has been tested via a series of neurorobotics experiments employing predictive coding principles implemented in “deep” recurrent neural network (RNN) models. The current paper illuminates phenomenological accounts of the minimal self and the narrative self through the analysis of those synthetic neurorobotics experiments.
    TOCAIS 2019 Towards Conscious AI Systems: Papers of the 2019 Towards Conscious AI Systems Symposium, co-located with the Association for the Advancement of Artificial Intelligence 2019 Spring Symposium Series (AAAI SSS-19), Stanford, CA, March 25-27, 2019.

    A Survey on Different Deep Learning Model for Human Activity Recognition based on Application

    The field of human activity recognition (HAR) seeks to identify and classify an individual's unique movements or activities. However, recognizing human activity from video is a challenging task that requires careful attention to individuals, their behaviors, and relevant body parts. Multimodal activity recognition systems are necessary for many applications, including video surveillance systems, human-computer interfaces, and robots that analyze human behavior. This study provides a comprehensive analysis of recent breakthroughs in human activity classification, including different approaches, methodologies, applications, and limitations. Additionally, the study identifies several challenges that require further investigation and improvement. The specifications for an ideal human activity recognition dataset are also discussed, along with a thorough examination of the publicly available human activity classification datasets.

    Gestalt Perception of Biological Motion with a Generative Artificial Neural Network Model

    In cognitive modelling, understanding biological motion through inference of one's own sensorimotor skills is highly valued and is regarded as a fundamental element of social intelligence. It has been suggested that proper Gestalt perception depends on suitably binding visual features, appropriately adapting the matching perspective, and mapping the bound features onto the correct Gestalt templates. This thesis introduces a generative artificial neural network model that implements such Gestalt perception mechanisms, proposing an algorithmic explanation. The architectural design of the model extends, modifies, and further investigates previous work by Fabian Schrodt \cite{Schrodt:2018}, which relies on the principles of active inference and predictive coding, coupled with suitable inductive learning and processing biases. First, we train the model to learn sufficiently accurate generative models of dynamic biological, or other harmonic, motion patterns. Afterwards, we scramble the input and vary the perspective onto it. To properly route the input and adapt the internal perspective onto a known frame of reference, the suggested modularized architecture propagates the prediction error back onto a binding matrix, which consists of hidden neural states that determine feature binding, and further back onto perspective-taking neurons, which rotate and translate the input features. The resulting process ensures that various types of biological motion are inferred upon observation, resolving the challenges of (I) feature binding into Gestalten, (II) perspective taking, and (III) behavior interpretation.
Ablation studies underline that (1) the separation of spatial input encodings into relative positional, directional, and motion magnitude pathways boosts the quality of Gestalt perception, (2) population encodings implicitly enable the parallel testing of alternative interpretation hypotheses and therefore further improve accurate inference, and (3) a temporal predictive processing module operating on the autoencoder-compressed stimuli enables the retrospective inference of the unfolding behavior. I believe that similar components should be employed in other architectures where temporal bindings of information sources are beneficial. Moreover, given that binding, perspective taking, and intention interpretation are universal problems in cognitive science, the introduced mechanisms may be useful for addressing similar challenges in domains beyond biological motion patterns.
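    The perspective-adaptation step, propagating prediction error back onto parameters that rotate the input, can be illustrated with a toy example: gradient descent on a single 2D rotation angle so that a known template matches a rotated observation. This is a deliberately simplified stand-in; the thesis model uses perspective-taking neurons and a binding matrix rather than one explicit angle variable, and the template here is an arbitrary triangle.

```python
import numpy as np

# Known "template" Gestalt: a few 2D feature points (arbitrary example).
template = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])

def R(a):
    """2D rotation matrix for angle a."""
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a), np.cos(a)]])

# Observation: the template seen from an unknown perspective (rotation).
true_angle = 0.7
observed = template @ R(true_angle).T

# Infer the angle by descending the squared prediction error, mirroring
# how the architecture routes prediction error back onto its
# perspective parameters.
a, lr = 0.0, 0.1
for _ in range(200):
    pred = template @ R(a).T
    err = pred - observed
    # Derivative of R(a) with respect to a, applied to the template.
    dR = np.array([[-np.sin(a), -np.cos(a)],
                   [np.cos(a), -np.sin(a)]])
    grad = np.sum(err * (template @ dR.T))
    a -= lr * grad

# After convergence, a approximates the true viewing angle.
```

    The same error signal could in principle be routed further back onto a binding matrix, which is how the thesis addresses feature binding and perspective taking jointly.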

    ToyArchitecture: Unsupervised Learning of Interpretable Models of the World

    Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are usually uncomputable, incompatible with theories of biological intelligence, or lacking practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big-picture view while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, but also form the core of a framework containing all the basic mechanisms, into which it should be easier to integrate additional required functionality. In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one's own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.