
    Symbol Emergence in Robotics: A Survey

    Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding the dynamics of symbol systems is crucially important both for understanding human social interactions and for developing a robot that can communicate smoothly with human users over the long term. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and a double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER. Comment: submitted to Advanced Robotic

    Predictive Coding for Dynamic Visual Processing: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model

    The current paper proposes a novel predictive coding type neural network model, the predictive multiple spatio-temporal scales recurrent neural network (P-MSTRNN). The P-MSTRNN learns to predict visually perceived human whole-body cyclic movement patterns by exploiting multiscale spatio-temporal constraints imposed on network dynamics by using differently sized receptive fields as well as different time constant values for each layer. After learning, the network becomes able to proactively imitate target movement patterns by inferring or recognizing corresponding intentions by means of the regression of prediction error. Results show that the network can develop a functional hierarchy by developing a different type of dynamic structure at each layer. The paper examines how model performance during pattern generation as well as predictive imitation varies depending on the stage of learning. The number of limit cycle attractors corresponding to target movement patterns increases as learning proceeds. Moreover, transient dynamics developing early in the learning process successfully perform pattern generation and predictive imitation tasks. The paper concludes that exploitation of transient dynamics facilitates successful task performance during early learning periods. Comment: Accepted in Neural Computation (MIT press

    Neural Mechanisms for Information Compression by Multiple Alignment, Unification and Search

    This article describes how an abstract framework for perception and cognition may be realised in terms of neural mechanisms and neural processing. This framework — called information compression by multiple alignment, unification and search (ICMAUS) — has been developed in previous research as a generalized model of any system for processing information, either natural or artificial. It has a range of applications including the analysis and production of natural language, unsupervised inductive learning, recognition of objects and patterns, probabilistic reasoning, and others. The proposals in this article may be seen as an extension and development of Hebb’s (1949) concept of a ‘cell assembly’. The article describes how the concept of ‘pattern’ in the ICMAUS framework may be mapped onto a version of the cell assembly concept and the way in which neural mechanisms may achieve the effect of ‘multiple alignment’ in the ICMAUS framework. By contrast with the Hebbian concept of a cell assembly, it is proposed here that any one neuron can belong in one assembly and only one assembly. A key feature of present proposals, which is not part of the Hebbian concept, is that any cell assembly may contain ‘references’ or ‘codes’ that serve to identify one or more other cell assemblies. This mechanism allows information to be stored in a compressed form, it provides a robust mechanism by which assemblies may be connected to form hierarchies and other kinds of structure, it means that assemblies can express abstract concepts, and it provides solutions to some of the other problems associated with cell assemblies. Drawing on insights derived from the ICMAUS framework, the article also describes how learning may be achieved with neural mechanisms. This concept of learning is significantly different from the Hebbian concept and appears to provide a better account of what we know about human learning

    Unsupervised Learning of Complex Articulated Kinematic Structures combining Motion and Skeleton Information

    In this paper we present a novel framework for unsupervised kinematic structure learning of complex articulated objects from a single-view image sequence. In contrast to prior motion information based methods, which estimate relatively simple articulations, our method can generate arbitrarily complex kinematic structures with skeletal topology by a successive iterative merge process. The iterative merge process is guided by a skeleton distance function which is generated from a novel object boundary generation method from sparse points. Our main contributions can be summarised as follows: (i) Unsupervised complex articulated kinematic structure learning by combining motion and skeleton information. (ii) Iterative fine-to-coarse merging strategy for adaptive motion segmentation and structure smoothing. (iii) Skeleton estimation from sparse feature points. (iv) A new highly articulated object dataset containing multi-stage complexity with ground truth. Our experiments show that the proposed method out-performs state-of-the-art methods both quantitatively and qualitatively

    Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes

    I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thus making it subjectively simpler and more beautiful. Curiosity is the desire to create or discover more non-random, non-arbitrary, regular data that is novel and surprising not in the traditional sense of Boltzmann and Shannon but in the sense that it allows for compression progress because its regularity was not yet known. This drive maximizes interestingness, the first derivative of subjective beauty or compressibility, that is, the steepness of the learning curve. It motivates exploring infants, pure mathematicians, composers, artists, dancers, comedians, yourself, and (since 1990) artificial systems. Comment: 35 pages, 3 figures, based on KES 2008 keynote and ALT 2007 / DS 2007 joint invited lectur

    SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model

    To realize human-like robot intelligence, a large-scale cognitive architecture is required for robots to understand the environment through a variety of sensors with which they are equipped. In this paper, we propose a novel framework named Serket that enables the construction of a large-scale generative model and its inference easily by connecting sub-modules to allow the robots to acquire various capabilities through interaction with their environments and others. We consider that large-scale cognitive models can be constructed by connecting smaller fundamental models hierarchically while maintaining their programmatic independence. Moreover, connected modules are dependent on each other, and parameters are required to be optimized as a whole. Conventionally, the equations for parameter estimation have to be derived and implemented depending on the models. However, it becomes harder to derive and implement those of a larger scale model. To solve these problems, in this paper, we propose a method for parameter estimation by communicating the minimal parameters between various modules while maintaining their programmatic independence. Therefore, Serket makes it easy to construct large-scale models and estimate their parameters via the connection of modules. Experimental results demonstrated that the model can be constructed by connecting modules, the parameters can be optimized as a whole, and they are comparable with the original models that we have proposed

    Vector Associative Maps: Unsupervised Real-time Error-based Learning and Control of Movement Trajectories

    This article describes neural network models for adaptive control of arm movement trajectories during visually guided reaching and, more generally, a framework for unsupervised real-time error-based learning. The models clarify how a child, or untrained robot, can learn to reach for objects that it sees. Piaget has provided basic insights with his concept of a circular reaction: As an infant makes internally generated movements of its hand, the eyes automatically follow this motion. A transformation is learned between the visual representation of hand position and the motor representation of hand position. Learning of this transformation eventually enables the child to accurately reach for visually detected targets. Grossberg and Kuperstein have shown how the eye movement system can use visual error signals to correct movement parameters via cerebellar learning. Here it is shown how endogenously generated arm movements lead to adaptive tuning of arm control parameters. These movements also activate the target position representations that are used to learn the visuo-motor transformation that controls visually guided reaching. The AVITE model presented here is an adaptive neural circuit based on the Vector Integration to Endpoint (VITE) model for arm and speech trajectory generation of Bullock and Grossberg. In the VITE model, a Target Position Command (TPC) represents the location of the desired target. The Present Position Command (PPC) encodes the present hand-arm configuration. The Difference Vector (DV) population continuously computes the difference between the PPC and the TPC. A speed-controlling GO signal multiplies DV output. The PPC integrates the (DV)·(GO) product and generates an outflow command to the arm. Integration at the PPC continues at a rate dependent on GO signal size until the DV reaches zero, at which time the PPC equals the TPC. The AVITE model explains how self-consistent TPC and PPC coordinates are autonomously generated and learned.
    Learning of AVITE parameters is regulated by activation of a self-regulating Endogenous Random Generator (ERG) of training vectors. Each vector is integrated at the PPC, giving rise to a movement command. The generation of each vector induces a complementary postural phase during which ERG output stops and learning occurs. Then a new vector is generated and the cycle is repeated. This cyclic, biphasic behavior is controlled by a specialized gated dipole circuit. ERG output autonomously stops in such a way that, across trials, a broad sample of workspace target positions is generated. When the ERG shuts off, a modulator gate opens, copying the PPC into the TPC. Learning of a transformation from TPC to PPC occurs using the DV as an error signal that is zeroed due to learning. This learning scheme is called a Vector Associative Map, or VAM. The VAM model is a general-purpose device for autonomous real-time error-based learning and performance of associative maps. The DV stage serves the dual function of reading out new TPCs during performance and reading in new adaptive weights during learning, without a disruption of real-time operation. VAMs thus provide an on-line unsupervised alternative to the off-line properties of supervised error-correction learning algorithms. VAMs and VAM cascades for learning motor-to-motor and spatial-to-motor maps are described. VAM models and Adaptive Resonance Theory (ART) models exhibit complementary matching, learning, and performance properties that together provide a foundation for designing a total sensory-cognitive and cognitive-motor autonomous system. National Science Foundation (IRI-87-16960, IRI-87-6960); Air Force Office of Scientific Research (90-0175); Defense Advanced Research Projects Agency (90-0083
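
    The VITE dynamics summarized in this abstract (the DV as the difference between TPC and PPC, gated by GO and integrated at the PPC) can be illustrated with a minimal sketch. It is a simplification, not the authors' circuit: it assumes the DV is the instantaneous difference TPC - PPC, a constant GO signal, and forward-Euler integration. The function name `simulate_vite` and the parameters `dt` and `steps` are hypothetical and do not come from the abstract.

```python
from typing import List, Sequence


def simulate_vite(tpc: Sequence[float], ppc0: Sequence[float],
                  go: float = 1.0, dt: float = 0.01,
                  steps: int = 1000) -> List[float]:
    """Euler-integrate d(PPC)/dt = GO * DV, with DV = TPC - PPC.

    Assumptions (not from the abstract): constant GO signal,
    instantaneous DV, fixed step size dt.
    """
    ppc = list(ppc0)
    for _ in range(steps):
        # Difference Vector: how far the present position is from the target.
        dv = [t - p for t, p in zip(tpc, ppc)]
        # The PPC integrates the (DV)*(GO) product.
        ppc = [p + dt * go * d for p, d in zip(ppc, dv)]
    return ppc


if __name__ == "__main__":
    # A larger GO value drives the PPC toward the TPC faster.
    print(simulate_vite(tpc=[1.0, 2.0], ppc0=[0.0, 0.0], go=2.0))
```

    With a positive GO signal the PPC converges on the TPC and the DV decays toward zero; with GO set to zero the PPC does not move, which matches the abstract's description of GO as a speed-controlling gate.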

    A Real-Time Unsupervised Neural Network for the Low-Level Control of a Mobile Robot in a Nonstationary Environment

    This article introduces a real-time, unsupervised neural network that learns to control a two-degree-of-freedom mobile robot in a nonstationary environment. The neural controller, which is termed neural NETwork MObile Robot Controller (NETMORC), combines associative learning and Vector Associative Map (VAM) learning to generate transformations between spatial and velocity coordinates. As a result, the controller learns the wheel velocities required to reach a target at an arbitrary distance and angle. The transformations are learned during an unsupervised training phase, during which the robot moves as a result of randomly selected wheel velocities. The robot learns the relationship between these velocities and the resulting incremental movements. Aside from being able to reach stationary or moving targets, the NETMORC structure also enables the robot to perform successfully in spite of disturbances in the environment, such as wheel slippage, or changes in the robot's plant, including changes in wheel radius, changes in inter-wheel distance, or changes in the internal time step of the system. Finally, the controller is extended to include a module that learns an internal odometric transformation, allowing the robot to reach targets when visual input is sporadic or unreliable. Sloan Fellowship (BR-3122), Air Force Office of Scientific Research (F49620-92-J-0499
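
    The unsupervised training phase described in this abstract, in which randomly selected wheel velocities are paired with the resulting incremental movements, can be sketched as a data-generation step. The abstract does not give the robot's kinematics, so the sketch assumes standard differential-drive kinematics as the ground-truth plant; NETMORC learns this mapping rather than computing it. The names `incremental_motion` and `babble`, and all default parameter values, are hypothetical.

```python
import random
from typing import List, Tuple


def incremental_motion(omega_left: float, omega_right: float,
                       wheel_radius: float, wheel_base: float,
                       dt: float) -> Tuple[float, float]:
    """Return (distance, heading change) over one time step.

    Assumes an ideal differential-drive robot with no slippage.
    """
    v_left = wheel_radius * omega_left
    v_right = wheel_radius * omega_right
    distance = 0.5 * (v_left + v_right) * dt
    dtheta = (v_right - v_left) / wheel_base * dt
    return distance, dtheta


def babble(n: int, max_omega: float = 5.0, wheel_radius: float = 0.05,
           wheel_base: float = 0.3, dt: float = 0.1,
           seed: int = 0) -> List[Tuple[Tuple[float, float], Tuple[float, float]]]:
    """Collect (wheel velocities, incremental movement) training pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        wl = rng.uniform(-max_omega, max_omega)
        wr = rng.uniform(-max_omega, max_omega)
        move = incremental_motion(wl, wr, wheel_radius, wheel_base, dt)
        samples.append(((wl, wr), move))
    return samples


if __name__ == "__main__":
    for velocities, movement in babble(3):
        print(velocities, movement)
```

    Changing `wheel_radius`, `wheel_base`, or `dt` in this sketch changes the velocity-to-movement relationship, which is the kind of plant change the abstract says the learned controller can tolerate.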