76,148 research outputs found

    Symbols are not uniquely human

    Get PDF
    Modern semiotics is a branch of logics that formally defines symbol-based communication. In recent years, the semiotic classification of signs has been invoked to support the notion that symbols are uniquely human. Here we show that alarm-calls such as those used by African vervet monkeys (Cercopithecus aethiops), logically satisfy the semiotic definition of symbol. We also show that the acquisition of vocal symbols in vervet monkeys can be successfully simulated by a computer program based on minimal semiotic and neurobiological constraints. The simulations indicate that learning depends on the tutor-predator ratio, and that apprentice-generated auditory mistakes in vocal symbol interpretation have little effect on the learning rates of apprentices (up to 80% of mistakes are tolerated). In contrast, just 10% of apprentice-generated visual mistakes in predator identification will prevent any vocal symbol to be correctly associated with a predator call in a stable manner. Tutor unreliability was also deleterious to vocal symbol learning: a mere 5% of “lying” tutors were able to completely disrupt symbol learning, invariably leading to the acquisition of incorrect associations by apprentices. Our investigation corroborates the existence of vocal symbols in a non-human species, and indicates that symbolic competence emerges spontaneously from classical associative learning mechanisms when the conditioned stimuli are self-generated, arbitrary and socially efficacious. We propose that more exclusive properties of human language, such as syntax, may derive from the evolution of higher-order domains for neural association, more removed from both the sensory input and the motor output, able to support the gradual complexification of grammatical categories into syntax

    Deep Affordance-grounded Sensorimotor Object Recognition

    Full text link
    It is well-established by cognitive neuroscience that human perception of objects constitutes a complex process, where object appearance information is combined with evidence about the so-called object "affordances", namely the types of actions that humans typically perform when interacting with them. This fact has recently motivated the "sensorimotor" approach to the challenging task of automatic object recognition, where both information sources are fused to improve robustness. In this work, the aforementioned paradigm is adopted, surpassing current limitations of sensorimotor object recognition research. Specifically, the deep learning paradigm is introduced to the problem for the first time, developing a number of novel neuro-biologically and neuro-physiologically inspired architectures that utilize state-of-the-art neural networks for fusing the available information sources in multiple ways. The proposed methods are evaluated using a large RGB-D corpus, which is specifically collected for the task of sensorimotor object recognition and is made publicly available. Experimental results demonstrate the utility of affordance information to object recognition, achieving an up to 29% relative error reduction by its inclusion.Comment: 9 pages, 7 figures, dataset link included, accepted to CVPR 201

    Neural blackboard architectures of combinatorial structures in cognition

    Get PDF
    Human cognition is unique in the way in which it relies on combinatorial (or compositional) structures. Language provides ample evidence for the existence of combinatorial structures, but they can also be found in visual cognition. To understand the neural basis of human cognition, it is therefore essential to understand how combinatorial structures can be instantiated in neural terms. In his recent book on the foundations of language, Jackendoff described four fundamental problems for a neural instantiation of combinatorial structures: the massiveness of the binding problem, the problem of 2, the problem of variables and the transformation of combinatorial structures from working memory to long-term memory. This paper aims to show that these problems can be solved by means of neural ‘blackboard’ architectures. For this purpose, a neural blackboard architecture for sentence structure is presented. In this architecture, neural structures that encode for words are temporarily bound in a manner that preserves the structure of the sentence. It is shown that the architecture solves the four problems presented by Jackendoff. The ability of the architecture to instantiate sentence structures is illustrated with examples of sentence complexity observed in human language performance. Similarities exist between the architecture for sentence structure and blackboard architectures for combinatorial structures in visual cognition, derived from the structure of the visual cortex. These architectures are briefly discussed, together with an example of a combinatorial structure in which the blackboard architectures for language and vision are combined. In this way, the architecture for language is grounded in perception

    Visual pathways from the perspective of cost functions and multi-task deep neural networks

    Get PDF
    Vision research has been shaped by the seminal insight that we can understand the higher-tier visual cortex from the perspective of multiple functional pathways with different goals. In this paper, we try to give a computational account of the functional organization of this system by reasoning from the perspective of multi-task deep neural networks. Machine learning has shown that tasks become easier to solve when they are decomposed into subtasks with their own cost function. We hypothesize that the visual system optimizes multiple cost functions of unrelated tasks and this causes the emergence of a ventral pathway dedicated to vision for perception, and a dorsal pathway dedicated to vision for action. To evaluate the functional organization in multi-task deep neural networks, we propose a method that measures the contribution of a unit towards each task, applying it to two networks that have been trained on either two related or two unrelated tasks, using an identical stimulus set. Results show that the network trained on the unrelated tasks shows a decreasing degree of feature representation sharing towards higher-tier layers while the network trained on related tasks uniformly shows high degree of sharing. We conjecture that the method we propose can be used to analyze the anatomical and functional organization of the visual system and beyond. We predict that the degree to which tasks are related is a good descriptor of the degree to which they share downstream cortical-units.Comment: 16 pages, 5 figure

    Subitizing with Variational Autoencoders

    Full text link
    Numerosity, the number of objects in a set, is a basic property of a given visual scene. Many animals develop the perceptual ability to subitize: the near-instantaneous identification of the numerosity in small sets of visual items. In computer vision, it has been shown that numerosity emerges as a statistical property in neural networks during unsupervised learning from simple synthetic images. In this work, we focus on more complex natural images using unsupervised hierarchical neural networks. Specifically, we show that variational autoencoders are able to spontaneously perform subitizing after training without supervision on a large amount images from the Salient Object Subitizing dataset. While our method is unable to outperform supervised convolutional networks for subitizing, we observe that the networks learn to encode numerosity as basic visual property. Moreover, we find that the learned representations are likely invariant to object area; an observation in alignment with studies on biological neural networks in cognitive neuroscience

    Adaptive Resonance Theory: Self-Organizing Networks for Stable Learning, Recognition, and Prediction

    Full text link
    Adaptive Resonance Theory (ART) is a neural theory of human and primate information processing and of adaptive pattern recognition and prediction for technology. Biological applications to attentive learning of visual recognition categories by inferotemporal cortex and hippocampal system, medial temporal amnesia, corticogeniculate synchronization, auditory streaming, speech recognition, and eye movement control are noted. ARTMAP systems for technology integrate neural networks, fuzzy logic, and expert production systems to carry out both unsupervised and supervised learning. Fast and slow learning are both stable response to large non stationary databases. Match tracking search conjointly maximizes learned compression while minimizing predictive error. Spatial and temporal evidence accumulation improve accuracy in 3-D object recognition. Other applications are noted.Office of Naval Research (N00014-95-I-0657, N00014-95-1-0409, N00014-92-J-1309, N00014-92-J4015); National Science Foundation (IRI-94-1659
    corecore