2,622 research outputs found

    Drifting perceptual patterns suggest prediction errors fusion rather than hypothesis selection: replicating the rubber-hand illusion on a robot

    Full text link
    Humans can experience fake body parts as theirs just by simple visuo-tactile synchronous stimulation. This body-illusion is accompanied by a drift in the perception of the real limb towards the fake limb, suggesting an update of body estimation resulting from stimulation. This work compares body limb drifting patterns of human participants, in a rubber hand illusion experiment, with the end-effector estimation displacement of a multisensory robotic arm enabled with predictive processing perception. Results show similar drifting patterns in both human and robot experiments, and they also suggest that the perceptual drift is due to prediction error fusion, rather than hypothesis selection. We present body inference through prediction error minimization as one single process that unites predictive coding and causal inference and that it is responsible for the effects in perception when we are subjected to intermodal sensory perturbations.Comment: Proceedings of the 2018 IEEE International Conference on Development and Learning and Epigenetic Robotic

    Sensor fusion in distributed cortical circuits

    Get PDF
    The substantial motion of the nature is to balance, to survive, and to reach perfection. The evolution in biological systems is a key signature of this quintessence. Survival cannot be achieved without understanding the surrounding world. How can a fruit fly live without searching for food, and thereby with no form of perception that guides the behavior? The nervous system of fruit fly with hundred thousand of neurons can perform very complicated tasks that are beyond the power of an advanced supercomputer. Recently developed computing machines are made by billions of transistors and they are remarkably fast in precise calculations. But these machines are unable to perform a single task that an insect is able to do by means of thousands of neurons. The complexity of information processing and data compression in a single biological neuron and neural circuits are not comparable with that of developed today in transistors and integrated circuits. On the other hand, the style of information processing in neural systems is also very different from that of employed by microprocessors which is mostly centralized. Almost all cognitive functions are generated by a combined effort of multiple brain areas. In mammals, Cortical regions are organized hierarchically, and they are reciprocally interconnected, exchanging the information from multiple senses. This hierarchy in circuit level, also preserves the sensory world within different levels of complexity and within the scope of multiple modalities. The main behavioral advantage of that is to understand the real-world through multiple sensory systems, and thereby to provide a robust and coherent form of perception. When the quality of a sensory signal drops, the brain can alternatively employ other information pathways to handle cognitive tasks, or even to calibrate the error-prone sensory node. Mammalian brain also takes a good advantage of multimodal processing in learning and development; where one sensory system helps another sensory modality to develop. Multisensory integration is considered as one of the main factors that generates consciousness in human. Although, we still do not know where exactly the information is consolidated into a single percept, and what is the underpinning neural mechanism of this process? One straightforward hypothesis suggests that the uni-sensory signals are pooled in a ploy-sensory convergence zone, which creates a unified form of perception. But it is hard to believe that there is just one single dedicated region that realizes this functionality. Using a set of realistic neuro-computational principles, I have explored theoretically how multisensory integration can be performed within a distributed hierarchical circuit. I argued that the interaction of cortical populations can be interpreted as a specific form of relation satisfaction in which the information preserved in one neural ensemble must agree with incoming signals from connected populations according to a relation function. This relation function can be seen as a coherency function which is implicitly learnt through synaptic strength. Apart from the fact that the real world is composed of multisensory attributes, the sensory signals are subject to uncertainty. This requires a cortical mechanism to incorporate the statistical parameters of the sensory world in neural circuits and to deal with the issue of inaccuracy in perception. I argued in this thesis how the intrinsic stochasticity of neural activity enables a systematic mechanism to encode probabilistic quantities within neural circuits, e.g. reliability, prior probability. The systematic benefit of neural stochasticity is well paraphrased by the problem of Duns Scotus paradox: imagine a donkey with a deterministic brain that is exposed to two identical food rewards. This may make the animal suffer and die starving because of indecision. In this thesis, I have introduced an optimal encoding framework that can describe the probability function of a Gaussian-like random variable in a pool of Poisson neurons. Thereafter a distributed neural model is proposed that can optimally combine conditional probabilities over sensory signals, in order to compute Bayesian Multisensory Causal Inference. This process is known as a complex multisensory function in the cortex. Recently it is found that this process is performed within a distributed hierarchy in sensory cortex. Our work is amongst the first successful attempts that put a mechanistic spotlight on understanding the underlying neural mechanism of Multisensory Causal Perception in the brain, and in general the theory of decentralized multisensory integration in sensory cortex. Engineering information processing concepts in the brain and developing new computing technologies have been recently growing. Neuromorphic Engineering is a new branch that undertakes this mission. In a dedicated part of this thesis, I have proposed a Neuromorphic algorithm for event-based stereoscopic fusion. This algorithm is anchored in the idea of cooperative computing that dictates the defined epipolar and temporal constraints of the stereoscopic setup, to the neural dynamics. The performance of this algorithm is tested using a pair of silicon retinas

    Sensor fusion in distributed cortical circuits

    Get PDF
    The substantial motion of the nature is to balance, to survive, and to reach perfection. The evolution in biological systems is a key signature of this quintessence. Survival cannot be achieved without understanding the surrounding world. How can a fruit fly live without searching for food, and thereby with no form of perception that guides the behavior? The nervous system of fruit fly with hundred thousand of neurons can perform very complicated tasks that are beyond the power of an advanced supercomputer. Recently developed computing machines are made by billions of transistors and they are remarkably fast in precise calculations. But these machines are unable to perform a single task that an insect is able to do by means of thousands of neurons. The complexity of information processing and data compression in a single biological neuron and neural circuits are not comparable with that of developed today in transistors and integrated circuits. On the other hand, the style of information processing in neural systems is also very different from that of employed by microprocessors which is mostly centralized. Almost all cognitive functions are generated by a combined effort of multiple brain areas. In mammals, Cortical regions are organized hierarchically, and they are reciprocally interconnected, exchanging the information from multiple senses. This hierarchy in circuit level, also preserves the sensory world within different levels of complexity and within the scope of multiple modalities. The main behavioral advantage of that is to understand the real-world through multiple sensory systems, and thereby to provide a robust and coherent form of perception. When the quality of a sensory signal drops, the brain can alternatively employ other information pathways to handle cognitive tasks, or even to calibrate the error-prone sensory node. Mammalian brain also takes a good advantage of multimodal processing in learning and development; where one sensory system helps another sensory modality to develop. Multisensory integration is considered as one of the main factors that generates consciousness in human. Although, we still do not know where exactly the information is consolidated into a single percept, and what is the underpinning neural mechanism of this process? One straightforward hypothesis suggests that the uni-sensory signals are pooled in a ploy-sensory convergence zone, which creates a unified form of perception. But it is hard to believe that there is just one single dedicated region that realizes this functionality. Using a set of realistic neuro-computational principles, I have explored theoretically how multisensory integration can be performed within a distributed hierarchical circuit. I argued that the interaction of cortical populations can be interpreted as a specific form of relation satisfaction in which the information preserved in one neural ensemble must agree with incoming signals from connected populations according to a relation function. This relation function can be seen as a coherency function which is implicitly learnt through synaptic strength. Apart from the fact that the real world is composed of multisensory attributes, the sensory signals are subject to uncertainty. This requires a cortical mechanism to incorporate the statistical parameters of the sensory world in neural circuits and to deal with the issue of inaccuracy in perception. I argued in this thesis how the intrinsic stochasticity of neural activity enables a systematic mechanism to encode probabilistic quantities within neural circuits, e.g. reliability, prior probability. The systematic benefit of neural stochasticity is well paraphrased by the problem of Duns Scotus paradox: imagine a donkey with a deterministic brain that is exposed to two identical food rewards. This may make the animal suffer and die starving because of indecision. In this thesis, I have introduced an optimal encoding framework that can describe the probability function of a Gaussian-like random variable in a pool of Poisson neurons. Thereafter a distributed neural model is proposed that can optimally combine conditional probabilities over sensory signals, in order to compute Bayesian Multisensory Causal Inference. This process is known as a complex multisensory function in the cortex. Recently it is found that this process is performed within a distributed hierarchy in sensory cortex. Our work is amongst the first successful attempts that put a mechanistic spotlight on understanding the underlying neural mechanism of Multisensory Causal Perception in the brain, and in general the theory of decentralized multisensory integration in sensory cortex. Engineering information processing concepts in the brain and developing new computing technologies have been recently growing. Neuromorphic Engineering is a new branch that undertakes this mission. In a dedicated part of this thesis, I have proposed a Neuromorphic algorithm for event-based stereoscopic fusion. This algorithm is anchored in the idea of cooperative computing that dictates the defined epipolar and temporal constraints of the stereoscopic setup, to the neural dynamics. The performance of this algorithm is tested using a pair of silicon retinas

    Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?

    Get PDF
    Self-supervised learning (SSL) is a reliable learning mechanism in which a robot enhances its perceptual capabilities. Typically, in SSL a trusted, primary sensor cue provides supervised training data to a secondary sensor cue. In this article, a theoretical analysis is performed on the fusion of the primary and secondary cue in a minimal model of SSL. A proof is provided that determines the specific conditions under which it is favorable to perform fusion. In short, it is favorable when (i) the prior on the target value is strong or (ii) the secondary cue is sufficiently accurate. The theoretical findings are validated with computational experiments. Subsequently, a real-world case study is performed to investigate if fusion in SSL is also beneficial when assumptions of the minimal model are not met. In particular, a flying robot learns to map pressure measurements to sonar height measurements and then fuses the two, resulting in better height estimation. Fusion is also beneficial in the opposite case, when pressure is the primary cue. The analysis and results are encouraging to study SSL fusion also for other robots and sensors

    Emotion Recognition from Acted and Spontaneous Speech

    Get PDF
    DizertačnĂ­ prĂĄce se zabĂœvĂĄ rozpoznĂĄnĂ­m emočnĂ­ho stavu mluvčích z ƙečovĂ©ho signĂĄlu. PrĂĄce je rozdělena do dvou hlavnĂ­ch častĂ­, prvnĂ­ část popisuju navrĆŸenĂ© metody pro rozpoznĂĄnĂ­ emočnĂ­ho stavu z hranĂœch databĂĄzĂ­. V rĂĄmci tĂ©to části jsou pƙedstaveny vĂœsledky rozpoznĂĄnĂ­ pouĆŸitĂ­m dvou rĆŻznĂœch databĂĄzĂ­ s rĆŻznĂœmi jazyky. HlavnĂ­mi pƙínosy tĂ©to části je detailnĂ­ analĂœza rozsĂĄhlĂ© ĆĄkĂĄly rĆŻznĂœch pƙíznakĆŻ zĂ­skanĂœch z ƙečovĂ©ho signĂĄlu, nĂĄvrh novĂœch klasifikačnĂ­ch architektur jako je napƙíklad „emočnĂ­ pĂĄrovĂĄní“ a nĂĄvrh novĂ© metody pro mapovĂĄnĂ­ diskrĂ©tnĂ­ch emočnĂ­ch stavĆŻ do dvou dimenzionĂĄlnĂ­ho prostoru. DruhĂĄ část se zabĂœvĂĄ rozpoznĂĄnĂ­m emočnĂ­ch stavĆŻ z databĂĄze spontĂĄnnĂ­ ƙeči, kterĂĄ byla zĂ­skĂĄna ze zĂĄznamĆŻ hovorĆŻ z reĂĄlnĂœch call center. Poznatky z analĂœzy a nĂĄvrhu metod rozpoznĂĄnĂ­ z hranĂ© ƙeči byly vyuĆŸity pro nĂĄvrh novĂ©ho systĂ©mu pro rozpoznĂĄnĂ­ sedmi spontĂĄnnĂ­ch emočnĂ­ch stavĆŻ. JĂĄdrem navrĆŸenĂ©ho pƙístupu je komplexnĂ­ klasifikačnĂ­ architektura zaloĆŸena na fĂșzi rĆŻznĂœch systĂ©mĆŻ. PrĂĄce se dĂĄle zabĂœvĂĄ vlivem emočnĂ­ho stavu mluvčího na Ășspěơnosti rozpoznĂĄnĂ­ pohlavĂ­ a nĂĄvrhem systĂ©mu pro automatickou detekci ĂșspěơnĂœch hovorĆŻ v call centrech na zĂĄkladě analĂœzy parametrĆŻ dialogu mezi ĂșčastnĂ­ky telefonnĂ­ch hovorĆŻ.Doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts; the first part describes proposed approaches for emotion recognition using two different multilingual databases of acted emotional speech. The main contributions of this part are detailed analysis of a big set of acoustic features, new classification schemes for vocal emotion recognition such as “emotion coupling” and new method for mapping discrete emotions into two-dimensional space. The second part of this thesis is devoted to emotion recognition using multilingual databases of spontaneous emotional speech, which is based on telephone records obtained from real call centers. The knowledge gained from experiments with emotion recognition from acted speech was exploited to design a new approach for classifying seven emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of speaker’s emotional state on gender recognition performance and proposes system for automatic identification of successful phone calls in call center by means of dialogue features.

    Integration of perceptal grouping and depth

    Get PDF
    International Conference on Pattern Recognition (ICPR), 2000, Barcelona (España)Different data acquisition methods are tailored at extracting particular characteristics from a scene and by combining their results a more robust scene description can be created. A method to fuse perceptual groupings extracted from color-based segmentation and depth information from stereo using supervised classification is presented. The merging of data from these two acquisition modules allows for a spatially coherent blend of smooth regions and detail in an image. Depth cues are used to limit the area of interest in the scene and to improve perceptual grouping solving subsegmentation and oversegmentation of the original images. The complexity of the algorithm does not exceed that of the individual acquisition modules. The resulting scene description can then be fed to an object recognition modules for scene interpretation.This work was supported by the project 'Active vision systems based in automatic learning for industrial applications' ().Peer Reviewe

    Cue Integration in Categorical Tasks: Insights from Audio-Visual Speech Perception

    Get PDF
    Previous cue integration studies have examined continuous perceptual dimensions (e.g., size) and have shown that human cue integration is well described by a normative model in which cues are weighted in proportion to their sensory reliability, as estimated from single-cue performance. However, this normative model may not be applicable to categorical perceptual dimensions (e.g., phonemes). In tasks defined over categorical perceptual dimensions, optimal cue weights should depend not only on the sensory variance affecting the perception of each cue but also on the environmental variance inherent in each task-relevant category. Here, we present a computational and experimental investigation of cue integration in a categorical audio-visual (articulatory) speech perception task. Our results show that human performance during audio-visual phonemic labeling is qualitatively consistent with the behavior of a Bayes-optimal observer. Specifically, we show that the participants in our task are sensitive, on a trial-by-trial basis, to the sensory uncertainty associated with the auditory and visual cues, during phonemic categorization. In addition, we show that while sensory uncertainty is a significant factor in determining cue weights, it is not the only one and participants' performance is consistent with an optimal model in which environmental, within category variability also plays a role in determining cue weights. Furthermore, we show that in our task, the sensory variability affecting the visual modality during cue-combination is not well estimated from single-cue performance, but can be estimated from multi-cue performance. The findings and computational principles described here represent a principled first step towards characterizing the mechanisms underlying human cue integration in categorical tasks
    • 

    corecore