9 research outputs found

    Adaptive saccade controller inspired by the primates' cerebellum

    Saccades are fast eye movements that allow humans and robots to bring the visual target to the center of the visual field. Saccades are open loop with respect to the vision system, so their execution requires precise knowledge of the internal model of the oculomotor system. In this work, we modeled saccade control, taking inspiration from the recurrent loops between the cerebellum and the brainstem. In this model, the brainstem acts as a fixed inverse model of the oculomotor system, while the cerebellum acts as an adaptive element that learns the internal model of the oculomotor system. The adaptive filter is implemented using a state-of-the-art neural network called I-SSGPR. The proposed approach, the recurrent architecture, was validated through experiments performed both in simulation and on an anthropomorphic robotic head. Moreover, we compared the recurrent architecture with another model of the cerebellum, feedback error learning. The results show that the recurrent architecture outperforms feedback error learning in terms of accuracy and insensitivity to the choice of the feedback controller.

    Learning the visual–oculomotor transformation: effects on saccade control and space representation

    Active eye movements can be exploited to build a visuomotor representation of the surrounding environment. Maintaining and improving such a representation requires updating the internal model involved in the generation of eye movements. From this perspective, action and perception are tightly coupled and interdependent. In this work, we encoded the internal model for oculomotor control with an adaptive filter inspired by the functionality of the cerebellum. Recurrent loops between a feedback controller and the internal model allow our system to perform accurate binocular saccades and create an implicit representation of the nearby space. Simulation results show that this recurrent architecture outperforms classical feedback-error-learning in terms of both accuracy and sensitivity to system parameters. The proposed approach was validated by implementing the framework on an anthropomorphic robotic head.

    A hierarchical system for a distributed representation of the peripersonal space of a humanoid robot

    Reaching a target object in an unknown and unstructured environment is easily performed by human beings. However, designing a humanoid robot that executes the same task requires the implementation of complex abilities, such as identifying the target in the visual field, estimating its spatial location, and precisely driving the motors of the arm to reach it. While research usually tackles the development of such abilities individually, in this work we integrate a number of computational models into a unified framework, and demonstrate on a humanoid torso the feasibility of an integrated working representation of its peripersonal space. To achieve this goal, we propose a cognitive architecture that connects several models inspired by neural circuits of the visual, frontal and posterior parietal cortices of the brain. The outcome of the integration process is a system that allows the robot to create its internal model and its representation of the surrounding space by interacting with the environment directly, through a mutual adaptation of perception and action. The robot is eventually capable of executing a set of tasks, such as recognizing, gazing at and reaching target objects, which can work separately or cooperate to support more structured and effective behaviors.

    Learning Saccadic Gaze Control via Motion Prediction

    No full text

    Gaze control for visually guided manipulation

    Human studies have shown that gaze shifts are mostly driven by the task. One explanation is that fixations gather information about task-relevant properties, where task relevance is signalled by reward. This thesis primarily pursues an engineering science goal: to determine what mechanisms a rational decision maker could employ to select a gaze location optimally, or near optimally, given limited information and limited computation time. To do so we formulate and characterise three computational models of gaze shifting (implemented on a simulated humanoid robot), which use lookahead to imagine the informational effects of possible gaze fixations. Our first model selects the gaze that most reduces uncertainty in the scene (Unc), the second maximises expected rewards by reducing uncertainty (Rew+Unc), and the third maximises the expected gain in cumulative reward by reducing uncertainty (Rew+Unc+Gain). We also present an integrated account of a visual search process within the Rew+Unc+Gain gaze scheme. Our secondary goal concerns the way in which humans might select the next gaze location. We compare the hand-eye coordination timings of our models to previously published human data, and we provide evidence that only the models that incorporate both uncertainty and reward (Rew+Unc and Rew+Unc+Gain) match human data.

    A hierarchical active binocular robot vision architecture for scene exploration and object appearance learning

    This thesis presents an investigation of a computational model of hierarchical visual behaviours within an active binocular robot vision architecture. The robot vision system is able to localise multiple instances of the same object class, while simultaneously maintaining vergence and directing its gaze to attend to and recognise objects within cluttered, complex scenes. This is achieved by implementing all image analysis in an egocentric symbolic space without creating explicit pixel-space maps and without the need for calibration or other knowledge of the camera geometry. An important aspect of the active binocular vision paradigm is that visual features in both camera eyes must be bound together in order to drive visual search to saccade to, locate and recognise putative objects or salient locations in the robot's field of view. The system structure is based on the “attentional spotlight” metaphor of biological systems and a collection of abstract and reactive visual behaviours arranged in a hierarchical structure. Several studies have shown that the human brain represents and learns objects for recognition by snapshots of 2-dimensional views of the imaged scene that happens to contain the object of interest during active interaction with (exploration of) the environment. Likewise, psychophysical findings indicate that the primate visual cortex represents common everyday objects by a hierarchical structure of their parts or sub-features and, consequently, recognises them by simple but imperfect 2D view approximations of object parts. This thesis incorporates the above observations into an active visual learning behaviour in the hierarchical active binocular robot vision architecture. By actively exploring the object viewing sphere (as higher mammals do), the robot vision system automatically synthesises and creates its own part-based object representation from multiple observations while a human teacher indicates the object and supplies a classification name. It is proposed to adopt the computational concepts of a visual learning exploration mechanism that controls the accumulation of visual evidence and directs attention towards the spatially salient object parts. The behavioural structure of the binocular robot vision architecture is loosely modelled on WHAT and WHERE visual streams. The WHERE stream maintains and binds spatial attention on the object part coordinates that egocentrically characterise the location of the object of interest and extracts spatio-temporal properties of feature coordinates and descriptors. The WHAT stream either determines the identity of an object or triggers a learning behaviour that stores view-invariant feature descriptions of the object part. Therefore, the robot vision system is capable of performing a collection of different specific visual tasks such as vergence, detection, discrimination, recognition, localisation and multiple same-instance identification. This classification of tasks enables the robot vision system to execute and fulfil specified high-level tasks, e.g. autonomous scene exploration and active object appearance learning.