    Virtual Reality to Simulate Visual Tasks for Robotic Systems

    Virtual reality (VR) can be used as a tool to analyze the interactions between the visual system of a robotic agent and the environment, with the aim of designing the algorithms needed to solve the visual tasks required to behave properly in the 3D world. The novelty of our approach lies in the use of VR as a tool to simulate the behavior of vision systems. The visual system of a robot (e.g., an autonomous vehicle, an active vision system, or a driving assistance system) and its interplay with the environment can be modeled through the geometrical relationships between the virtual stereo cameras and the virtual 3D world. Unlike conventional applications, where VR is used for the perceptual rendering of visual information to a human observer, in the proposed approach a virtual world is rendered to simulate the actual projections on the cameras of a robotic system. In this way, machine vision algorithms can be quantitatively validated by using the ground truth data provided by the knowledge of both the structure of the environment and the vision system.
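    A minimal sketch of the idea described above, under simplified assumptions (fronto-parallel pinhole cameras, hypothetical baseline and focal length, not the authors' actual VR pipeline): a known 3D point of the virtual world is projected onto two virtual cameras, so the resulting disparity can serve as ground truth for validating a machine-vision algorithm.

```python
import numpy as np

def project(point_w, cam_pos, focal_px):
    """Project a world point onto a fronto-parallel pinhole camera (simplified)."""
    p = point_w - cam_pos                      # point in camera coordinates
    return focal_px * np.array([p[0] / p[2], p[1] / p[2]])

baseline = 0.065                               # assumed inter-camera distance [m]
focal_px = 800.0                               # assumed focal length [pixels]
point = np.array([0.1, 0.05, 1.5])             # known 3D point of the virtual world [m]

left  = project(point, np.array([-baseline / 2, 0.0, 0.0]), focal_px)
right = project(point, np.array([+baseline / 2, 0.0, 0.0]), focal_px)

gt_disparity = left[0] - right[0]              # ground-truth horizontal disparity [pixels]
print(gt_disparity)                            # equals focal_px * baseline / depth
```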

    Learning bio-inspired head-centric representations of 3D shapes in an active fixation setting

    When exploring the surrounding environment with the eyes, humans and primates need to interpret three-dimensional (3D) shapes in a fast and invariant way, exploiting highly variable and gaze-dependent visual information. Since they have front-facing eyes, binocular disparity is a prominent cue for depth perception. Specifically, it serves as the computational substrate for two fundamental mechanisms of binocular active vision: stereopsis and binocular coordination. To this end, disparity information, which is expressed in a retinotopic reference frame, is combined along the visual cortical pathways with gaze information and transformed into a head-centric reference frame. Despite the importance of this mechanism, the underlying neural substrates remain largely unknown. In this work, we investigate the capability of the human visual system to interpret the 3D scene by exploiting disparity and gaze information. In a psychophysical experiment, human subjects were asked to judge the depth orientation of a planar surface either while fixating a target point or while freely exploring the surface. Moreover, we used the same stimuli to train a recurrent neural network to exploit the responses of a modelled population of cortical (V1) cells to interpret the 3D scene layout. The results, both for human performance and for the model network, show that integrating disparity information across gaze directions is crucial for a reliable and invariant interpretation of the 3D geometry of the scene.
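    A back-of-the-envelope illustration (standard small-angle stereo geometry, not the paper's model) of why disparity must be combined with gaze information: the same retinal disparity maps to very different depth intervals depending on the fixation distance implied by the vergence angle. The interocular distance and angles below are assumed values.

```python
import numpy as np

IOD = 0.065                                   # assumed interocular distance [m]

def fixation_distance(vergence_rad):
    """Distance of the fixation point implied by the vergence angle (symmetric fixation)."""
    return (IOD / 2) / np.tan(vergence_rad / 2)

def depth_from_disparity(disparity_rad, vergence_rad):
    """Small-angle approximation: depth offset from fixation ≈ disparity * D^2 / IOD."""
    D = fixation_distance(vergence_rad)
    return disparity_rad * D ** 2 / IOD

disparity = np.deg2rad(0.1)                   # identical retinal disparity in both cases
print(depth_from_disparity(disparity, np.deg2rad(7.4)))   # near fixation (~0.5 m): ~7 mm
print(depth_from_disparity(disparity, np.deg2rad(1.9)))   # far fixation  (~2.0 m): ~11 cm
```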

    Self-operated stimuli improve subsequent visual motion integration

    Evidence of perceptual changes that accompany motor activity has been limited primarily to audition and somatosensation. Here we asked whether motor learning results in changes to visual motion perception. We designed a reaching task in which participants were trained to make movements along several directions, while visual feedback was provided by an intrinsically ambiguous moving stimulus directly tied to hand motion. We find that training improves coherent motion perception and that changes in movement are correlated with perceptual changes. No perceptual changes are observed with passive training, even when observers were provided with an explicit strategy to facilitate single motion perception. A Bayesian model suggests that movement training promotes the fine-tuning of the internal representation of stimulus geometry. These results emphasize the role of sensorimotor interaction in determining the persistent properties in space and time that define a percept.
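    A schematic illustration of the Bayesian idea referenced above (a generic Gaussian cue-combination sketch, not the paper's actual model): the percept is a precision-weighted combination of a noisy sensory likelihood and a prior, so sharpening the internal representation (a smaller likelihood variance, as training might produce) pulls the percept toward the true stimulus value. All numbers are hypothetical.

```python
def posterior_mean(sensory_deg, sigma_sensory, prior_deg, sigma_prior):
    """Posterior mean of two Gaussians: precision-weighted average of cue and prior."""
    w = sigma_prior ** 2 / (sigma_prior ** 2 + sigma_sensory ** 2)
    return w * sensory_deg + (1 - w) * prior_deg

true_direction = 30.0   # degrees; hypothetical stimulus motion direction
prior = 0.0             # hypothetical prior toward a default direction

# Broad likelihood (before training): the percept is strongly biased toward the prior.
print(posterior_mean(true_direction, sigma_sensory=20.0, prior_deg=prior, sigma_prior=10.0))
# Narrow likelihood (after training): the percept moves close to the true direction.
print(posterior_mean(true_direction, sigma_sensory=5.0, prior_deg=prior, sigma_prior=10.0))
```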

    Modelling Short-Latency Disparity-Vergence Eye Movements Under Dichoptic Unbalanced Stimulation

    Vergence eye movements align the optical axes of our two eyes onto an object of interest, thus facilitating the binocular summation of the images projected onto the left and right retinae into a single percept. Both the computational substrate and the functional behaviour of binocular vergence eye movements have been the topic of in-depth investigation. Here, we attempt to bring together what is known about the computation and function of the vergence mechanism. To this aim, we evaluated a biologically inspired model of horizontal and vertical vergence control, based on a network of V1 simple and complex cells. The model's performance was compared with that of human observers using dichoptic stimuli characterized by varying amounts of interocular correlation, interocular contrast, and vertical disparity. The model provides a qualitative explanation of the psychophysiological data. Nevertheless, the human vergence response to interocular contrast differs from the model's behavior, suggesting that the proposed disparity-vergence model could be improved to account for human behavior. Moreover, this observation highlights how dichoptic unbalanced stimulation can be used to investigate the significant but neglected role of sensory processing in the motor planning of eye movements in depth.
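    A minimal sketch, under assumed parameters and not the authors' implementation, of the kind of V1 circuit mentioned above: a binocular energy unit built from a quadrature pair of phase-shifted simple cells, whose population response over phase disparities signals the stimulus disparity and hence the sign and size of the vergence correction needed to null it.

```python
import numpy as np

def gabor(x, freq, phase, sigma=1.0):
    """1-D Gabor receptive field."""
    return np.exp(-x ** 2 / (2 * sigma ** 2)) * np.cos(2 * np.pi * freq * x + phase)

def complex_cell(img_left, img_right, x, freq, dphase):
    """Binocular energy: quadrature pair of phase-shifted simple cells, squared and pooled."""
    energy = 0.0
    for base_phase in (0.0, np.pi / 2):                  # quadrature pair
        fl = gabor(x, freq, base_phase)                  # left receptive field
        fr = gabor(x, freq, base_phase + dphase)         # right RF, phase-shifted
        simple = fl @ img_left + fr @ img_right          # binocular simple cell
        energy += simple ** 2                            # complex cell pooling
    return energy

# Toy dichoptic stimulus: a grating with a horizontal disparity of 0.5 between the eyes.
x = np.linspace(-4, 4, 257)
freq, disparity = 0.5, 0.5
grating = lambda s: np.cos(2 * np.pi * freq * (x - s))
img_left, img_right = grating(-disparity / 2), grating(+disparity / 2)

# Population of cells tuned to different phase disparities: the preferred phase of the
# most active cell encodes the stimulus disparity driving the vergence correction.
dphases = np.linspace(-np.pi, np.pi, 33)
responses = [complex_cell(img_left, img_right, x, freq, dp) for dp in dphases]
preferred = dphases[int(np.argmax(responses))]
print(-preferred / (2 * np.pi * freq))                   # ≈ 0.5 with this sign convention
```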

    Phase-Based Binocular Perception of Motion in Depth: Cortical-Like Operators and Analog VLSI Architectures

    We present a cortical-like strategy to obtain reliable estimates of the motion of objects in a scene toward or away from the observer (motion in depth), from local measurements of binocular parameters derived from a direct comparison of the results of monocular spatiotemporal filtering operations performed on stereo image pairs. This approach is suitable for a hardware implementation, in which such parameters can be gained via a feedforward computation (i.e., collection, comparison, and pointwise operations) on the outputs of the nodes of recurrent VLSI lattice networks performing local computations. These networks act as efficient computational structures for embedded analog filtering operations in smart vision sensors. Extensive simulations on both synthetic and real-world image sequences prove the validity of the approach, which makes it possible to gain high-level information about the 3D structure of the scene directly from sensory data, without resorting to explicit scene reconstruction.
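    An illustrative sketch, under assumed stimuli and filter parameters rather than the paper's architecture, of a phase-based estimate of motion in depth: the local phase of a quadrature (Gabor) filter response is tracked over time in each eye, and the interocular velocity difference v_L - v_R is proportional to the rate of change of disparity, i.e., to motion in depth.

```python
import numpy as np

def local_phase(signal, x, freq, sigma=1.0):
    """Local phase at x = 0 from a quadrature Gabor pair (single spatial position)."""
    env = np.exp(-x ** 2 / (2 * sigma ** 2))
    even = env * np.cos(2 * np.pi * freq * x)
    odd = env * np.sin(2 * np.pi * freq * x)
    return np.arctan2(signal @ odd, signal @ even)

x = np.linspace(-6, 6, 513)
freq, dt = 0.5, 1.0
pattern = lambda shift: np.cos(2 * np.pi * freq * (x - shift))

# Motion in depth produces opposite horizontal drifts in the two eyes (toy values).
vL, vR = +0.1, -0.1                                  # true image velocities [units/frame]
phiL = [local_phase(pattern(vL * t), x, freq) for t in (0, dt)]
phiR = [local_phase(pattern(vR * t), x, freq) for t in (0, dt)]

# With the local-phase convention used here, v = (dphi/dt) / (2*pi*f) for a drifting pattern.
est_vL = (phiL[1] - phiL[0]) / (2 * np.pi * freq * dt)
est_vR = (phiR[1] - phiR[0]) / (2 * np.pi * freq * dt)
motion_in_depth = est_vL - est_vR                    # proportional to d(disparity)/dt
print(est_vL, est_vR, motion_in_depth)
```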

    A hierarchical system for a distributed representation of the peripersonal space of a humanoid robot

    Reaching a target object in an unknown and unstructured environment is easily performed by human beings. However, designing a humanoid robot that executes the same task requires the implementation of complex abilities, such as identifying the target in the visual field, estimating its spatial location, and precisely driving the motors of the arm to reach it. While research usually tackles the development of such abilities individually, in this work we integrate a number of computational models into a unified framework, and demonstrate in a humanoid torso the feasibility of an integrated working representation of its peripersonal space. To achieve this goal, we propose a cognitive architecture that connects several models inspired by neural circuits of the visual, frontal, and posterior parietal cortices of the brain. The outcome of the integration process is a system that allows the robot to create its internal model and its representation of the surrounding space by interacting with the environment directly, through a mutual adaptation of perception and action. The robot is eventually capable of executing a set of tasks, such as recognizing, gazing at, and reaching target objects, which can work separately or cooperate in supporting more structured and effective behaviors.
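    A purely schematic sketch of the kind of modular integration described above, with invented interfaces and placeholder values rather than the robot's actual software: perception proposes a target, a spatial module turns it into a 3D location, and a reaching module drives the arm; the modules can run separately or be chained into a single behavior.

```python
from dataclasses import dataclass

@dataclass
class Target:
    label: str
    position_xyz: tuple          # head-centric coordinates [m]; hypothetical frame

class Recognizer:
    def find_target(self, image) -> str:
        return "red_ball"                         # placeholder visual recognition

class SpatialEstimator:
    def localize(self, label) -> Target:
        return Target(label, (0.3, -0.1, 0.4))    # placeholder stereo localization

class Reacher:
    def reach(self, target: Target) -> None:
        print(f"reaching toward {target.label} at {target.position_xyz}")

def reach_behavior(image):
    """Chained behavior built from the individual abilities."""
    label = Recognizer().find_target(image)
    target = SpatialEstimator().localize(label)
    Reacher().reach(target)

reach_behavior(image=None)
```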

    Vector Disparity Sensor with Vergence Control for Active Vision Systems

    This paper presents an architecture for computing vector disparity for active vision systems, as used in robotic applications. Controlling the vergence angle of a binocular system allows us to explore dynamic environments efficiently, but requires a generalization of the disparity computation with respect to a static camera setup, where the disparity is strictly 1-D after image rectification. The interaction between vision and motor control allows us to develop an active sensor that achieves high accuracy of the disparity computation around the fixation point, and a fast reaction time for the vergence control. In this contribution, we address the development of a real-time architecture for vector disparity computation using an FPGA device. We implement the disparity unit and the control module for vergence, version, and tilt to determine the fixation point. In addition, two different on-chip alternatives for the vector disparity engines are discussed, based on the luminance (gradient-based) and phase information of the binocular images. The multiscale versions of these engines are able to estimate the vector disparity at up to 32 fps on VGA-resolution images with very good accuracy, as shown using benchmark sequences with known ground truth. The performance of the presented approaches in terms of frame rate, resource utilization, and accuracy is discussed. On the basis of these results, our study indicates that the gradient-based approach offers the best trade-off for integration with the active vision system.
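    A minimal sketch, with assumed data and window parameters rather than the paper's FPGA implementation, of the gradient-based (Lucas-Kanade style) estimation behind a vector disparity engine: within a local window, the 2-D displacement between the left and right images is obtained from a linearized brightness-constancy constraint, d = argmin ||A d - b||. The scipy resampling call is only used to build the toy stereo pair.

```python
import numpy as np
from scipy.ndimage import map_coordinates        # bilinear resampling for the toy example

def vector_disparity(left, right):
    """Estimate one 2-D disparity vector (dx, dy) for a small local window."""
    gy, gx = np.gradient(right)                          # spatial gradients of right view
    b = (left - right).ravel()                           # interocular difference
    A = np.stack([gx.ravel(), gy.ravel()], axis=1)       # linearized constraint: A d ≈ b
    d, *_ = np.linalg.lstsq(A, b, rcond=None)            # least-squares solution
    return d

# Toy stereo pair: the right view is the left view with its content displaced by a
# known sub-pixel vector (the ground-truth disparity of this synthetic window).
yy, xx = np.meshgrid(np.arange(64.0), np.arange(64.0), indexing="ij")
left = np.sin(xx / 5.0) + np.cos(yy / 7.0) + 0.5 * np.sin(xx / 3.0 + yy / 4.0)
dx, dy = 0.4, 0.2
right = map_coordinates(left, [yy - dy, xx - dx], order=1, mode="nearest")

print(vector_disparity(left, right))                     # approximately [0.4, 0.2]
```

    A phase-based engine would instead derive the displacement from interocular differences of local phase (as in the sketch after the motion-in-depth abstract above), trading robustness to contrast changes for higher resource usage, which is consistent with the trade-off discussed in the abstract.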