1,619 research outputs found

    Towards binocular active vision in a robot head system

    This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of a first pilot investigation that yield a maximum vergence error of 6.4 pixels, while seven of nine known objects were recognized in a highly cluttered environment. Finally, a “stepping stone” visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the field of view resulting from any individual saccade.
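The vergence error above is reported in pixels. A minimal sketch, under our own simplifying assumption that the error is the target's summed horizontal offset from both image centres (this is not the paper's symbolic pipeline, and the pixel coordinates below are illustrative):

```python
# Illustrative sketch: vergence error in pixels as the target's combined
# offset from the image centre in the left and right views.

def vergence_error_px(left_target_x: float, right_target_x: float,
                      image_width: int = 640) -> float:
    """Pixel vergence error: target offsets from both image centres."""
    cx = image_width / 2
    return abs(left_target_x - cx) + abs(right_target_x - cx)

# A target 4 px left of centre in one view and 2.4 px right of centre in
# the other gives a 6.4 px error, the study's reported maximum.
print(round(vergence_error_px(316.0, 322.4), 1))  # -> 6.4
```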

    Alignment of Binocular-Binaural Data Using a Moving Audio-Visual Target

    Best Paper Award. In this paper we address the problem of aligning visual (V) and auditory (A) data using a sensor that is composed of a camera-pair and a microphone-pair. The original contribution of the paper is a method for aligning AV data through estimation of the 3D positions of the microphones in the visual-centred coordinate frame defined by the stereo camera-pair. We exploit the fact that these two distinct data sets are conditioned by a common set of parameters, namely the (unknown) 3D trajectory of an AV object, and derive an EM-like algorithm that alternates between the estimation of the microphone-pair position and the estimation of the AV object trajectory. The proposed algorithm has a number of built-in features: it can deal with A and V observations that are misaligned in time, it estimates the reliability of the data, it is robust to outliers in both modalities, and it has proven theoretical convergence. We report experiments with both simulated and real data.
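The alternating structure of such an algorithm can be sketched in one dimension. This toy is our own illustration, not the paper's 3D formulation: visual observations measure the trajectory directly, audio observations measure it shifted by an unknown microphone offset `m`, and the two estimates are refined in turn:

```python
# Toy 1-D alternating estimation (illustrative assumption, not the paper's
# algorithm): v_k = x_k + noise, a_k = x_k - m + noise, unknown offset m.

def align_av(v_obs, a_obs, iters=20):
    m = 0.0  # initial microphone-offset guess
    for _ in range(iters):
        # Step 1: fuse both modalities into a trajectory estimate,
        # interpreting each audio sample through the current m.
        traj = [(v + (a + m)) / 2 for v, a in zip(v_obs, a_obs)]
        # Step 2: re-estimate m from the trajectory/audio residuals.
        m = sum(x - a for x, a in zip(traj, a_obs)) / len(a_obs)
    return m, traj

v = [1.0, 2.0, 3.0, 4.0]   # visual observations of the trajectory
a = [x - 0.5 for x in v]   # audio observations shifted by true m = 0.5
m, traj = align_av(v, a)
print(round(m, 3))  # -> 0.5
```

Each pass halves the error in `m`, so the loop converges geometrically; the real algorithm does the analogous alternation over 3D microphone positions and an object trajectory.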

    Bilateral gain control: an "innate predisposition" for all sorts of things

    Empirical studies have revealed remarkable perceptual organization in neonates. Newborn behavioral distinctions have often been interpreted as implying functionally specific modular adaptations, and are widely cited as evidence supporting the nativist agenda. In this theoretical paper, we approach newborn perception and attention from an embodied, developmental perspective. At the mechanistic level, we argue that a generative mechanism based on mutual gain control between bilaterally corresponding points may underlie a number of functionally defined "innate predispositions" related to spatial-configural perception. At the computational level, bilateral gain control implements beamforming, which enables spatial-configural tuning at the front-end sampling stage. At the psychophysical level, we predict that selective attention in newborns will favor contrast energy which projects to bilaterally corresponding points on the neonate subject's sensor array. The current work extends and generalizes previous work to formalize the bilateral correlation model of newborn attention at a high level, and demonstrates in minimal agent-based simulations how bilateral gain control can enable a simple, robust and "social" attentional bias.
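A minimal sketch, under our own simplifying assumptions, of mutual gain control between bilaterally corresponding points: each sensor's response is scaled by the activity at its mirror-image counterpart, so bilaterally symmetric contrast energy is amplified relative to one-sided energy:

```python
# Illustrative mutual gain control between two mirrored sensor rows
# (our own toy formulation): each response is gated by the activity at
# the bilaterally corresponding (mirror-image) position.

def bilateral_gain(left, right):
    """Mutually gate two mirrored sensor rows; returns boosted responses."""
    mirrored = right[::-1]  # right row mirrored onto left coordinates
    out_left = [l * (1.0 + m) for l, m in zip(left, mirrored)]
    out_right = [r * (1.0 + m) for r, m in zip(right, left[::-1])]
    return out_left, out_right

# A bilaterally symmetric stimulus is boosted; a one-sided one is not.
sym_l, _ = bilateral_gain([0.0, 1.0, 0.0], [0.0, 1.0, 0.0])
asym_l, _ = bilateral_gain([0.0, 1.0, 0.0], [0.0, 0.0, 0.0])
print(sym_l[1], asym_l[1])  # -> 2.0 1.0
```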

    Beyond the Baseline: 3D Reconstruction of Tiny Objects with Single Camera Stereo Robot

    This work was supported in part by the European Commission's Horizon 2020 Framework Programme through the project REMODEL (Robotic Technologies for the Manipulation of Complex Deformable Linear Objects) under Grant 870133. Self-aware robots rely on depth sensing to interact with the surrounding environment, e.g. to pursue object grasping. Yet dealing with tiny items, which often occur in industrial robotics scenarios, may represent a challenge due to the lack of sensors yielding sufficiently accurate depth measurements. Existing active sensors fail at measuring details of small objects (< 1 cm) because of limitations in their working range, which usually starts beyond 50 cm, while off-the-shelf stereo cameras are not suited to close-range acquisition due to the need for extremely short baselines. Therefore, we propose a framework designed for accurate depth sensing and particularly amenable to reconstruction of miniature objects. By leveraging a single camera mounted in an eye-on-hand configuration and the high repeatability of a robot, we acquire multiple images and process them through a stereo algorithm revised to fully exploit multiple vantage points. Using a novel dataset addressing performance evaluation in industrial applications, our Single camera Stereo Robot (SiSteR) delivers high accuracy even when dealing with miniature objects. We will provide a public dataset and an open-source implementation of our proposal to foster further development in this field.
    De Gregorio D.; Poggi M.; Zama Ramirez P.; Palli G.; Mattoccia S.; Di Stefano L.
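A hedged sketch (the function, focal length and displacements below are our own assumptions, not SiSteR's algorithm) of the core idea behind single-camera stereo with a robot arm: the arm moves the camera by known, highly repeatable offsets, giving several virtual baselines, and each baseline yields a pinhole depth estimate f·b/d that can then be fused:

```python
# Illustrative multi-baseline depth fusion: a robot arm provides several
# known camera displacements (virtual baselines); each gives a pinhole
# depth estimate Z = f * b / d that we average.

def multi_baseline_depth(focal_px, baselines_m, disparities_px):
    """Fuse per-baseline pinhole depth estimates f*b/d by averaging."""
    estimates = [focal_px * b / d
                 for b, d in zip(baselines_m, disparities_px) if d > 0]
    return sum(estimates) / len(estimates)

# Three displacements of 1, 2 and 3 cm observing a point at 0.10 m with
# an assumed f = 500 px give disparities f*b/Z = 50, 100 and 150 px.
z = multi_baseline_depth(500.0, [0.01, 0.02, 0.03], [50.0, 100.0, 150.0])
print(round(z, 6))  # -> 0.1
```

Short baselines suit close-range, miniature objects because disparity stays within the matching range while the robot's repeatability keeps the baselines precisely known.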

    Ranging of Aircraft Using Wide-baseline Stereopsis

    The purpose of this research was to investigate the efficacy of wide-baseline stereopsis as a method of ranging aircraft, specifically as a possible sense-and-avoid solution for Unmanned Aerial Systems. Two studies were performed: the first was an experimental pilot study to examine the ability of humans to range in-flight aircraft, and the second a wide-baseline stereopsis study ranging in-flight aircraft using a 14.32-meter baseline and two 640 x 480 pixel charge-coupled device cameras. An experimental research design was used in both studies. Humans in the pilot study ranged aircraft with a mean absolute error of 50.34%. The wide-baseline stereo system ranged aircraft within 2 kilometers with a mean absolute error of 17.62%. A t-test showed a significant difference between the mean absolute errors of the humans and the wide-baseline stereo system. The results suggest that the wide-baseline system is both more consistent and more accurate than humans.
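The ranging principle can be sketched with the standard pinhole relation Z = f·B/d, where f is the focal length in pixels, B the baseline in meters and d the disparity in pixels. The focal length and disparity below are our own illustrative assumptions; only the 14.32 m baseline comes from the study:

```python
# Illustrative wide-baseline stereo ranging via Z = f * B / d.

def stereo_range(focal_px: float, baseline_m: float,
                 disparity_px: float) -> float:
    """Range to the target (meters) from pixel disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# With the 14.32 m baseline and an assumed 800 px focal length, a target
# at 2 km subtends a disparity of f*B/Z = 800 * 14.32 / 2000 ~ 5.7 px;
# the wide baseline is what keeps this disparity measurable at all.
print(round(stereo_range(800.0, 14.32, 5.728), 1))  # -> 2000.0
```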

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
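A minimal sketch (our own toy, not from the survey) of the event representation described above: each event is a tuple (t, x, y, polarity), and one common processing step is accumulating signed polarities per pixel over a time window to form a frame-like image:

```python
# Illustrative event accumulation: sum signed polarities per pixel over
# a time window [t0, t1). Events are (timestamp_s, x, y, polarity).

def accumulate_events(events, width, height, t0, t1):
    """Sum event polarities (+1/-1) per pixel for t0 <= t < t1."""
    frame = [[0] * width for _ in range(height)]
    for t, x, y, p in events:
        if t0 <= t < t1:
            frame[y][x] += 1 if p > 0 else -1
    return frame

events = [(0.001, 2, 1, +1), (0.002, 2, 1, +1),
          (0.003, 0, 0, -1), (0.020, 2, 1, +1)]  # last one is outside the window
frame = accumulate_events(events, width=4, height=3, t0=0.0, t1=0.01)
print(frame[1][2], frame[0][0])  # -> 2 -1
```

The microsecond timestamps are what give event cameras their temporal resolution; the accumulation window trades that resolution for a denser, frame-like signal.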