
    Towards binocular active vision in a robot head system

    This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of a first pilot investigation that yield a maximum vergence error of 6.4 pixels, while seven of nine known objects were recognized in a highly cluttered environment. Finally, a “stepping stone” visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the field of view resulting from any individual saccade.

    Cognitive visual tracking and camera control

    Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real time, high-level information from an observed scene, and to generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, serves to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario shows the effectiveness of our approach and its potential for real applications of cognitive vision.

    Analysis of mobile laser scanning data and multi-view image reconstruction

    The combination of laser scanning (LS; active, direct 3D measurement of the object surface) and photogrammetry (high geometric and radiometric resolution) is widely applied for object reconstruction (e.g. architecture, topography, monitoring, archaeology). Usually the result is a coloured point cloud or a textured mesh. The geometry is typically generated from the laser scanning point cloud, and the radiometric information is the result of image acquisition. In recent years, alongside significant developments in static (terrestrial LS) and kinematic LS (airborne and mobile LS) hardware and software, research in computer vision and photogrammetry has led to advanced automated procedures in image orientation and image matching. These methods allow a highly automated generation of 3D geometry based solely on image data. Building on advanced feature detector techniques such as SIFT (Scale Invariant Feature Transform), very robust techniques for image orientation were established (cf. Bundler). In a subsequent step, dense multi-view stereo reconstruction algorithms allow the generation of very dense 3D point clouds that represent the scene geometry (cf. Patch-based Multi-View Stereo (PMVS2)). Within this paper the usage of mobile laser scanning (MLS) and simultaneously acquired image data for an advanced integrated scene reconstruction is studied. For the analysis, the geometry of a scene is generated by both techniques independently. Then, the paper focuses on the quality assessment of both techniques. This includes a quality analysis of the individual surface models and a comparison of the direct georeferencing of the images, using positional and orientation data of the onboard GNSS-INS system, with the indirect georeferencing of the imagery by automatic image orientation. For the practical evaluation a dataset from an archaeological monument is utilised. Based on the gained knowledge a discussion of the results is provided and a future strategy for the integration of both techniques is proposed.

    Developmental Stages of Perception and Language Acquisition in a Perceptually Grounded Robot

    The objective of this research is to develop a system for language learning, based on a minimum of pre-wired language-specific functionality, that is compatible with observations of perceptual and language capabilities in the human developmental trajectory. In the proposed system, meaning (in terms of descriptions of events and spatial relations) is extracted from video images based on detection of position, motion, physical contact and their parameters. Mapping of sentence form to meaning is performed by learning grammatical constructions that are retrieved from a construction inventory based on the constellation of closed-class items uniquely identifying the target sentence structure. The resulting system displays robust acquisition behavior that reproduces certain observations from developmental studies, with very modest “innate” language specificity.