
    Vision based obstacle detection for all-terrain robots

    Dissertation presented at the Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, for the degree of Master in Electrical and Computer Engineering. This dissertation presents a solution to the problem of obstacle detection in all-terrain environments, with particular interest for mobile robots equipped with a stereo vision sensor. Despite the advantages of vision over other kinds of sensors, such as low cost, light weight, and a reduced energy footprint, its use still presents a series of challenges. These include the difficulty of dealing with the considerable amount of generated data, and the robustness required to cope with high levels of noise. Such problems can be mitigated by making strong assumptions, such as considering the terrain in front of the robot to be planar. Although this saves considerable computation, such simplifications are not necessarily acceptable in more complex environments, where the terrain may be considerably uneven. This dissertation proposes to extend a well-known obstacle detector that relaxes the aforementioned planar-terrain assumption, rendering it more adequate for unstructured environments. The proposed extensions involve: (1) the introduction of a visual saliency mechanism to focus detection on the regions most likely to contain obstacles; (2) voting filters to reduce sensitivity to noise; and (3) the fusion of the detector with a complementary method to create a hybrid, and thus more robust, solution. Experimental results obtained with demanding all-terrain images show that, with the proposed extensions, an improvement in robustness and computational efficiency over the original algorithm is observed.
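    The voting-filter idea from this abstract can be illustrated with a minimal sketch: assuming the detector produces a per-pixel boolean obstacle map, a spatial majority vote suppresses isolated false positives. The function name and window parameters are hypothetical, not taken from the dissertation.

    ```python
    import numpy as np

    def voting_filter(obstacle_mask, window=5, min_votes=None):
        """Suppress isolated detections: a pixel stays flagged as an
        obstacle only if enough neighbours in a window x window patch
        agree (a simple spatial majority vote)."""
        if min_votes is None:
            min_votes = (window * window) // 2 + 1  # strict majority
        h, w = obstacle_mask.shape
        pad = window // 2
        padded = np.pad(obstacle_mask.astype(np.int32), pad)
        # Sum shifted copies of the mask to count votes per pixel.
        votes = np.zeros((h, w), dtype=np.int32)
        for dy in range(window):
            for dx in range(window):
                votes += padded[dy:dy + h, dx:dx + w]
        return votes >= min_votes
    ```

    A lone spurious detection gathers too few votes and is dropped, while a contiguous obstacle region survives.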

    How is Gaze Influenced by Image Transformations? Dataset and Model

    Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time-consuming and expensive. Most current studies on human attention and saliency modeling have used high-quality, stereotyped stimuli. In the real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset including fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations over transformed versus original images. Third, we use the new data over transformed images, called data augmentation transformations (DATs), to train deep saliency models. We find that label-preserving DATs with negligible impact on human gaze boost saliency prediction, whereas some other DATs that severely impact human gaze degrade performance. These label-preserving, valid augmentation transformations provide a solution for enlarging existing saliency datasets. Finally, we introduce a novel saliency model based on a generative adversarial network (dubbed GazeGAN). A modified U-Net is proposed as the generator of GazeGAN, which combines classic skip connections with a novel center-surround connection (CSC) in order to leverage multi-level features. We also propose a histogram loss based on the Alternative Chi-Square Distance (ACS HistLoss) to refine the saliency map in terms of luminance distribution. Extensive experiments and comparisons over 3 datasets indicate that GazeGAN achieves the best performance in terms of popular saliency evaluation metrics and is more robust to various perturbations. Our code and data are available at: https://github.com/CZHQuality/Sal-CFS-GAN
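    The histogram loss described above can be sketched as an alternative chi-square distance between the luminance histograms of a predicted and a ground-truth saliency map. This is a minimal NumPy illustration of that family of distances; the paper's exact formulation (bin count, normalization, differentiable implementation for training) may differ.

    ```python
    import numpy as np

    def acs_hist_loss(pred, target, bins=256, eps=1e-8):
        """Alternative chi-square distance between the luminance
        histograms of two saliency maps with values in [0, 1].
        Identical maps give 0; fully disjoint histograms give 1."""
        hp, _ = np.histogram(pred, bins=bins, range=(0.0, 1.0))
        ht, _ = np.histogram(target, bins=bins, range=(0.0, 1.0))
        # Normalize counts so each histogram sums to 1.
        hp = hp / max(hp.sum(), 1)
        ht = ht / max(ht.sum(), 1)
        return 0.5 * np.sum((hp - ht) ** 2 / (hp + ht + eps))
    ```

    In training, such a loss would be computed on soft (differentiable) histograms; the hard `np.histogram` version here only illustrates the distance itself.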

    3D scanning of cultural heritage with consumer depth cameras

    Three-dimensional reconstruction of cultural heritage objects is an expensive and time-consuming process. Recent consumer real-time depth acquisition devices, like the Microsoft Kinect, allow very fast and simple acquisition of 3D views. However, 3D scanning with such devices is a challenging task due to the limited accuracy and reliability of the acquired data. This paper introduces a 3D reconstruction pipeline suited to using consumer depth cameras as hand-held scanners for cultural heritage objects. Several new contributions have been made to achieve this result. They include an ad hoc filtering scheme that exploits a model of the error on the acquired data, and a novel algorithm for the extraction of salient points that exploits both depth and color data. The salient points are then used within a modified version of the ICP algorithm that exploits both geometric and color distances to precisely align the views, even when geometry alone is not sufficient to constrain the registration. The proposed method, although applicable to generic scenes, has been tuned to the acquisition of sculptures, and the experimental results in this setting are promising.
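    The core idea of mixing geometric and color distances in the ICP correspondence step can be sketched as follows. This is a minimal illustration, not the paper's method: the weight `lam` and the brute-force nearest-neighbour search are hypothetical choices made for clarity.

    ```python
    import numpy as np

    def nearest_with_color(src_pts, src_col, dst_pts, dst_col, lam=0.1):
        """One ICP correspondence step under a combined metric:
        Euclidean distance between points plus lam times the distance
        between their colors. The color term keeps correspondences
        constrained on flat but textured surfaces, where geometry
        alone is ambiguous."""
        idx = []
        for p, cp in zip(src_pts, src_col):
            d = (np.linalg.norm(dst_pts - p, axis=1)
                 + lam * np.linalg.norm(dst_col - cp, axis=1))
            idx.append(int(np.argmin(d)))
        return np.array(idx)
    ```

    For example, two target points at the same 3D location but with different colors are disambiguated by the color term, which is exactly the situation where geometry-only ICP drifts.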