
    Semantic Instance Annotation of Street Scenes by 3D to 2D Label Transfer

    Semantic annotations are vital for training models for object recognition, semantic segmentation, or scene understanding. Unfortunately, pixelwise annotation of images at very large scale is labor-intensive and little labeled data is available, particularly at instance level and for street scenes. In this paper, we propose to tackle this problem by lifting the semantic instance labeling task from 2D into 3D. Given reconstructions from stereo or laser data, we annotate static 3D scene elements with rough bounding primitives and develop a model which transfers this information into the image domain. We leverage our method to obtain 2D labels for a novel suburban video dataset which we have collected, resulting in 400k semantic and instance image annotations. A comparison of our method to state-of-the-art label transfer baselines reveals that 3D information enables more efficient annotation while at the same time resulting in improved accuracy and time-coherent labels. Comment: 10 pages, Conference on Computer Vision and Pattern Recognition (CVPR), 2016
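    The core idea of transferring 3D primitives into the image domain can be illustrated with a minimal sketch (this is not the paper's actual transfer model, which also reasons about occlusion and consistency): project the corners of a rough 3D bounding primitive into the image and rasterize the convex silhouette as a coarse 2D instance mask. All names here (K, T_world_to_cam, box_corners) are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import ConvexHull
from matplotlib.path import Path

def project_points(X_world, K, T_world_to_cam):
    """Project Nx3 world points to Nx2 pixels with a pinhole camera (points assumed in front)."""
    X_h = np.hstack([X_world, np.ones((len(X_world), 1))])   # homogeneous coordinates
    X_cam = (T_world_to_cam @ X_h.T).T[:, :3]                # world -> camera frame (4x4 pose)
    uv = (K @ X_cam.T).T                                     # apply 3x3 intrinsics
    return uv[:, :2] / uv[:, 2:3]                            # perspective divide

def box_to_instance_mask(box_corners, K, T_world_to_cam, image_hw):
    """Binary mask covering the convex silhouette of the projected 3D box corners."""
    uv = project_points(box_corners, K, T_world_to_cam)
    silhouette = Path(uv[ConvexHull(uv).vertices])           # convex outline of the projection
    h, w = image_hw
    yy, xx = np.mgrid[0:h, 0:w]
    pixels = np.stack([xx.ravel() + 0.5, yy.ravel() + 0.5], axis=1)
    return silhouette.contains_points(pixels).reshape(h, w)
```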

    Under vehicle perception for high level safety measures using a catadioptric camera system

    In recent years, under-vehicle surveillance and vehicle classification have become indispensable tasks for security measures in certain areas such as shopping centers, government buildings, and army camps. The main challenge in this task is monitoring the underframes of the means of transportation. In this paper, we present a novel solution to achieve this aim. Our solution consists of three main parts: monitoring, detection, and classification. In the first part, we design a new catadioptric camera system in which a perspective camera points downward at a catadioptric mirror mounted on the body of a mobile robot. Thanks to the catadioptric mirror, scenes in the direction opposite the camera's optical axis can be viewed. In the second part, we use speeded-up robust features (SURF) in an object recognition algorithm. In the third part, the fast appearance-based mapping algorithm (FAB-MAP) is exploited for the classification of the means of transportation. The proposed technique is implemented in a laboratory environment.
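    As a rough sketch of the SURF-based recognition step only (the FAB-MAP classification stage is not shown), the snippet below detects and matches SURF features between two under-vehicle views. It assumes opencv-contrib-python built with the non-free modules, and the image file names are placeholders.

```python
import cv2

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

img_query = cv2.imread("underframe_query.png", cv2.IMREAD_GRAYSCALE)
img_ref = cv2.imread("underframe_reference.png", cv2.IMREAD_GRAYSCALE)

kp_q, des_q = surf.detectAndCompute(img_query, None)   # keypoints + 64-D descriptors
kp_r, des_r = surf.detectAndCompute(img_ref, None)

# Nearest-neighbour matching with Lowe's ratio test to keep distinctive matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des_q, des_r, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} SURF matches between the under-vehicle views")
```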

    Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality

    Real-time occlusion handling is a major problem in outdoor mixed reality systems because it incurs a high computational cost, mainly due to the complexity of the scene. Using only segmentation, it is difficult to accurately render a virtual object occluded by complex objects such as trees and bushes. In this paper, we propose a novel occlusion handling method for real-time, outdoor, and omni-directional mixed reality systems using only the information from a monocular image sequence. We first present a semantic segmentation scheme for predicting the amount of visibility for different types of objects in the scene. We also simultaneously calculate a foreground probability map using depth estimation derived from optical flow. Finally, we combine the segmentation result and the probability map to render the computer-generated object and the real scene using a visibility-based rendering method. Our results show great improvement in handling occlusions compared to existing blending-based methods.
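    A minimal compositing sketch of this idea (not the paper's exact formulation) follows: a per-class visibility estimate from the segmentation and a foreground probability map jointly decide, per pixel, how much of the rendered virtual object shows through the real frame. All array names and the blending rule are illustrative assumptions.

```python
import numpy as np

def blend(real, virtual, virtual_mask, class_visibility, fg_prob):
    """real, virtual: HxWx3 float images in [0, 1]; virtual_mask: HxW bool;
    class_visibility: HxW in [0, 1], how see-through the segmented class is (e.g. trees);
    fg_prob: HxW in [0, 1], probability that the real pixel lies in front of the object."""
    # A real pixel hides the virtual object in proportion to its foreground
    # probability, attenuated by how transparent its semantic class is.
    occlusion = fg_prob * (1.0 - class_visibility)
    alpha = virtual_mask * (1.0 - occlusion)          # opacity of the virtual object
    return alpha[..., None] * virtual + (1.0 - alpha[..., None]) * real
```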

    OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas.

    Recent work on depth estimation has so far focused only on projective images, ignoring 360° content, which is now increasingly and more easily produced. We show that monocular depth estimation models trained on traditional images produce sub-optimal results on omnidirectional images, showcasing the need for training directly on 360° datasets, which, however, are hard to acquire. In this work, we circumvent the challenges associated with acquiring high-quality 360° datasets with ground-truth depth annotations by re-using recently released large-scale 3D datasets and re-purposing them to 360° via rendering. This dataset, which is considerably larger than similar projective datasets, is publicly offered to the community to enable future research in this direction. We use this dataset to learn in an end-to-end fashion the task of depth estimation from 360° images. We show promising results on our synthesized data as well as on unseen realistic images.
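    The equirectangular pixel-to-ray mapping that underlies both rendering 360° views from 3D datasets and reasoning about spherical depth can be sketched as below; the axis convention used here is an assumption, not necessarily the one used by the paper.

```python
import numpy as np

def equirect_rays(height, width):
    """Return an HxWx3 array of unit ray directions for an equirectangular image."""
    v, u = np.mgrid[0:height, 0:width].astype(np.float64)
    lon = (u + 0.5) / width * 2.0 * np.pi - np.pi      # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v + 0.5) / height * np.pi     # latitude from +pi/2 down to -pi/2
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

# A spherical depth map in this parameterization stores distance along each ray,
# so a point cloud is simply: points = depth[..., None] * equirect_rays(H, W).
```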

    OmniSCV: An omnidirectional synthetic image generator for computer vision

    Omnidirectional and 360° images are becoming widespread in industry and in consumer society, causing omnidirectional computer vision to gain attention. Their wide field of view allows the gathering of a great amount of information about the environment from a single image. However, the distortion of these images requires the development of specific algorithms for their treatment and interpretation. Moreover, a large number of images is essential for correctly training learning-based computer vision algorithms. In this paper, we present a tool for generating datasets of omnidirectional images with semantic and depth information. These images are synthesized from a set of captures that are acquired in a realistic virtual environment for Unreal Engine 4 through an interface plugin. We gather a variety of well-known projection models such as equirectangular and cylindrical panoramas, different fish-eye lenses, catadioptric systems, and empiric models. Furthermore, we include in our tool photorealistic non-central-projection systems such as non-central panoramas and non-central catadioptric systems. As far as we know, this is the first reported tool for generating photorealistic non-central images in the literature. Moreover, since the omnidirectional images are made virtually, we provide pixel-wise information about semantics and depth as well as perfect knowledge of the calibration parameters of the cameras. This allows the creation of ground-truth information with pixel precision for training learning algorithms and testing 3D vision approaches. To validate the proposed tool, different computer vision algorithms are tested, such as line extraction from dioptric and catadioptric central images, 3D layout recovery and SLAM using equirectangular panoramas, and 3D reconstruction from non-central panoramas.
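    As a sketch of one of the projection models such a generator covers, the equidistant fish-eye model (r = f * theta) maps pixels to viewing rays as shown below. The focal length f, image size, and axis conventions are illustrative assumptions; the actual tool's Unreal Engine 4 plugin interface is not shown.

```python
import numpy as np

def fisheye_equidistant_rays(height, width, f):
    """Unit ray directions for an equidistant fish-eye image; f is given in pixels."""
    v, u = np.mgrid[0:height, 0:width].astype(np.float64)
    x = u - (width - 1) / 2.0                  # offset from the (assumed central) principal point
    y = v - (height - 1) / 2.0
    r = np.hypot(x, y)                         # radial distance in pixels
    theta = r / f                              # angle from the optical axis
    phi = np.arctan2(y, x)                     # azimuth around the axis
    rays = np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)
    valid = theta <= np.pi                     # pixels inside the (assumed) field of view
    return rays, valid
```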