
    Accurate foreground segmentation without pre-learning

    Foreground segmentation has been widely used in many computer vision applications. However, most existing methods rely on a pre-learned motion or background model, which increases the burden on users. In this paper, we present an automatic algorithm, requiring no pre-learning, for segmenting foreground from background based on the fusion of motion, color, and contrast information. Motion information is enhanced by a novel method called support edges diffusion (SED), which is built upon a key observation: edges of the difference image of two adjacent frames appear only in moving regions in most cases. Contrasts in the background are attenuated while those in the foreground are enhanced using the gradient of the previous frame and that of the temporal difference. Experiments on many video sequences demonstrate the effectiveness and accuracy of the proposed algorithm. The segmentation results are comparable to those obtained by other state-of-the-art methods that depend on a pre-learned background or a stereo setup. © 2011 IEEE. The 6th International Conference on Image and Graphics (ICIG 2011), Hefei, Anhui, China, 12-15 August 2011. In Proceedings of the 6th ICIG, 2011, p. 331-33
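
    The key observation is easy to see in code. Below is a minimal OpenCV sketch of it, not the paper's SED method: the dilation is a crude stand-in for the diffusion step, and all thresholds and kernel sizes are assumptions.

        import cv2
        import numpy as np

        def motion_support_mask(prev_gray, curr_gray, lo=30, hi=60):
            # Absolute temporal difference of two adjacent grayscale frames.
            diff = cv2.absdiff(curr_gray, prev_gray)
            # Edges of the difference image tend to lie only inside moving
            # regions; thresholds here are illustrative, not from the paper.
            edges = cv2.Canny(diff, lo, hi)
            # Crude stand-in for the paper's diffusion: grow the edge support
            # into a connected motion region.
            kernel = np.ones((5, 5), np.uint8)
            return cv2.dilate(edges, kernel, iterations=3)

        # Usage: feed two consecutive grayscale frames from a video capture;
        # the returned uint8 mask is 255 inside the estimated moving region.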

    DepthCut: Improved Depth Edge Estimation Using Multiple Unreliable Channels

    In the context of scene understanding, a variety of methods exist to estimate different information channels from mono or stereo images, including disparity, depth, and normals. Although several advances have been reported in recent years for these tasks, the estimated information is often imprecise, particularly near depth discontinuities or creases. Studies have shown, however, that precisely such depth edges carry critical cues for the perception of shape, and play important roles in tasks like depth-based segmentation or foreground selection. Unfortunately, the currently extracted channels often carry conflicting signals, making it difficult for subsequent applications to use them effectively. In this paper, we focus on the problem of obtaining high-precision depth edges (i.e., depth contours and creases) by jointly analyzing such unreliable information channels. We propose DepthCut, a data-driven fusion of the channels using a convolutional neural network trained on a large dataset with known depth. The resulting depth edges can be used for segmentation, decomposing a scene into depth layers with relatively flat depth, or improving the accuracy of the depth estimate near depth edges by constraining its gradients to agree with these edges. Quantitatively, we compare against 15 variants of baselines and demonstrate that our depth edges result in improved segmentation performance and an improved depth estimate near depth edges compared to data-agnostic channel fusion. Qualitatively, we demonstrate that the depth edges result in superior segmentation and depth orderings.
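
    The fusion idea, stripped to its essentials, is a CNN that maps stacked unreliable channels to a per-pixel edge probability. The sketch below is only illustrative: the real DepthCut network is far deeper, and the channel count and layer sizes here are assumptions.

        import torch
        import torch.nn as nn

        class ChannelFusionNet(nn.Module):
            # Stack unreliable channels (e.g. 1 disparity + 3 normals + 3 color)
            # and predict a depth-edge probability for every pixel.
            def __init__(self, in_channels=7):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 1, 1),  # 1x1 conv to one edge logit per pixel
                )

            def forward(self, x):
                return torch.sigmoid(self.net(x))  # edge probability in [0, 1]

        # Training would use binary cross-entropy against edges derived from
        # known depth, e.g.:
        # model = ChannelFusionNet(); prob = model(torch.randn(1, 7, 240, 320))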

    Foreground Silhouette Extraction Robust to Sudden Changes of Background Appearance

    Vision-based background subtraction algorithms model the intensity variation across time to classify a pixel as foreground. Unfortunately, such algorithms are sensitive to appearance changes of the background, such as sudden changes of illumination or videos projected in the background. In this work, we propose an algorithm to extract foreground silhouettes without modeling the intensity variation across time. Using a camera pair, the stereo mismatch is processed to produce a dense disparity map within a Total Variation (TV) framework. Experimental results show that under sudden changes of background appearance, our proposed TV disparity-based extraction outperforms intensity-based algorithms as well as existing stereo-based approaches based on temporal depth variation and stereo mismatch.
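
    A rough flavor of disparity-based silhouette extraction can be given with stock OpenCV, substituting the readily available semi-global matcher for the paper's TV framework; the matcher parameters and the foreground threshold below are assumptions.

        import cv2
        import numpy as np

        def disparity_foreground(left_gray, right_gray, fg_thresh=24.0):
            # Semi-global block matching as a stand-in for TV-based disparity.
            matcher = cv2.StereoSGBM_create(minDisparity=0,
                                            numDisparities=64,  # multiple of 16
                                            blockSize=9)
            # SGBM returns fixed-point disparity scaled by 16.
            disp = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
            # Pixels nearer than the background plane form the silhouette.
            # No intensity model is involved, so a sudden change of background
            # appearance does not invalidate the mask.
            return (disp > fg_thresh).astype(np.uint8) * 255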

    Background Subtraction Based on Color and Depth Using Active Sensors

    Depth information has been used in computer vision for a wide variety of tasks. Since active range sensors are currently available at low cost, high-quality depth maps can be used as relevant input for many applications. Background subtraction and video segmentation algorithms can be improved by fusing depth and color inputs, which are complementary and allow one to solve many classic color segmentation issues. In this paper, we describe a fusion method that combines color and depth on top of an advanced color-based algorithm. The technique has been evaluated on a complete dataset recorded with a Microsoft Kinect, which enables comparison with the original method. The proposed method outperforms the others in almost every test, showing more robustness to illumination changes, shadows, reflections and camouflage. This work was supported by the projects of excellence from Junta de Andalucia MULTIVISION (TIC-3873), ITREBA (TIC-5060) and VITVIR (P11-TIC-8120), the national project ARC-VISION (TEC2010-15396), and the EU project TOMSY (FP7-270436).
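
    The complementarity argument can be sketched with a stock color subtractor and a static depth background; this is a hedged toy, not the paper's fusion algorithm, and the depth margin and union rule are assumptions.

        import cv2
        import numpy as np

        class ColorDepthSubtractor:
            def __init__(self, depth_bg, depth_margin=60.0):
                # Stock color-based subtractor; the paper builds on a more
                # advanced color algorithm.
                self.color_bg = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
                self.depth_bg = depth_bg.astype(np.float32)  # empty-scene depth, mm
                self.depth_margin = depth_margin             # assumed margin, mm

            def apply(self, bgr, depth):
                color_fg = self.color_bg.apply(bgr) == 255   # 127 = shadow, dropped
                valid = depth > 0                            # Kinect holes read 0
                depth_fg = valid & (self.depth_bg - depth.astype(np.float32)
                                    > self.depth_margin)     # nearer than background
                # Union of cues: depth catches color camouflage, while color
                # covers depth holes.
                return (color_fg | depth_fg).astype(np.uint8) * 255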

    Human tracking and segmentation supported by silhouette-based gait recognition

    Gait recognition has recently gained attention as an effective approach to identifying individuals at a distance from a camera. Most existing gait recognition algorithms assume that people have been tracked and silhouettes have been segmented successfully. Tracking and segmentation are, however, very difficult, especially for articulated objects such as human beings. Therefore, we present an integrated algorithm for tracking and segmentation supported by gait recognition. After the tracking module produces initial results consisting of bounding boxes and foreground likelihood images, the gait recognition module searches for the optimal silhouette-based gait models corresponding to the results. Then, the segmentation module segments people out using the provided gait silhouette sequence as a shape prior. Experiments on real video sequences show the effectiveness of the proposed approach.
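
    The shape-prior step can be caricatured in a few lines: blend the tracker's foreground likelihood with the matched gait silhouette and threshold. This toy, with an assumed blend weight and threshold, only illustrates how the prior biases the segmentation; the paper embeds it in a full segmentation model.

        import numpy as np

        def segment_with_gait_prior(fg_likelihood, gait_prior, w=0.5):
            # Both inputs are in [0, 1] and registered to the same bounding
            # box: fg_likelihood from the tracking module, gait_prior from
            # the matched silhouette-based gait model.
            combined = (1 - w) * fg_likelihood + w * gait_prior
            return (combined > 0.5).astype(np.uint8) * 255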

    Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies

    We describe a method to obtain a pixel-wise segmentation and pose estimation of multiple people in stereoscopic videos. This task involves challenges such as dealing with unconstrained stereoscopic video, non-stationary cameras, and complex indoor and outdoor dynamic scenes with multiple people. We cast the problem as a discrete labelling task involving multiple person labels, devise a suitable cost function, and optimize it efficiently. The contributions of our work are twofold. First, we develop a segmentation model incorporating person detections and learnt articulated pose segmentation masks, as well as colour, motion, and stereo disparity cues. The model also explicitly represents depth ordering and occlusion. Second, we introduce a stereoscopic dataset with frames extracted from the feature-length movies "StreetDance 3D" and "Pina". The dataset contains 587 annotated human poses, 1158 bounding box annotations and 686 pixel-wise segmentations of people, and is composed of indoor and outdoor scenes depicting multiple people with frequent occlusions. We demonstrate results on our new challenging dataset, as well as on the H2view dataset of Sheasby et al. (ACCV 2012).
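
    In its simplest form, the discrete labelling formulation reduces to picking the cheapest label per pixel from stacked unary costs. The sketch below shows only that skeleton; the paper's cost function adds pairwise terms, depth ordering, and occlusion reasoning on top.

        import numpy as np

        def label_pixels(unaries):
            # unaries: float array of shape (num_labels, H, W), where label 0
            # is background and labels 1..K are the detected people; each cost
            # plane would combine detection, pose-mask, colour, motion, and
            # disparity cues (all assumptions in this toy).
            return np.argmin(unaries, axis=0)

        # Usage with made-up costs for background plus two people:
        # labels = label_pixels(np.random.rand(3, 180, 320))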

    Developing interactions in augmented materiality: an enhancement method based on RGB-D segmentation

    How Augmented Reality develops the illusion of an alternative reality needs to be considered critically from both philosophical and technical points of view. Researchers have been investigating different techniques to generate enhanced experiences for users. In this article, the technological reality scenarios embodied within several Augmented Reality techniques are explored and a detailed classification scheme is proposed. Additionally, to enhance the cohesion of augmented visual content with the real scene, an Augmented Reality software system that handles the occlusion problem through segmentation with an RGB-D camera is explained and an enhancement method is discussed.
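
    The occlusion problem itself has a compact core: draw a virtual pixel only where the virtual surface is nearer to the camera than the real one. The sketch below assumes registered, hole-filled depth maps in the same metric units; it is a minimal illustration, not the article's method.

        import numpy as np

        def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
            # Virtual content exists where its depth is finite, and is visible
            # only where it is closer than the real surface, so real objects
            # correctly occlude the overlay.
            visible = np.isfinite(virt_depth) & (virt_depth < real_depth)
            out = real_rgb.copy()
            out[visible] = virt_rgb[visible]
            return out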

    Object of interest extraction in low-frame-rate image sequences and application to mobile mapping systems

    Here, we present a novel object of interest (OOI) extraction framework designed for low-frame-rate (LFR) image sequences, typically from mobile mapping systems (MMS). The proposed method integrates tracking and segmentation in a unified framework. We propose a novel object-shaped, kernel-based, scale-invariant mean shift algorithm to track the OOI through LFR sequences while maintaining temporal consistency. The well-known GrabCut approach for static image segmentation is then generalized to LFR sequences. We analyze the imaging geometry of the OOI in LFR sequences collected by the MMS and design a Kalman filter module to assist the proposed tracker. Extensive experimental results on real LFR sequences collected by the VISAT (TM) MMS demonstrate that the proposed approach is robust to challenges such as low frame rate, fast scaling, and large inter-frame displacement of the OOI. © 2012 Society of Photo-Optical Instrumentation Engineers (SPIE). [DOI: 10.1117/1.OE.51.6.067201]. National Natural Science Foundation of China [40971245
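
    The Kalman assistance is the most generic piece and can be sketched with stock OpenCV: a constant-velocity filter on the object centre predicts where the OOI reappears after a large inter-frame jump, seeding the tracker's search window. The noise covariances below are assumptions, and this is plain cv2.KalmanFilter usage, not the authors' object-shaped kernel tracker.

        import cv2
        import numpy as np

        def make_centre_kalman():
            kf = cv2.KalmanFilter(4, 2)  # state: x, y, vx, vy; measurement: x, y
            kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                            [0, 1, 0, 1],
                                            [0, 0, 1, 0],
                                            [0, 0, 0, 1]], np.float32)
            kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
            kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
            kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
            return kf

        # Per frame: kf.predict()[:2] seeds the mean shift search window; once
        # the tracker locks on at centre (cx, cy), feed the measurement back:
        # kf.correct(np.array([[cx], [cy]], np.float32))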