
    A spatially distributed model for foreground segmentation

    Foreground segmentation is a fundamental first processing stage for vision systems that monitor real-world activity. In this paper we consider the problem of achieving robust segmentation in scenes where the appearance of the background varies unpredictably over time. Variations may be caused by processes such as moving water or foliage moved by wind, and typically degrade the performance of standard per-pixel background models. Our proposed approach addresses this problem by modeling homogeneous regions of scene pixels as an adaptive mixture of Gaussians in color and space. Model components are used to represent both the scene background and moving foreground objects. Newly observed pixel values are probabilistically classified, such that the spatial variance of the model components supports correct classification even when the background appearance is significantly distorted. We evaluate our method over several challenging video sequences and compare our results with both per-pixel and Markov Random Field based models. Our results show the effectiveness of our approach in reducing incorrect classifications.
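
    To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of classifying a pixel against Gaussian components defined jointly over image coordinates and color. The class names, the 5-D feature layout and the fixed Mahalanobis threshold are all assumptions for illustration.

        import numpy as np

        class SpatialColorComponent:
            """One mixture component over joint (x, y, r, g, b) space."""
            def __init__(self, mean, cov, is_background):
                self.mean = np.asarray(mean, dtype=float)
                self.cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))
                self.is_background = is_background

            def mahalanobis2(self, obs):
                d = np.asarray(obs, dtype=float) - self.mean
                return float(d @ self.cov_inv @ d)

        def classify_pixel(obs, components, thresh=9.0):
            # Pick the component that best explains the observation; the
            # spatial variance of background components lets them absorb
            # background motion such as waving foliage or rippling water.
            best = min(components, key=lambda c: c.mahalanobis2(obs))
            if best.mahalanobis2(obs) > thresh:
                return "foreground"  # no component explains the pixel well
            return "background" if best.is_background else "foreground"

    Because each component carries spatial as well as color variance, a background component can still explain a pixel whose source has shifted slightly, e.g. a leaf blown a few pixels sideways.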

    Automatic Action Annotation in Weakly Labeled Videos

    Manual spatio-temporal annotation of human actions in videos is laborious, requires several annotators and contains human biases. In this paper, we present a weakly supervised approach to automatically obtain spatio-temporal annotations of an actor in action videos. We first obtain a large number of action proposals in each video. To capture a few of the most representative action proposals in each video and avoid processing thousands of them, we rank them using optical flow and saliency in a 3D-MRF-based framework and select a few proposals using a MAP-based proposal subset selection method. We demonstrate that this ranking preserves the high-quality action proposals. Several such proposals are generated for each video of the same action. Our next challenge is to iteratively select one proposal from each video so that all proposals are globally consistent. We formulate this as a Generalized Maximum Clique Graph problem using shape, global and fine-grained similarity of proposals across the videos. The output of our method is the most action-representative proposal from each video. Our method can also annotate multiple instances of the same action in a video. We have validated our approach on three challenging action datasets: UCF Sports, sub-JHMDB and THUMOS'13, and have obtained promising results compared to several baseline methods. Moreover, on UCF Sports, we demonstrate that action classifiers trained on these automatically obtained spatio-temporal annotations have comparable performance to classifiers trained on ground-truth annotations.
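
    As a rough illustration of the ranking stage only, the sketch below scores each proposal box by the mean optical-flow magnitude and saliency it encloses and keeps the top few. The 3D-MRF formulation, MAP subset selection and Generalized Maximum Clique step from the paper are not reproduced here, and the function names and weights are assumptions.

        import numpy as np

        def proposal_score(flow_mag, saliency, box, w_motion=0.7, w_sal=0.3):
            """box = (x0, y0, x1, y1); flow_mag, saliency are HxW float arrays."""
            x0, y0, x1, y1 = box
            motion = flow_mag[y0:y1, x0:x1].mean()
            sal = saliency[y0:y1, x0:x1].mean()
            return w_motion * motion + w_sal * sal

        def rank_proposals(flow_mag, saliency, boxes, keep=10):
            # Approximates the paper's pruning: retain only the proposals
            # with the strongest combined motion and saliency evidence.
            ranked = sorted(boxes,
                            key=lambda b: proposal_score(flow_mag, saliency, b),
                            reverse=True)
            return ranked[:keep]

    The surviving proposals from each video would then feed the cross-video consistency step that selects one globally consistent proposal per video.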

    Detection and segmentation of moving objects in complex scenes

    In this paper, we address the difficult task of detecting and segmenting foreground moving objects in complex scenes. The sequences we consider exhibit highly dynamic backgrounds, illumination changes and low contrast, and may have been shot by a moving camera. The proposed method comprises three main steps. First, a set of moving points is selected within a sub-grid of image pixels, and a multi-cue descriptor is associated with each of these points. Clusters of points are then formed using a variable-bandwidth mean shift technique with automatic bandwidth selection. Finally, segmentation of the object associated with a given cluster is performed using graph cuts. Experiments and comparisons to other motion detection methods on challenging sequences demonstrate the performance of the proposed method for video analysis in complex scenes.
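
    The clustering stage can be illustrated with a basic fixed-bandwidth mean shift iteration over point descriptors. The paper's variable-bandwidth variant with automatic bandwidth selection is more involved, so treat this as a simplified sketch with assumed names and parameters.

        import numpy as np

        def mean_shift_mode(points, start, bandwidth=1.0, iters=50, tol=1e-4):
            """Shift `start` toward the nearest density mode of `points` (N x D)."""
            x = np.asarray(start, dtype=float)
            for _ in range(iters):
                d2 = ((points - x) ** 2).sum(axis=1)
                w = np.exp(-0.5 * d2 / bandwidth ** 2)  # Gaussian kernel weights
                x_new = (w[:, None] * points).sum(axis=0) / w.sum()
                if np.linalg.norm(x_new - x) < tol:
                    break
                x = x_new
            return x

    Running this from every descriptor and grouping points whose shifts converge to the same mode yields the clusters that are subsequently segmented with graph cuts.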

    Selective subtraction for handheld cameras

    Background subtraction techniques model the background of the scene using the stationarity property and classify the scene into two classes, namely foreground and background. In doing so, most moving objects become foreground indiscriminately, except in dynamic scenes (such as those with some waving tree leaves, water ripples, or a water fountain), which are typically 'learned' as part of the background using a large training set of video data. We introduce a novel concept of background as the objects other than the foreground, which may include moving objects in the scene that cannot be learned from a training set because they occur only irregularly and sporadically, e.g. a walking person. We propose a 'selective subtraction' method as an alternative to standard background subtraction, and show that a reference plane in a scene viewed by two cameras can be used as the decision boundary between foreground and background. In our definition, the foreground may actually occur behind a moving object. Furthermore, the reference plane can be selected in a very flexible manner, using, for example, the actual moving objects in the scene if needed. We extend this idea to allow multiple reference planes, resulting in multiple foregrounds or backgrounds. We present a diverse set of examples to show that: 1) the technique performs better than standard background subtraction techniques without the need for training, camera calibration, disparity map estimation, or special camera configurations; 2) it is potentially more powerful than standard methods because its flexibility makes it possible to select, in real time, what to filter out as background, regardless of whether the object is moving or not, or whether it is a rare event or a frequent one. Furthermore, we show that this technique is relatively immune to camera motion and performs well for hand-held cameras.
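
    Conceptually, the reference plane acts as a geometric decision boundary rather than a statistical one. The sketch below expresses that rule as a signed side-of-plane test on a 3D point; the paper derives the same decision directly from two camera views without explicit reconstruction, so this is a simplification and every name here is an assumption.

        import numpy as np

        def side_of_plane(point, plane_normal, plane_d):
            """Signed distance of a 3D point to the plane n.X + d = 0."""
            n = np.asarray(plane_normal, dtype=float)
            x = np.asarray(point, dtype=float)
            return float(n @ x + plane_d) / np.linalg.norm(n)

        def classify(point, plane_normal, plane_d):
            # Points on the camera side of the reference plane are labeled
            # foreground; everything beyond it, moving or not, is treated
            # as background, which is what makes the subtraction "selective".
            if side_of_plane(point, plane_normal, plane_d) < 0:
                return "foreground"
            return "background"

    With multiple reference planes, the same test applied plane by plane partitions the scene into several foreground and background layers.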