18,751 research outputs found
GrabCut-Based Human Segmentation in Video Sequences
In this paper, we present a fully automatic Spatio-Temporal GrabCut human segmentation methodology that combines tracking and segmentation. GrabCut initialization is performed by a HOG-based subject detection, face detection, and skin color model. Spatial information is included by Mean Shift clustering, whereas temporal coherence is considered through the history of Gaussian Mixture Models. Moreover, full face and pose recovery is obtained by combining human segmentation with Active Appearance Models and Conditional Random Fields. Results over public datasets and on a new Human Limb dataset show a robust segmentation and recovery of both face and pose using the presented methodology.
A spatially distributed model for foreground segmentation
Foreground segmentation is a fundamental first processing stage for vision systems which monitor real-world activity. In this paper we consider the problem of achieving robust segmentation in scenes where the appearance of the background varies unpredictably over time. Variations may be caused by processes such as moving water, or foliage moved by wind, and typically degrade the performance of standard per-pixel background models.
Our proposed approach addresses this problem by modeling homogeneous regions of scene pixels as an adaptive mixture of Gaussians in color and space. Model components are used to represent both the scene background and moving foreground objects. Newly observed pixel values are probabilistically classified, such that the spatial variance of the model components supports correct classification even when the background appearance is significantly distorted. We evaluate our method over several challenging video sequences, and compare our results with both per-pixel and Markov Random Field based models. Our results show the effectiveness of our approach in reducing incorrect classifications.
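The classification step described in this abstract can be illustrated with a minimal sketch. It assumes each mixture component is a Gaussian over a joint feature (x, y, r, g, b) with a diagonal covariance, and that a pixel takes the label of the component with the highest weighted likelihood. The abstract does not give the covariance structure or the adaptive update rule, so the diagonal form, the `Component` class, and the `classify` helper are hypothetical, and the update step is omitted.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Component:
    """One Gaussian in joint space-color (hypothetical structure, not from the paper)."""
    label: str             # "background" or "foreground"
    weight: float          # mixture weight
    mean: Sequence[float]  # (x, y, r, g, b)
    var: Sequence[float]   # diagonal variances, same order as mean (assumption)


def log_score(component: Component, feature: Sequence[float]) -> float:
    """Log of weight times the diagonal-Gaussian density at the feature."""
    total = math.log(component.weight)
    for value, mu, s2 in zip(feature, component.mean, component.var):
        total += -0.5 * math.log(2.0 * math.pi * s2) - 0.5 * (value - mu) ** 2 / s2
    return total


def classify(components: List[Component], feature: Sequence[float]) -> str:
    """Return the label of the component with the highest weighted likelihood."""
    best = max(components, key=lambda c: log_score(c, feature))
    return best.label


# Illustrative values only: a broad background region and a compact foreground object.
water = Component("background", 0.7, (50, 50, 40, 80, 160), (400, 400, 900, 900, 900))
person = Component("foreground", 0.3, (52, 48, 200, 60, 50), (25, 25, 400, 400, 400))
model = [water, person]
```

Because the background component has a large spatial and color variance in this example, a pixel whose color is noticeably shifted from the background mean can still be assigned to the background, which is the behavior the abstract attributes to the spatial variance of the components.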
Lip segmentation using adaptive color space training
In audio-visual speech recognition (AVSR), it is beneficial
to use lip boundary information in addition to texture-dependent
features. In this paper, we propose an automatic lip segmentation
method that can be used in AVSR systems. The algorithm
consists of the following steps: face detection, lip corner extraction,
adaptive color space training for lip and non-lip regions
using Gaussian mixture models (GMMs), and curve evolution
using a level-set formulation based on region and image
gradient fields. Region-based fields are obtained using adapted
GMM likelihoods. We have tested the proposed algorithm on a
database (SU-TAV) of 100 facial images and obtained objective
performance results by comparing automatic lip segmentations
with hand-marked ground truth segmentations. Experimental
results are promising, but much work remains to be done to improve
the robustness of the proposed method.
Tracking-Based Non-Parametric Background-Foreground Classification in a Chromaticity-Gradient Space
This work presents a novel background-foreground classification technique based on adaptive non-parametric kernel estimation in a color-gradient space of components. By combining normalized color components with their gradients, shadows are efficiently suppressed from the results, while the luminance information in the moving objects is preserved. Moreover, a fast multi-region iterative tracking strategy applied over previously detected foreground regions makes it possible to build a robust foreground model which, combined with the background model, noticeably improves the quality of the detections. The proposed strategy has been applied to different kinds of sequences, obtaining satisfactory results in complex situations such as those given by dynamic backgrounds, illumination changes, shadows and multiple moving objects.
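A small sketch can show why normalized color components help suppress shadows. It assumes the common chromaticity normalization r = R/(R+G+B), g = G/(R+G+B) and a Gaussian kernel density estimate over past background samples. The abstract does not specify the exact normalization, kernel, or bandwidth, and the gradient components are left out here, so the functions `chromaticity` and `kde_density` and the bandwidth value are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def chromaticity(rgb: Sequence[float]) -> Tuple[float, float]:
    """Normalized color (r, g); scaling all channels equally leaves it unchanged."""
    r, g, b = rgb
    total = r + g + b
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # arbitrary neutral value for black pixels
    return (r / total, g / total)


def kde_density(sample: Sequence[float],
                history: List[Sequence[float]],
                bandwidth: float = 0.02) -> float:
    """Gaussian kernel density estimate of `sample` given past background samples."""
    dim = len(sample)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for past in history:
        sq_dist = sum((a - b) ** 2 for a, b in zip(sample, past))
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2) / norm
    return total / len(history)


# Illustrative background history for one pixel (values are made up).
background_history = [chromaticity(c) for c in [(120, 90, 60), (124, 92, 61), (118, 89, 60)]]
shadowed = chromaticity((60, 45, 30))     # same surface at half the brightness
red_object = chromaticity((200, 40, 40))  # a differently colored moving object
```

In this example the shadowed pixel keeps a high background density because halving the brightness does not change its chromaticity, while the differently colored object gets a density close to zero.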
Event-Based Motion Segmentation by Motion Compensation
In contrast to traditional cameras, whose pixels have a common exposure time,
event-based cameras are novel bio-inspired sensors whose pixels work
independently and asynchronously output intensity changes (called "events"),
with microsecond resolution. Since events are caused by the apparent motion of
objects, event-based cameras sample visual information based on the scene
dynamics and are, therefore, a more natural fit than traditional cameras to
acquire motion, especially at high speeds, where traditional cameras suffer
from motion blur. However, distinguishing between events caused by different
moving objects and by the camera's ego-motion is a challenging task. We present
the first per-event segmentation method for splitting a scene into
independently moving objects. Our method jointly estimates the event-object
associations (i.e., segmentation) and the motion parameters of the objects (or
the background) by maximization of an objective function, which builds upon
recent results on event-based motion-compensation. We provide a thorough
evaluation of our method on a public dataset, outperforming the
state-of-the-art by as much as 10%. We also show the first quantitative
evaluation of a segmentation algorithm for event cameras, yielding around 90%
accuracy at 4 pixels relative displacement.
Comment: When viewed in Acrobat Reader, several of the figures animate. Video: https://youtu.be/0q6ap_OSBA
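The idea of motion compensation mentioned in this abstract can be sketched for a single moving object. The sketch assumes events are (x, y, t) triples, a constant image-plane velocity, and an objective equal to the variance of the image of warped events, which is one objective used in the motion-compensation literature. The abstract does not state the objective or the motion model, and the joint estimation of event-object associations is not shown, so `warp`, `contrast`, and the candidate search are hypothetical simplifications.

```python
from collections import Counter
from typing import List, Tuple

Event = Tuple[float, float, float]  # (x, y, t); polarity is ignored in this sketch


def warp(events: List[Event], velocity: Tuple[float, float],
         t_ref: float = 0.0) -> List[Tuple[float, float]]:
    """Move each event back along the candidate velocity to the reference time."""
    vx, vy = velocity
    return [(x - vx * (t - t_ref), y - vy * (t - t_ref)) for x, y, t in events]


def contrast(events: List[Event], velocity: Tuple[float, float],
             width: int, height: int) -> float:
    """Variance of the image of warped events (higher means sharper alignment)."""
    counts: Counter = Counter()
    for x, y in warp(events, velocity):
        px, py = round(x), round(y)
        if 0 <= px < width and 0 <= py < height:
            counts[(px, py)] += 1
    n_pixels = width * height
    mean = sum(counts.values()) / n_pixels
    # Pixels without events contribute (0 - mean)^2 each.
    sq_sum = sum((c - mean) ** 2 for c in counts.values())
    sq_sum += (n_pixels - len(counts)) * mean ** 2
    return sq_sum / n_pixels


# Synthetic events from one point moving at 10 px/s along x (made-up data).
events = [(5.0 + 10.0 * (k / 10.0), 8.0, k / 10.0) for k in range(11)]
candidates = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (15.0, 0.0)]
best_velocity = max(candidates, key=lambda v: contrast(events, v, 20, 20))
```

With the correct velocity all synthetic events collapse onto one pixel, so the warped image is sharpest there. A segmentation method along these lines would additionally assign each event to the motion model under which it aligns best.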