11,146 research outputs found
Region based analysis of video sequences with a general merging algorithm
Connected operators [4] and Region Growing [2] algorithms have been created in different context and applications. However, they all are based on the same fundamental merging process. This paper discusses the basic issues of the merging algorithm and presents different applications ranging from simple frame segmentation to video sequence analysis.Peer ReviewedPostprint (published version
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108
Spatio-temporal Video Parsing for Abnormality Detection
Abnormality detection in video poses particular challenges due to the
infinite size of the class of all irregular objects and behaviors. Thus no (or
by far not enough) abnormal training samples are available and we need to find
abnormalities in test data without actually knowing what they are.
Nevertheless, the prevailing concept of the field is to directly search for
individual abnormal local patches or image regions independent of another. To
address this problem, we propose a method for joint detection of abnormalities
in videos by spatio-temporal video parsing. The goal of video parsing is to
find a set of indispensable normal spatio-temporal object hypotheses that
jointly explain all the foreground of a video, while, at the same time, being
supported by normal training samples. Consequently, we avoid a direct detection
of abnormalities and discover them indirectly as those hypotheses which are
needed for covering the foreground without finding an explanation for
themselves by normal samples. Abnormalities are localized by MAP inference in a
graphical model and we solve it efficiently by formulating it as a convex
optimization problem. We experimentally evaluate our approach on several
challenging benchmark sets, improving over the state-of-the-art on all standard
benchmarks both in terms of abnormality classification and localization.Comment: 15 pages, 12 figures, 3 table
Statistical Model of Shape Moments with Active Contour Evolution for Shape Detection and Segmentation
This paper describes a novel method for shape representation and robust image segmentation. The proposed method combines two well known methodologies, namely, statistical shape models and active contours implemented in level set framework. The shape detection is achieved by maximizing a posterior function that consists of a prior shape probability model and image likelihood function conditioned on shapes. The statistical shape model is built as a result of a learning process based on nonparametric probability estimation in a PCA reduced feature space formed by the Legendre moments of training silhouette images. A greedy strategy is applied to optimize the proposed cost function by iteratively evolving an implicit active contour in the image space and subsequent constrained optimization of the evolved shape in the reduced shape feature space. Experimental results presented in the paper demonstrate that the proposed method, contrary to many other active contour segmentation methods, is highly resilient to severe random and structural noise that could be present in the data
Learning the dynamics and time-recursive boundary detection of deformable objects
We propose a principled framework for recursively segmenting deformable objects across a sequence
of frames. We demonstrate the usefulness of this method on left ventricular segmentation across a cardiac
cycle. The approach involves a technique for learning the system dynamics together with methods of
particle-based smoothing as well as non-parametric belief propagation on a loopy graphical model capturing
the temporal periodicity of the heart. The dynamic system state is a low-dimensional representation
of the boundary, and the boundary estimation involves incorporating curve evolution into recursive state
estimation. By formulating the problem as one of state estimation, the segmentation at each particular
time is based not only on the data observed at that instant, but also on predictions based on past and future
boundary estimates. Although the paper focuses on left ventricle segmentation, the method generalizes
to temporally segmenting any deformable object
Moving-edge detection via heat flow analogy
In this paper, a new and automatic moving-edge detection algorithm is proposed, based on using the heat flow analogy. This algorithm starts with anisotropic heat diffusion in the spatial domain, to remove noise and sharpen region boundaries for the purpose of obtaining high quality edge data. Then, isotropic and linear heat diffusion is applied in the temporal domain to calculate the total amount of heat flow. The moving-edges are represented as the total amount of heat flow out from the reference frame. The overall process is completed by non-maxima suppression and hysteresis thresholding to obtain binary moving edges. Evaluation, on a variety of data, indicates that this approach can handle noise in the temporal domain because of the averaging inherent of isotropic heat flow. Results also show that this technique can detect moving-edges in image sequences, without background image subtraction
Multimodal music information processing and retrieval: survey and future challenges
Towards improving the performance in various music information processing
tasks, recent studies exploit different modalities able to capture diverse
aspects of music. Such modalities include audio recordings, symbolic music
scores, mid-level representations, motion, and gestural data, video recordings,
editorial or cultural tags, lyrics and album cover arts. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
application they address. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that Music Information
Retrieval and Sound and Music Computing research communities should focus in
the next years
- …