Segmenting Foreground Objects from a Dynamic Textured Background via a Robust Kalman Filter
The algorithm presented in this paper aims to segment the foreground objects in video (e.g., people) given time-varying, textured backgrounds. Examples of time-varying backgrounds include waves on water, clouds moving, trees waving in the wind, automobile traffic, moving crowds, escalators, etc. We have developed a novel foreground-background segmentation algorithm that explicitly accounts for the non-stationary nature and clutter-like appearance of many dynamic textures. The dynamic texture is modeled by an autoregressive moving average (ARMA) model. A robust Kalman filter algorithm iteratively estimates the intrinsic appearance of the dynamic texture, as well as the regions of the foreground objects. Preliminary experiments with this method have demonstrated promising results.
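The idea of a Kalman filter that refuses to absorb outliers can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it assumes a single scalar pixel intensity with a random-walk background model instead of an ARMA dynamic texture, and the names `robust_kalman_1d`, `q`, `r`, and `gate` are hypothetical. Observations whose innovation is improbably large are flagged as foreground and excluded from the state update.

```python
def robust_kalman_1d(observations, q=0.01, r=1.0, gate=3.0):
    """Scalar Kalman filter with innovation gating (illustrative assumption,
    not the ARMA-based model described in the abstract).

    q: process-noise variance, r: measurement-noise variance,
    gate: number of standard deviations beyond which a sample is an outlier.
    Returns (background estimates, outlier flags).
    """
    x, p = observations[0], r          # initial state and variance
    estimates, outliers = [], []
    for z in observations:
        p = p + q                      # predict step for a random-walk state
        innovation = z - x
        s = p + r                      # innovation variance
        is_outlier = innovation * innovation > gate * gate * s
        if not is_outlier:             # update only with inlier measurements
            k = p / s                  # Kalman gain
            x = x + k * innovation
            p = (1 - k) * p
        estimates.append(x)
        outliers.append(is_outlier)
    return estimates, outliers


if __name__ == "__main__":
    est, flags = robust_kalman_1d([10, 10.2, 9.9, 50, 10.1])
    print(flags)
```

In this toy setting the flagged samples play the role of foreground pixels, and the running estimate plays the role of the background appearance.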
Statistical Analysis of Dynamic Actions
Real-world action recognition applications require the development of systems which are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and require as short a learning stage as possible. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including: video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents.
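As a rough illustration of a nonparametric distance between sequences, the sketch below compares feature histograms with a chi-square distance and averages the result over temporal scales. The abstract does not state which measure is used, so the chi-square choice, the histogram inputs, and the function names are all assumptions made for this example.

```python
def normalize(hist):
    """Turn raw bin counts into an empirical distribution."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    return [h / total for h in hist]


def chi_square_distance(hist_a, hist_b):
    """Chi-square distance between two histograms over the same bins."""
    p, q = normalize(hist_a), normalize(hist_b)
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def multiscale_distance(hists_a, hists_b):
    """Average the per-scale distances; each argument holds one histogram
    per temporal scale (hypothetical input layout)."""
    if len(hists_a) != len(hists_b) or not hists_a:
        raise ValueError("need the same non-zero number of scales")
    per_scale = [chi_square_distance(a, b) for a, b in zip(hists_a, hists_b)]
    return sum(per_scale) / len(per_scale)
```

Because the measure works directly on empirical distributions, it needs no action model or training stage, which matches the nonparametric spirit of the abstract.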
Periodic Motion Detection and Estimation via Space-Time Sampling
A novel technique to detect and localize periodic movements in video is presented. The distinctive feature of the technique is that it requires neither feature tracking nor object segmentation. Intensity patterns along linear sample paths in space-time are used to estimate the period of object motion in a given sequence of frames. Sample paths are obtained by connecting (in space-time) sample points from regions of high motion magnitude in the first and last frames. Oscillations in intensity values are induced at time instants when an object intersects the sample path. The locations of peaks in intensity are determined by parameters of both cyclic object motion and orientation of the sample path with respect to object motion. The information about peaks is used in a least squares framework to obtain an initial estimate of these parameters. The estimate is further refined using the full intensity profile. The best estimate for the period of cyclic object motion is obtained by looking for consensus among estimates from many sample paths. The proposed technique is evaluated with synthetic videos where ground-truth is known, and with American Sign Language videos where the goal is to detect periodic hand motions.
Funding: National Science Foundation (CNS-0202067, IIS-0308213, IIS-0329009); Office of Naval Research (N00014-03-1-0108)
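The least-squares and consensus steps can be sketched in a reduced form. The example assumes that peak times along one sample path follow t_k = t0 + k*T, which ignores the sample-path orientation parameters the abstract mentions, and it uses the median as a stand-in for the consensus step. The function names are hypothetical.

```python
from statistics import median


def fit_period(peak_times):
    """Least-squares fit of t_k = t0 + k*T to consecutive peak times.

    Returns (T, t0). Simplifying assumption: peaks are evenly indexed
    and no peak is missing.
    """
    n = len(peak_times)
    if n < 2:
        raise ValueError("need at least two peaks")
    k_mean = (n - 1) / 2
    t_mean = sum(peak_times) / n
    num = sum((k - k_mean) * (t - t_mean) for k, t in enumerate(peak_times))
    den = sum((k - k_mean) ** 2 for k in range(n))
    period = num / den
    return period, t_mean - period * k_mean


def consensus_period(paths):
    """Median of per-path period estimates (a stand-in for consensus)."""
    return median(fit_period(p)[0] for p in paths)
```

Taking the median over many sample paths keeps a few badly placed paths from dominating the final estimate.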
Real-World Repetition Estimation by Div, Grad and Curl
We consider the problem of estimating repetition in video, such as performing
push-ups, cutting a melon or playing violin. Existing work shows good results
under the assumption of static and stationary periodicity. As realistic video
is rarely perfectly static and stationary, the often preferred Fourier-based
measurements are inapt. Instead, we adopt the wavelet transform to better handle
non-static and non-stationary video dynamics. From the flow field and its
differentials, we derive three fundamental motion types and three motion
continuities of intrinsic periodicity in 3D. On top of this, the 2D perception
of 3D periodicity considers two extreme viewpoints. What follows are 18
fundamental cases of recurrent perception in 2D. In practice, to deal with the
variety of repetitive appearance, our theory implies measuring time-varying
flow and its differentials (gradient, divergence and curl) over segmented
foreground motion. For experiments, we introduce the new QUVA Repetition
dataset, reflecting reality by including non-static and non-stationary videos.
On the task of counting repetitions in video, we obtain favorable results
compared to a deep learning alternative.
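The flow differentials named above can be computed with finite differences. The sketch below is only an illustration of divergence and curl on a dense 2D flow field; the grid layout (`u[y][x]` for horizontal flow, `v[y][x]` for vertical flow) and the function name are assumptions, and the paper's foreground segmentation and wavelet analysis are not reproduced.

```python
def div_curl(u, v):
    """Central-difference divergence and curl of a 2D flow field.

    u[y][x], v[y][x]: horizontal and vertical flow components.
    Border pixels are left at 0.0 because they lack two neighbours.
    """
    h, w = len(u), len(u[0])
    div = [[0.0] * w for _ in range(h)]
    curl = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            du_dx = (u[y][x + 1] - u[y][x - 1]) / 2
            du_dy = (u[y + 1][x] - u[y - 1][x]) / 2
            dv_dx = (v[y][x + 1] - v[y][x - 1]) / 2
            dv_dy = (v[y + 1][x] - v[y - 1][x]) / 2
            div[y][x] = du_dx + dv_dy    # expansion or contraction
            curl[y][x] = dv_dx - du_dy   # rotation
    return div, curl
```

A pure rotation yields zero divergence and non-zero curl, while a pure expansion yields the opposite, which is why the two quantities separate different kinds of repetitive motion.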
Temporal task allocation in periodic environments. An approach based on synchronization
In this paper, we study a robot swarm that has to perform task allocation in an environment that features periodic properties. In this environment, tasks appear in different areas following periodic temporal patterns. The swarm has to reallocate its workforce periodically, performing a temporal task allocation that must be synchronized with the environment to be effective.
We tackle temporal task allocation using methods and concepts that we borrow from the signal processing literature. In particular, we propose a distributed temporal task allocation algorithm that synchronizes robots of the swarm with the environment and with each other. In this algorithm, robots use only local information and a simple visual communication protocol based on light blinking. Our results show that a robot swarm that uses the proposed temporal task allocation algorithm performs considerably more tasks than a swarm that uses a greedy algorithm.
An HMM-Based Framework for Video Semantic Analysis
Video semantic analysis is essential in video indexing and structuring. However, due to the lack of robust and generic algorithms, most of the existing works on semantic analysis are limited to specific domains. In this paper, we present a novel hidden Markov model (HMM)-based framework as a general solution to video semantic analysis. In the proposed framework, semantics in different granularities are mapped to a hierarchical model space, which is composed of detectors and connectors. In this manner, our model decomposes a complex analysis problem into simpler subproblems during the training process and automatically integrates those subproblems for recognition. The proposed framework is not only suitable for a broad range of applications, but also capable of modeling semantics in different semantic granularities. We also present a new motion representation scheme, which is robust to different motion vector sources. The applications of the proposed framework in basketball event detection, soccer shot classification, and volleyball sequence analysis have demonstrated its effectiveness for video semantic analysis.
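For readers unfamiliar with HMM decoding, the sketch below shows generic Viterbi decoding for a single flat HMM. It does not reproduce the hierarchical detector and connector structure described in the abstract, and the state names, observation symbols, and probabilities in the usage example are made up for illustration.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence for a flat HMM, computed in log space.

    All probabilities used must be non-zero, since their logs are taken.
    """
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in states}
    backpointers = []
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            score[s] = (prev[best] + math.log(trans_p[best][s])
                        + math.log(emit_p[s][o]))
            ptr[s] = best
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    # Hypothetical two-state model: "play" tends to show high motion.
    states = ["play", "break"]
    start = {"play": 0.5, "break": 0.5}
    trans = {"play": {"play": 0.8, "break": 0.2},
             "break": {"play": 0.2, "break": 0.8}}
    emit = {"play": {"high": 0.9, "low": 0.1},
            "break": {"high": 0.1, "low": 0.9}}
    print(viterbi(["high", "high", "low", "low"], states, start, trans, emit))
```

In a hierarchical setting, a decoder of this kind would presumably run inside each low-level model, with a higher-level model linking their outputs; that arrangement is an interpretation of the abstract, not a confirmed detail.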