
    A Rao-Blackwellized Particle Filter for EigenTracking

    ©2004 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Presented at the 2004 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27 June-2 July 2004, Washington, D.C. DOI: 10.1109/CVPR.2004.1315271
    Subspace representations have been a popular way to model appearance in computer vision. In Jepson and Black’s influential paper on EigenTracking, they were successfully applied in tracking. For noisy targets, optimization-based algorithms (including EigenTracking) often fail catastrophically after losing track. Particle filters have recently emerged as a robust method for tracking in the presence of multi-modal distributions. However, when subspace representations are used in a particle filter, the number of samples required increases exponentially, because the state vector includes the subspace coefficients. We introduce an efficient method for using subspace representations in a particle filter by applying Rao-Blackwellization to integrate out the subspace coefficients in the state vector. Fewer samples are needed since part of the posterior over the state vector is analytically calculated. We use probabilistic principal component analysis to obtain analytically tractable integrals. We show experimental results in a scenario in which we track a target in clutter.

    A Rao-Blackwellized Mixed State Particle Filter for Head Pose Tracking

    This paper presents a Rao-Blackwellized mixed state particle filter for joint head tracking and pose estimation. Rao-Blackwellizing a particle filter consists of marginalizing some of the variables of the state space in order to exactly compute their posterior probability density function. Marginalizing variables reduces the dimension of the configuration space, which makes the particle filter more efficient and lowers the number of particles it requires. Experiments were conducted on our head pose ground truth video database consisting of people engaged in meeting discussions. Results from these experiments demonstrated benefits of the Rao-Blackwellized particle filter model with fewer particles over the mixed state particle filter model.
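The variance reduction that motivates Rao-Blackwellization can be illustrated with a toy Monte Carlo estimate. The sketch below is not a particle filter and is not the method of the paper above; the model (X ~ N(0, 1), Y | X ~ N(X, 1)), the target quantity E[Y^2] = 2, and the function name `compare_estimators` are all assumptions chosen for illustration. The plain estimator samples both variables, while the Rao-Blackwellized one samples only X and uses the closed form E[Y^2 | X = x] = x^2 + 1.

```python
import random
import statistics


def compare_estimators(n=20000, seed=0):
    """Estimate E[Y^2] with and without marginalizing Y analytically.

    Toy model (an assumption for illustration): X ~ N(0, 1), Y | X ~ N(X, 1).
    Returns (plain_mean, plain_var, rb_mean, rb_var), where the variances
    are per-sample variances of the two estimators' terms.
    """
    rng = random.Random(seed)
    plain_terms = []
    rb_terms = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(x, 1.0)
        # Plain Monte Carlo: sample both X and Y, then evaluate Y^2.
        plain_terms.append(y * y)
        # Rao-Blackwellized: sample X only and integrate Y out in closed
        # form, E[Y^2 | X = x] = Var(Y | x) + E[Y | x]^2 = 1 + x^2.
        rb_terms.append(x * x + 1.0)
    return (
        statistics.fmean(plain_terms),
        statistics.variance(plain_terms),
        statistics.fmean(rb_terms),
        statistics.variance(rb_terms),
    )


plain_mean, plain_var, rb_mean, rb_var = compare_estimators()
print(f"plain: mean={plain_mean:.3f} per-sample variance={plain_var:.3f}")
print(f"rao-blackwellized: mean={rb_mean:.3f} per-sample variance={rb_var:.3f}")
```

Both estimators target the same value, but the marginalized one has the smaller per-sample variance, so it needs fewer samples for a given accuracy. This is the same trade that lets a Rao-Blackwellized particle filter use fewer particles.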

    Substructure and Boundary Modeling for Continuous Action Recognition

    This paper introduces a probabilistic graphical model for continuous action recognition with two novel components: a substructure transition model and a discriminative boundary model. The first component encodes the sparse and global temporal transition prior between action primitives in a state-space model to handle the large spatial-temporal variations within an action class. The second component enforces the action duration constraint in a discriminative way to locate the transition boundaries between actions more accurately. The two components are integrated into a unified graphical structure to enable effective training and inference. Our comprehensive experimental results on both public and in-house datasets show that, with the capability to incorporate additional information that had not been explicitly or efficiently modeled by previous methods, our proposed algorithm achieved significantly improved performance for continuous action recognition. Comment: Detailed version of the CVPR 2012 paper. 15 pages, 6 figures.

    Globally-Coordinated Locally-Linear Modeling of Multi-Dimensional Data

    This thesis considers the problem of modeling and analysis of continuous, locally-linear, multi-dimensional spatio-temporal data. Our work extends the previously reported theoretical work on the global coordination model to temporal analysis of continuous, multi-dimensional data. We have developed algorithms for time-varying data analysis and used them in full-scale, real-world applications. The applications demonstrated in this thesis include tracking, synthesis, recognition, and retrieval of dynamic objects based on their shape, appearance and motion. The approach proposed in this thesis has advantages over existing approaches to analyzing complex spatio-temporal data. Experiments show that the new modeling features of our approach improve the performance of existing approaches in many applications. In object tracking, our approach is the first one to track nonlinear appearance variations by using a low-dimensional representation of the appearance change in globally-coordinated linear subspaces. In dynamic texture synthesis, we are able to model non-stationary dynamic textures, which cannot be handled by any of the existing approaches. In human motion synthesis, we show that realistic synthesis can be performed without using specific transition points, or key frames.

    Efficient illumination independent appearance-based face tracking

    One of the major challenges that visual tracking algorithms face nowadays is being able to cope with changes in the appearance of the target during tracking. Linear subspace models have been extensively studied and are possibly the most popular way of modelling target appearance. We introduce a linear subspace representation in which the appearance of a face is represented by the addition of two approximately independent linear subspaces modelling facial expressions and illumination respectively. This model is more compact than previous bilinear or multilinear approaches. The independence assumption notably simplifies system training. We only require two image sequences. One facial expression is subject to all possible illuminations in one sequence and the face adopts all facial expressions under one particular illumination in the other. This simple model enables us to train the system with no manual intervention. We also revisit the problem of efficiently fitting a linear subspace-based model to a target image and introduce an additive procedure for solving this problem. We prove that Matthews and Baker’s Inverse Compositional Approach makes a smoothness assumption on the subspace basis that is equivalent to Hager and Belhumeur’s, which worsens convergence. Our approach differs from Hager and Belhumeur’s additive and Matthews and Baker’s compositional approaches in that we make no smoothness assumptions on the subspace basis. In the experiments conducted we show that the model introduced accurately represents the appearance variations caused by illumination changes and facial expressions. We also verify experimentally that our fitting procedure is more accurate and has better convergence rate than the other related approaches, albeit at the expense of a slight increase in computational cost. Our approach can be used for tracking a human face at standard video frame rates on an average personal computer.

    Dependent multiple cue integration for robust tracking

    We propose a new technique for fusing multiple cues to robustly segment an object from its background in video sequences that suffer from abrupt changes of both illumination and position of the target. Robustness is achieved by the integration of appearance and geometric object features and by their estimation using Bayesian filters, such as Kalman or particle filters. In particular, each filter estimates the state of a specific object feature, conditionally dependent on another feature estimated by a distinct filter. This dependence provides improved target representations, permitting us to segment it out from the background even in nonstationary sequences. Considering that the procedure of the Bayesian filters may be described by a "hypotheses generation-hypotheses correction" strategy, the major novelty of our methodology compared to previous approaches is that the mutual dependence between filters is considered during the feature observation, that is, into the "hypotheses-correction" stage, instead of considering it when generating the hypotheses. This proves to be much more effective in terms of accuracy and reliability. The proposed method is analytically justified and applied to develop a robust tracking system that adapts online and simultaneously the color space where the image points are represented, the color distributions, the contour of the object, and its bounding box. Results with synthetic data and real video sequences demonstrate the robustness and versatility of our method. Peer Reviewed.

    Parameterized Duration Modeling for Switching Linear Dynamic Systems

    ©2006 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Presented at the 2006 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 17-22 June 2006, New York, NY. DOI: 10.1109/CVPR.2006.218
    We introduce an extension of switching linear dynamic systems (SLDS) with parameterized duration modeling capabilities. The proposed model allows arbitrary duration models and overcomes the limitation of a geometric distribution induced in standard SLDSs. By incorporating a duration model which reflects the data more closely, the resulting model provides reliable inference results which are robust against observation noise. Moreover, existing inference algorithms for SLDSs can be adopted with only modest additional effort in most cases where an SLDS model can be applied. In addition, we observe that duration models can vary across data sequences in certain domains, which complicates learning and inference tasks. Such variability in duration is overcome by introducing parameterized duration models. The experimental results on honeybee dance decoding tasks demonstrate the robust inference capabilities of the proposed model.
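The geometric duration limitation mentioned in the abstract above is a general property of Markov switching: if a discrete state persists with a fixed self-transition probability a at every step, the time spent in it is geometrically distributed, P(d) = a^(d-1) (1 - a), with mean 1 / (1 - a). The sketch below only illustrates that property; the function name `geometric_duration_pmf` and the value a = 0.9 are assumptions, and nothing here reproduces the paper's parameterized duration model.

```python
def geometric_duration_pmf(self_transition, max_duration):
    """Return P(duration = d) for d = 1..max_duration.

    Assumes a first-order Markov switching state that stays put with
    probability `self_transition` at every step (hypothetical helper,
    not taken from the paper).
    """
    a = self_transition
    return [(a ** (d - 1)) * (1.0 - a) for d in range(1, max_duration + 1)]


# Example with an assumed self-transition probability of 0.9.
pmf = geometric_duration_pmf(0.9, 500)
total_mass = sum(pmf)
mean_duration = sum(d * p for d, p in enumerate(pmf, start=1))
most_likely_duration = pmf.index(max(pmf)) + 1

print(f"total probability mass: {total_mass:.6f}")
print(f"mean duration: {mean_duration:.3f}")
print(f"most likely duration: {most_likely_duration}")
```

Whatever value a takes, the most likely duration is always a single step and the probabilities decay monotonically, which is a poor fit for segments that typically last some characteristic length. That mismatch is the motivation the abstract gives for allowing arbitrary duration models.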

    A comparison of data association techniques for Simultaneous Localization and Mapping

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2005. Includes bibliographical references (p. 119-124).
    The problem of Simultaneous Localization and Mapping (SLAM) has received a great deal of attention within the robotics literature, and the importance of the solutions to this problem has been well documented for successful operation of autonomous agents in a number of environments. Of the numerous solutions that have been developed for solving the SLAM problem, many of the most successful approaches continue to either rely on, or stem from, the Extended Kalman Filter (EKF) method. However, the new algorithm FastSLAM has attracted attention for many properties not found in EKF-based methods. One such property is the ability to deal with unknown data association and its robustness to data association errors. The problem of data association has also received a great deal of attention in the robotics literature in recent years, and various solutions have been proposed. In an effort both to compare the performance of the EKF and FastSLAM under ambiguous data association situations and to compare the performance of three different data association methods, a comprehensive study of various SLAM filter-data association combinations is performed. This study will consist of pairing the EKF and FastSLAM filtering approaches with the Joint Compatibility, Sequential Compatibility Nearest Neighbor, and Joint Maximum Likelihood data association methods. The comparison will be based on both contrived simulations and application to the publicly available Car Park data set. The simulated results will demonstrate that the performance of both the filters and the data association algorithms depends heavily on geometry, particularly landmark separation. The real-world data set results will demonstrate that the performance of some data association algorithms, when paired with an EKF, can give identical results. At the same time, a distinction in mapping performance between those pairings and the EKF paired with Joint Compatibility data association will be shown. These EKF-based pairings will be contrasted with the performance obtained for the FastSLAM-Sequential Nearest Neighbor marriage. Finally, the difficulties in applying the Joint Compatibility and Joint Maximum Likelihood data association methods using FastSLAM 1.0 for this data set will be discussed. By Aron J. Cooper. S.M.