    Better features to track by estimating the tracking convergence region

    Reliably tracking key points and textured patches from frame to frame is the basic requirement for many bottom-up computer vision algorithms. The problem of selecting the features that can be tracked well is addressed here. The Lucas-Kanade tracking procedure is commonly used. We propose a method to estimate the size of the tracking procedure's convergence region for each feature. Features that have a wider convergence region around them should be tracked better by the tracker. The size of the convergence region, as a new feature goodness measure, is compared with the widely accepted Shi-Tomasi feature selection criterion.
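
For context on the baseline mentioned above, the Shi-Tomasi criterion scores a patch by the smaller eigenvalue of its 2x2 gradient matrix. The following is a minimal sketch of that baseline only, not of the convergence-region measure the abstract proposes. The function name `shi_tomasi_score`, the central-difference gradients, and the list-of-rows patch format are assumptions made for illustration.

```python
import math


def shi_tomasi_score(patch):
    """Smaller eigenvalue of the 2x2 gradient matrix of a grayscale patch.

    `patch` is a list of rows of floats, at least 3x3 (hypothetical format).
    Gradients use central differences on interior pixels only.
    """
    height, width = len(patch), len(patch[0])
    sxx = sxy = syy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return half_trace - root


# A corner has gradient energy in both directions; an edge has it in one only.
corner = [[1.0 if (x >= 2 and y >= 2) else 0.0 for x in range(5)] for y in range(5)]
edge = [[1.0 if x >= 2 else 0.0 for x in range(5)] for y in range(5)]
print(shi_tomasi_score(corner), shi_tomasi_score(edge))
```

A corner-like patch gets a positive score while a straight edge or a flat patch scores approximately zero, which is why the criterion favors corners as trackable features.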

    Efficient illumination independent appearance-based face tracking

    One of the major challenges that visual tracking algorithms face nowadays is being able to cope with changes in the appearance of the target during tracking. Linear subspace models have been extensively studied and are possibly the most popular way of modelling target appearance. We introduce a linear subspace representation in which the appearance of a face is represented by the addition of two approximately independent linear subspaces modelling facial expressions and illumination respectively. This model is more compact than previous bilinear or multilinear approaches. The independence assumption notably simplifies system training. We only require two image sequences. One facial expression is subject to all possible illuminations in one sequence and the face adopts all facial expressions under one particular illumination in the other. This simple model enables us to train the system with no manual intervention. We also revisit the problem of efficiently fitting a linear subspace-based model to a target image and introduce an additive procedure for solving this problem. We prove that Matthews and Baker’s Inverse Compositional Approach makes a smoothness assumption on the subspace basis that is equivalent to Hager and Belhumeur’s, which worsens convergence. Our approach differs from Hager and Belhumeur’s additive and Matthews and Baker’s compositional approaches in that we make no smoothness assumptions on the subspace basis. In the experiments conducted we show that the model introduced accurately represents the appearance variations caused by illumination changes and facial expressions. We also verify experimentally that our fitting procedure is more accurate and has better convergence rate than the other related approaches, albeit at the expense of a slight increase in computational cost. Our approach can be used for tracking a human face at standard video frame rates on an average personal computer.

    Reducing “Structure from Motion”: a general framework for dynamic vision. 2. Implementation and experimental assessment

    For pt.1 see ibid., p.933-42 (1998). A number of methods have been proposed in the literature for estimating scene-structure and ego-motion from a sequence of images using dynamical models. Despite the fact that all methods may be derived from a “natural” dynamical model within a unified framework, from an engineering perspective there are a number of trade-offs that lead to different strategies depending upon the applications and the goals one is targeting. We want to characterize and compare the properties of each model such that the engineer may choose the one best suited to the specific application. We analyze the properties of filters derived from each dynamical model under a variety of experimental conditions, assess the accuracy of the estimates, their robustness to measurement noise, sensitivity to initial conditions and visual angle, effects of the bas-relief ambiguity and occlusions, dependence upon the number of image measurements and their sampling rate

    Two Timescale Convergent Q-learning for Sleep-Scheduling in Wireless Sensor Networks

    In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work
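
The one-simulation SPSA estimate mentioned above can be illustrated generically: the parameter vector is perturbed once along a random ±1 direction, and a single cost evaluation is divided by the perturbation in each coordinate. This is a sketch of the textbook one-measurement form, not the paper's two-timescale algorithm. The function name, the `cost` callable, and the optional fixed `perturbation` argument are hypothetical.

```python
import random


def one_sim_spsa_gradient(cost, theta, delta, perturbation=None, rng=random):
    """One-measurement SPSA gradient estimate (generic form, assumed here).

    cost: callable taking a list of floats and returning a float.
    theta: current parameter vector.
    delta: perturbation size (> 0).
    perturbation: optional list of +1/-1 values; drawn at random if omitted.
    """
    if perturbation is None:
        perturbation = [rng.choice((-1.0, 1.0)) for _ in theta]
    perturbed = [t + delta * d for t, d in zip(theta, perturbation)]
    value = cost(perturbed)  # the single simulation
    # Every coordinate reuses the same cost value, scaled by its own direction.
    return [value / (delta * d) for d in perturbation]


# Deterministic example with a fixed perturbation direction.
grad = one_sim_spsa_gradient(lambda t: t[0], [1.0, 2.0], 0.5, [1.0, -1.0])
print(grad)
```

A single estimate of this kind is very noisy. It is only informative when averaged over many random perturbations, which is one reason such updates are typically run with small, decaying step sizes.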

    Generalized Kernel-based Visual Tracking

    In this work we generalize plain mean shift (MS) trackers and attempt to overcome two limitations of standard MS trackers. It is well known that modeling and maintaining a representation of a target object is an important component of a successful visual tracker. However, little work has been done on building a robust template model for kernel-based MS tracking. In contrast to building a template from a single frame, we train a robust object representation model from a large amount of data. Tracking is viewed as a binary classification problem, and a discriminative classification rule is learned to distinguish between the object and background. We adopt a support vector machine (SVM) for training. The tracker is then implemented by maximizing the classification score. An iterative optimization scheme very similar to MS is derived for this purpose.
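
As background for the MS iteration referred to above, the following sketch shows plain mean shift mode seeking on 1-D points with a Gaussian kernel. It is the standard procedure, not the generalized SVM-based tracker the abstract describes. The function name, the bandwidth value, and the sample data are assumptions chosen for illustration.

```python
import math


def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-6, max_iter=100):
    """Climb to a local density mode by repeated kernel-weighted averaging."""
    x = start
    for _ in range(max_iter):
        # Gaussian kernel weight of each sample relative to the current estimate.
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Two well-separated clusters; the result depends on the starting point.
samples = [-0.1, 0.0, 0.1, 9.9, 10.0, 10.1]
print(mean_shift_mode(samples, start=1.0), mean_shift_mode(samples, start=9.0))
```

Each step moves the estimate to the weighted mean of nearby samples, so the iteration converges to the mode closest to where it started. In a tracker, the starting point is typically the target position from the previous frame.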

    Reducing "Structure From Motion": a General Framework for Dynamic Vision - Part 2: Experimental Evaluation

    A number of methods have been proposed in the literature for estimating scene-structure and ego-motion from a sequence of images using dynamical models. Although all methods may be derived from a "natural" dynamical model within a unified framework, from an engineering perspective there are a number of trade-offs that lead to different strategies depending upon the specific applications and the goals one is targeting. Which one is the winning strategy? In this paper we analyze the properties of the dynamical models that originate from each strategy under a variety of experimental conditions. For each model we assess the accuracy of the estimates, their robustness to measurement noise, sensitivity to initial conditions and visual angle, effects of the bas-relief ambiguity and occlusions, dependence upon the number of image measurements and their sampling rate