
    VISUAL TRACKING AND ILLUMINATION RECOVERY VIA SPARSE REPRESENTATION

    Compressive sensing, or sparse representation, has played a fundamental role in many fields of science. It shows that signals and images can be reconstructed from far fewer measurements than is usually considered necessary. Sparsity leads to efficient estimation, efficient compression, dimensionality reduction, and efficient modeling. Recently, there has been a growing interest in compressive sensing in computer vision, and it has been successfully applied to face recognition, background subtraction, object tracking and other problems. Sparsity can be achieved by solving the compressive sensing problem using L1 minimization. In this dissertation, we present the results of a study of applying sparse representation to illumination recovery, object tracking, and simultaneous tracking and recognition. Illumination recovery, also known as inverse lighting, is the problem of recovering an illumination distribution in a scene from the appearance of objects located in the scene. It is used for Augmented Reality, where the virtual objects match the existing image and cast convincing shadows on the real scene rendered with the recovered illumination. Shadows in a scene are caused by the occlusion of incoming light, and thus contain information about the lighting of the scene. Although shadows have been used in determining the 3D shape of the object that casts shadows onto the scene, few studies have focused on the illumination information provided by the shadows. In this dissertation, we recover the illumination of a scene from a single image with cast shadows given the geometry of the scene. The images with cast shadows can be quite complex and therefore cannot be well approximated by low-dimensional linear subspaces. However, in this study we show that the set of images produced by a Lambertian scene with cast shadows can be efficiently represented by a sparse set of images generated by directional light sources.
We first model an image with cast shadows as composed of a diffusive part (without cast shadows) and a residual part that captures cast shadows. Then, we express the problem in an L1-regularized least squares formulation, with nonnegativity constraints (as light has to be nonnegative at any point in space). This sparse representation enjoys an effective and fast solution, thanks to recent advances in compressive sensing. In experiments on both synthetic and real data, our approach performs favorably in comparison to several previously proposed methods. Visual tracking, which consistently infers the motion of a desired target in a video sequence, has been an active and fruitful research topic in computer vision for decades. It has many practical applications such as surveillance, human-computer interaction, medical imaging and so on. Many of the challenges in designing a robust tracking algorithm come from the enormous, unpredictable variations in the target, such as deformations, fast motion, occlusions, background clutter, and lighting changes. To tackle the challenges posed by tracking, we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, noise and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target in a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an L1-regularized least squares problem. Then the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework in which a particle filter is used for propagating sample distributions over time.
Three additional components further improve the robustness of our approach: 1) a velocity-incorporated motion model that helps concentrate the samples on the true target location in the next frame, 2) the nonnegativity constraints that help filter out clutter that is similar to tracked targets in reversed intensity patterns, and 3) a dynamic template update scheme that keeps track of the most representative templates throughout the tracking procedure. We test the proposed approach on many challenging sequences involving heavy occlusions, drastic illumination changes, large scale changes, non-rigid object movement, out-of-plane rotation, and large pose variations. The proposed approach shows excellent performance in comparison with four previously proposed trackers. We also extend the work to simultaneous tracking and recognition in vehicle classification in IR video sequences. We attempt to resolve the uncertainties in tracking and recognition at the same time by introducing a static template set that stores target images in various conditions such as different poses, lighting, and so on. The recognition results at each frame are propagated to produce the final result for the whole video. The tracking result is evaluated at each frame, and low confidence in tracking performance initiates a new cycle of tracking and classification. We demonstrate the robustness of the proposed method on vehicle tracking and classification using outdoor IR video sequences.
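
The candidate-scoring step described in the abstract above (sparse coding over target templates plus trivial templates, with nonnegative coefficients, then picking the candidate with the smallest error) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the dissertation's implementation: the dictionary layout `[T, I, -I]`, the plain projected proximal-gradient solver, the function names, and the toy 4-pixel data.

```python
import math

def build_dictionary(templates):
    """Columns of B = [T, I, -I]: target templates plus trivial templates (assumed layout)."""
    dim = len(templates[0])
    identity = [[1.0 if i == k else 0.0 for i in range(dim)] for k in range(dim)]
    negated = [[-v for v in col] for col in identity]
    return [list(t) for t in templates] + identity + negated

def nonneg_l1_least_squares(columns, y, lam=0.05, iters=5000):
    """Minimise 0.5*||B c - y||^2 + lam*sum(c) subject to c >= 0."""
    n, dim = len(columns), len(y)
    # The squared Frobenius norm of B upper-bounds the Lipschitz constant of the gradient.
    step = 1.0 / sum(v * v for col in columns for v in col)
    c = [0.0] * n
    for _ in range(iters):
        residual = [sum(columns[j][i] * c[j] for j in range(n)) - y[i] for i in range(dim)]
        grad = [sum(columns[j][i] * residual[i] for i in range(dim)) for j in range(n)]
        # With c >= 0 the L1 penalty is linear, so the proximal step is a shifted projection.
        c = [max(0.0, c[j] - step * (grad[j] + lam)) for j in range(n)]
    return c

def template_residual(templates, y, lam=0.05):
    """Squared error of y against its reconstruction from the target templates only."""
    coeffs = nonneg_l1_least_squares(build_dictionary(templates), y, lam)
    k = len(templates)
    recon = [sum(templates[j][i] * coeffs[j] for j in range(k)) for i in range(len(y))]
    return sum((yi - ri) ** 2 for yi, ri in zip(y, recon)), coeffs

# Toy data: two unit-norm templates and two hypothetical candidates.
s = 1.0 / math.sqrt(2.0)
templates = [[s, s, 0.0, 0.0], [0.0, 0.0, s, s]]
candidates = {
    "near_target": [0.9 * s, 0.9 * s, 0.0, 0.0],
    "clutter": [0.0, 0.7, 0.7, 0.0],
}
best = min(candidates, key=lambda name: template_residual(templates, candidates[name])[0])
```

In this toy setting the candidate that lies close to the span of the target templates receives the smaller residual, while the parts of the clutter candidate that the templates cannot explain are absorbed by the trivial templates. A practical tracker would score particle-filter samples this way and use a faster L1 solver.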

    Bags of Affine Subspaces for Robust Object Tracking

    We propose an adaptive tracking algorithm where the object is modelled as a continuously updated bag of affine subspaces, with each subspace constructed from the object's appearance over several consecutive frames. In contrast to linear subspaces, affine subspaces explicitly model the origin of subspaces. Furthermore, instead of using a brittle point-to-subspace distance during the search for the object in a new frame, we propose to use a subspace-to-subspace distance by representing candidate image areas also as affine subspaces. Distances between subspaces are then obtained by exploiting the non-Euclidean geometry of Grassmann manifolds. Experiments on challenging videos (containing object occlusions, deformations, as well as variations in pose and illumination) indicate that the proposed method achieves higher tracking accuracy than several recent discriminative trackers. Comment: in International Conference on Digital Image Computing: Techniques and Applications, 201
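
As a rough illustration of a subspace-to-subspace distance on a Grassmann manifold, the sketch below computes the projection (chordal) distance between two linear subspaces of equal dimension, using the identity sqrt(k - ||A^T B||_F^2) = sqrt(sum of sin^2 of the principal angles) for orthonormal bases A and B. This is one common Grassmannian distance, chosen here as an assumption; the abstract does not say which metric the paper uses, and the handling of the affine offset (the subspace origin) is not shown. The function names are hypothetical.

```python
import math

def orthonormal_basis(vectors, tol=1e-12):
    """Modified Gram-Schmidt: orthonormal vectors spanning the same subspace."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that are linearly dependent on earlier ones
            basis.append([wi / norm for wi in w])
    return basis

def chordal_distance(span_a, span_b):
    """Projection distance between two subspaces given by spanning vectors."""
    a = orthonormal_basis(span_a)
    b = orthonormal_basis(span_b)
    if len(a) != len(b):
        raise ValueError("subspaces must have the same dimension")
    # Squared Frobenius norm of A^T B equals the sum of cos^2 of the principal angles.
    overlap = sum(sum(ui * vi for ui, vi in zip(u, v)) ** 2 for u in a for v in b)
    return math.sqrt(max(0.0, len(a) - overlap))

# The same plane described by two different bases, and two planes sharing one axis.
same_plane = chordal_distance([[1, 0, 0], [0, 1, 0]], [[1, 1, 0], [1, -1, 0]])
xy_vs_xz = chordal_distance([[1, 0, 0], [0, 1, 0]], [[1, 0, 0], [0, 0, 1]])
```

Because the distance depends only on the spanned subspace and not on the particular basis, two different bases of the same plane are at distance zero from each other.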

    Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Tracking

    With efficient appearance learning models, the Discriminative Correlation Filter (DCF) has proven very successful in recent video object tracking benchmarks and competitions. However, the existing DCF paradigm suffers from two major issues, i.e., spatial boundary effect and temporal filter degradation. To mitigate these challenges, we propose a new DCF-based tracking method. The key innovations of the proposed method include adaptive spatial feature selection and temporal consistency constraints, with which the new tracker enables joint spatial-temporal filter learning in a lower-dimensional discriminative manifold. More specifically, we apply structured spatial sparsity constraints to multi-channel filters. Consequently, the process of learning spatial filters can be approximated by the lasso regularisation. To encourage temporal consistency, the filter model is restricted to lie around its historical value and updated locally to preserve the global structure in the manifold. Last, a unified optimisation framework is proposed to jointly select temporal consistency preserving spatial features and learn discriminative filters with the augmented Lagrangian method. Qualitative and quantitative evaluations have been conducted on a number of well-known benchmarking datasets such as OTB2013, OTB50, OTB100, Temple-Colour, UAV123 and VOT2018. The experimental results demonstrate the superiority of the proposed method over the state-of-the-art approaches.
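
To make the idea of structured spatial sparsity over multi-channel filters concrete, here is a minimal sketch of group soft-thresholding, the standard proximal operator of a group-lasso penalty. It treats the values of all channels at one spatial position as a group, so a whole position is either shrunk or switched off. This is a generic illustration under that grouping assumption, not the update rule from the paper; the function name, the `filters[channel][position]` layout and the threshold `tau` are hypothetical.

```python
import math

def group_soft_threshold(filters, tau):
    """Shrink each spatial position's channel vector by tau in Euclidean norm.

    filters[c][p] is the value of channel c at spatial position p.
    Positions whose channel vector has norm <= tau are set to zero.
    """
    n_channels, n_positions = len(filters), len(filters[0])
    out = [[0.0] * n_positions for _ in range(n_channels)]
    for p in range(n_positions):
        norm = math.sqrt(sum(filters[c][p] ** 2 for c in range(n_channels)))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0.0 else 0.0
        for c in range(n_channels):
            out[c][p] = scale * filters[c][p]
    return out

# Two channels, three spatial positions: one strong position and two weak ones.
filters = [[3.0, 0.1, 0.0],
           [4.0, 0.1, 0.0]]
selected = group_soft_threshold(filters, tau=1.0)
```

With `tau=1.0`, the strong position (channel norm 5) is scaled by 0.8 and the weak positions are zeroed across all channels at once, which is the spatial feature selection effect a structured sparsity penalty is meant to produce.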

    Selective sampling importance resampling particle filter tracking with multibag subspace restoration


    A graphical model based solution to the facial feature point tracking problem

    This paper proposes a facial feature point tracker motivated by applications such as human-computer interfaces and facial expression analysis systems. The proposed tracker is based on a graphical model framework. The facial features are tracked through video streams by incorporating statistical relations in time as well as spatial relations between feature points. By exploiting the spatial relationships between feature points, the proposed method provides robustness in real-world conditions such as arbitrary head movements and occlusions. A Gabor feature-based occlusion detector is developed and used to handle occlusions. The performance of the proposed tracker has been evaluated on real video data under various conditions, including occluded facial gestures and head movements. It is also compared to two popular methods, one based on Kalman filtering exploiting temporal relations, and the other based on active appearance models (AAM). Improvements provided by the proposed approach are demonstrated through both visual displays and quantitative analysis.