39,088 research outputs found

    Efficient online subspace learning with an indefinite kernel for visual tracking and recognition

    Get PDF
    We propose an exact framework for online learning with a family of indefinite (not positive) kernels. As we study the case of nonpositive kernels, we first show how to extend kernel principal component analysis (KPCA) from a reproducing kernel Hilbert space to Krein space. We then formulate an incremental KPCA in Krein space that does not require the calculation of preimages and therefore is both efficient and exact. Our approach has been motivated by the application of visual tracking for which we wish to employ a robust gradient-based kernel. We use the proposed nonlinear appearance model learned online via KPCA in Krein space for visual tracking in many popular and difficult tracking scenarios. We also show applications of our kernel framework for the problem of face recognition

    Incremental learning of 3D-DCT compact representations for robust visual tracking

    Get PDF
    Visual tracking usually requires an object appearance model that is robust to changing illumination, pose and other factors encountered in video. Many recent trackers utilize appearance samples in previous frames to form the bases upon which the object appearance model is built. This approach has the following limitations: (a) the bases are data driven, so they can be easily corrupted; and (b) it is difficult to robustly update the bases in challenging situations. In this paper, we construct an appearance model using the 3D discrete cosine transform (3D-DCT). The 3D-DCT is based on a set of cosine basis functions, which are determined by the dimensions of the 3D signal and thus independent of the input video data. In addition, the 3D-DCT can generate a compact energy spectrum whose high-frequency coefficients are sparse if the appearance samples are similar. By discarding these high-frequency coefficients, we simultaneously obtain a compact 3D-DCT based object representation and a signal reconstruction-based similarity measure (reflecting the information loss from signal reconstruction). To efficiently update the object representation, we propose an incremental 3D-DCT algorithm, which decomposes the 3D-DCT into successive operations of the 2D discrete cosine transform (2D-DCT) and 1D discrete cosine transform (1D-DCT) on the input video data. As a result, the incremental 3D-DCT algorithm only needs to compute the 2D-DCT for newly added frames as well as the 1D-DCT along the third dimension, which significantly reduces the computational complexity. Based on this incremental 3D-DCT algorithm, we design a discriminative criterion to evaluate the likelihood of a test sample belonging to the foreground object. We then embed the discriminative criterion into a particle filtering framework for object state inference over time. Experimental results demonstrate the effectiveness and robustness of the proposed tracker.Xi Li, Anthony Dick, Chunhua Shen, Anton van den Hengel, and Hanzi Wan

    Robust online subspace learning

    No full text
    In this thesis, I aim to advance the theories of online non-linear subspace learning through the development of strategies which are both efficient and robust. The use of subspace learning methods is very popular in computer vision and they have been employed to numerous tasks. With the increasing need for real-time applications, the formulation of online (i.e. incremental and real-time) learning methods is a vibrant research field and has received much attention from the research community. A major advantage of incremental systems is that they update the hypothesis during execution, thus allowing for the incorporation of the real data seen in the testing phase. Tracking acts as an attractive and popular evaluation tool for incremental systems, and thus, the connection between online learning and adaptive tracking is seen commonly in the literature. The proposed system in this thesis facilitates learning from noisy input data, e.g. caused by occlusions, casted shadows and pose variations, that are challenging problems in general tracking frameworks. First, a fast and robust alternative to standard L2-norm principal component analysis (PCA) is introduced, which I coin Euler PCA (e-PCA). The formulation of e-PCA is based on robust, non-linear kernel PCA (KPCA) with a cosine-based kernel function that is expressed via an explicit feature space. When applied to tracking, face reconstruction and background modeling, promising results are achieved. In the second part, the problem of matching vectors of 3D rotations is explicitly targeted. A novel distance which is robust for 3D rotations is introduced, and formulated as a kernel function. The kernel leads to a new representation of 3D rotations, the full-angle quaternion (FAQ) representation. Finally, I propose 3D object recognition from point clouds, and object tracking with color values using FAQs. A domain-specific kernel function designed for visual data is then presented. KPCA with Krein space kernels is introduced, as this kernel is indefinite, and an exact incremental learning framework for the new kernel is developed. In a tracker framework, the presented online learning outperforms the competitors in nine popular and challenging video sequences. In the final part, the generalized eigenvalue problem is studied. Specifically, incremental slow feature analysis (SFA) with indefinite kernels is proposed, and applied to temporal video segmentation and tracking with change detection. As online SFA allows for drift detection, further improvements are achieved in the evaluation of the tracking task.Open Acces

    Bags of Affine Subspaces for Robust Object Tracking

    Full text link
    We propose an adaptive tracking algorithm where the object is modelled as a continuously updated bag of affine subspaces, with each subspace constructed from the object's appearance over several consecutive frames. In contrast to linear subspaces, affine subspaces explicitly model the origin of subspaces. Furthermore, instead of using a brittle point-to-subspace distance during the search for the object in a new frame, we propose to use a subspace-to-subspace distance by representing candidate image areas also as affine subspaces. Distances between subspaces are then obtained by exploiting the non-Euclidean geometry of Grassmann manifolds. Experiments on challenging videos (containing object occlusions, deformations, as well as variations in pose and illumination) indicate that the proposed method achieves higher tracking accuracy than several recent discriminative trackers.Comment: in International Conference on Digital Image Computing: Techniques and Applications, 201

    Non-sparse Linear Representations for Visual Tracking with Online Reservoir Metric Learning

    Get PDF
    Most sparse linear representation-based trackers need to solve a computationally expensive L1-regularized optimization problem. To address this problem, we propose a visual tracker based on non-sparse linear representations, which admit an efficient closed-form solution without sacrificing accuracy. Moreover, in order to capture the correlation information between different feature dimensions, we learn a Mahalanobis distance metric in an online fashion and incorporate the learned metric into the optimization problem for obtaining the linear representation. We show that online metric learning using proximity comparison significantly improves the robustness of the tracking, especially on those sequences exhibiting drastic appearance changes. Furthermore, in order to prevent the unbounded growth in the number of training samples for the metric learning, we design a time-weighted reservoir sampling method to maintain and update limited-sized foreground and background sample buffers for balancing sample diversity and adaptability. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.Comment: Appearing in IEEE Conf. Computer Vision and Pattern Recognition, 201

    Online Metric-Weighted Linear Representations for Robust Visual Tracking

    Full text link
    In this paper, we propose a visual tracker based on a metric-weighted linear representation of appearance. In order to capture the interdependence of different feature dimensions, we develop two online distance metric learning methods using proximity comparison information and structured output learning. The learned metric is then incorporated into a linear representation of appearance. We show that online distance metric learning significantly improves the robustness of the tracker, especially on those sequences exhibiting drastic appearance changes. In order to bound growth in the number of training samples, we design a time-weighted reservoir sampling method. Moreover, we enable our tracker to automatically perform object identification during the process of object tracking, by introducing a collection of static template samples belonging to several object classes of interest. Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame. Experimental results on challenging video sequences demonstrate the effectiveness of the method for both inter-frame tracking and object identification.Comment: 51 pages. Appearing in IEEE Transactions on Pattern Analysis and Machine Intelligenc

    Memory Based Online Learning of Deep Representations from Video Streams

    Full text link
    We present a novel online unsupervised method for face identity learning from video streams. The method exploits deep face descriptors together with a memory based learning mechanism that takes advantage of the temporal coherence of visual data. Specifically, we introduce a discriminative feature matching solution based on Reverse Nearest Neighbour and a feature forgetting strategy that detect redundant features and discard them appropriately while time progresses. It is shown that the proposed learning procedure is asymptotically stable and can be effectively used in relevant applications like multiple face identification and tracking from unconstrained video streams. Experimental results show that the proposed method achieves comparable results in the task of multiple face tracking and better performance in face identification with offline approaches exploiting future information. Code will be publicly available.Comment: arXiv admin note: text overlap with arXiv:1708.0361
    • …
    corecore