11 research outputs found

    Joint Group Feature Selection and Discriminative Filter Learning for Robust Visual Object Tracking

    Get PDF
    We propose a new Group Feature Selection method for Discriminative Correlation Filters (GFS-DCF) based visual object tracking. The key innovation of the proposed method is to perform group feature selection across both channel and spatial dimensions, thus to pinpoint the structural relevance of multi-channel features to the filtering system. In contrast to the widely used spatial regularisation or feature selection methods, to the best of our knowledge, this is the first time that channel selection has been advocated for DCF-based tracking. We demonstrate that our GFS-DCF method is able to significantly improve the performance of a DCF tracker equipped with deep neural network features. In addition, our GFS-DCF enables joint feature selection and filter learning, achieving enhanced discrimination and interpretability of the learned filters. To further improve the performance, we adaptively integrate historical information by constraining filters to be smooth across temporal frames, using an efficient low-rank approximation. By design, specific temporal-spatial-channel configurations are dynamically learned in the tracking process, highlighting the relevant features, and alleviating the performance degrading impact of less discriminative representations and reducing information redundancy. The experimental results obtained on OTB2013, OTB2015, VOT2017, VOT2018 and TrackingNet demonstrate the merits of our GFS-DCF and its superiority over the state-of-the-art trackers. The code is publicly available at https://github.com/XU-TIANYANG/GFS-DCF

    A Robust Structured Tracker Using Local Deep Features

    Get PDF
    Deep features extracted from convolutional neural networks have been recently utilized in visual tracking to obtain a generic and semantic representation of target candidates. In this paper, we propose a robust structured tracker using local deep features (STLDF). This tracker exploits the deep features of local patches inside target candidates and sparsely represents them by a set of templates in the particle filter framework. The proposed STLDF utilizes a new optimization model, which employs a group-sparsity regularization term to adopt local and spatial information of the target candidates and attain the spatial layout structure among them. To solve the optimization model, we propose an efficient and fast numerical algorithm that consists of two subproblems with the close-form solutions. Different evaluations in terms of success and precision on the benchmarks of challenging image sequences (e.g., OTB50 and OTB100) demonstrate the superior performance of the STLDF against several state-of-the-art trackers

    Fast Robust Subspace Tracking via PCA in Sparse Data-Dependent Noise

    Full text link
    This work studies the robust subspace tracking (ST) problem. Robust ST can be simply understood as a (slow) time-varying subspace extension of robust PCA. It assumes that the true data lies in a low-dimensional subspace that is either fixed or changes slowly with time. The goal is to track the changing subspaces over time in the presence of additive sparse outliers and to do this quickly (with a short delay). We introduce a "fast" mini-batch robust ST solution that is provably correct under mild assumptions. Here "fast" means two things: (i) the subspace changes can be detected and the subspaces can be tracked with near-optimal delay, and (ii) the time complexity of doing this is the same as that of simple (non-robust) PCA. Our main result assumes piecewise constant subspaces (needed for identifiability), but we also provide a corollary for the case when there is a little change at each time. A second contribution is a novel non-asymptotic guarantee for PCA in linearly data-dependent noise. An important setting where this is useful is for linearly data dependent noise that is sparse with support that changes enough over time. The analysis of the subspace update step of our proposed robust ST solution uses this result.Comment: To appear in IEEE Journal of Special Areas in Information Theor

    Deep Learning and Optimization in Visual Target Tracking

    Get PDF
    Visual tracking is the process of estimating states of a moving object in a dynamic frame sequence. It has been considered as one of the most paramount and challenging topics in computer vision. Although numerous tracking methods have been introduced, developing a robust algorithm that can handle different challenges still remains unsolved. In this dissertation, we introduce four different trackers and evaluate their performance in terms of tracking accuracy on challenging frame sequences. Each of these trackers aims to address the drawbacks of their peers. The first developed method is called a structured multi-task multi-view tracking (SMTMVT) method, which exploits the sparse appearance model in the particle filter frame work to track targets under different challenges. Specifically, we extract features of the target candidates from different views and sparsely represent them by a linear combination of templates of different views. Unlike the conventional sparse trackers, SMTMVT not only jointly considers the relationship between different tasks and different views but also retains the structures among different views in a robust multi-task multi-view formulation. The second developed method is called a structured group local sparse tracker (SGLST), which exploits local patches inside target candidates in the particle filter framework. Unlike the conventional local sparse trackers, the proposed optimization model in SGLST not only adopts local and spatial information of the target candidates but also attains the spatial layout structure among them by employing a group-sparsity regularization term. To solve the optimization model, we propose an efficient numerical algorithm consisting of two subproblems with closed-form solutions. The third developed tracker is called a robust structured tracker using local deep features (STLDF). This tracker exploits the deep features of local patches inside target candidates and sparsely represents them by a set of templates in the particle filter framework. The proposed STLDF utilizes a new optimization model, which employs a group-sparsity regularization term to adopt local and spatial information of the target candidates and attain the spatial layout structure among them. To solve the optimization model, we adopt the alternating direction method of multiplier (ADMM) to design a fast and parallel numerical algorithm by deriving the augmented Lagrangian of the optimization model into two closed-form solution problems: the quadratic problem and the Euclidean norm projection onto probability simplex constraints problem. The fourth developed tracker is called an appearance variation adaptation (AVA) tracker, which aligns the feature distributions of target regions over time by learning an adaptation mask in an adversarial network. The proposed adversarial network consists of a generator and a discriminator network that compete with each other over optimizing a discriminator loss in a mini-max optimization problem. Specifically, the discriminator network aims to distinguish recent target regions from earlier ones by minimizing the discriminator loss, while the generator network aims to produce an adaptation mask to maximize the discriminator loss. We incorporate a gradient reverse layer in the adversarial network to solve the aforementioned mini-max optimization in an end-to-end manner. We compare the performance of the proposed four trackers with the most recent state-of-the-art trackers by doing extensive experiments on publicly available frame sequences, including OTB50, OTB100, VOT2016, and VOT2018 tracking benchmarks
    corecore