
    An Experimental Survey on Correlation Filter-based Tracking

    Full text link
    In recent years, Correlation Filter-based Trackers (CFTs) have attracted increasing interest in the field of visual object tracking and have achieved highly compelling results across different competitions and benchmarks. In this paper, our goal is to review the development of CFTs with extensive experimental results. Eleven trackers are surveyed in our work, from which a general framework is summarized. Furthermore, we investigate different training schemes for correlation filters and discuss various effective improvements that have been made recently. Comprehensive experiments have been conducted to evaluate the effectiveness and efficiency of the surveyed CFTs, and comparisons have been made with other competing trackers. The experimental results show that state-of-the-art performance, in terms of robustness, speed and accuracy, can be achieved by several recent CFTs, such as MUSTer and SAMF. We find that further improvements to correlation filter-based tracking can be made by estimating scales, applying part-based tracking strategies and incorporating long-term tracking methods.
    Comment: 13 pages, 25 figures
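
    To make the shared framework concrete, below is a minimal sketch of the core correlation-filter idea common to the surveyed trackers (a MOSSE-style, single-channel filter trained in closed form in the Fourier domain). The Gaussian label construction, regularization value, and variable names are illustrative assumptions, not taken from any specific surveyed paper.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Desired response: a Gaussian peak centred on the target."""
    ys, xs = np.mgrid[:h, :w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def train_filter(patch, label, lam=1e-2):
    """Closed-form ridge-regression filter in the Fourier domain."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(label)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H, patch):
    """Correlate the filter with a new patch; the response peak locates the target."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(response), response.shape), response

# Toy usage: train on a random patch, then relocate it after a circular shift.
patch = np.random.rand(64, 64)
H = train_filter(patch, gaussian_label(64, 64))
(peak_y, peak_x), _ = detect(H, np.roll(patch, (3, 5), axis=(0, 1)))
print(peak_y, peak_x)  # the peak moves by roughly the applied shift
```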

    Effective Occlusion Handling for Fast Correlation Filter-based Trackers

    Full text link
    Correlation filter-based trackers suffer heavily from the multiple peaks that occlusions induce in their response maps. Moreover, the whole tracking pipeline may break down due to the uncertainty introduced by shifting among peaks, which further degrades the correlation filter model. To alleviate the drift caused by occlusions, we propose a novel scheme that chooses a specific filter model according to the scenario. Specifically, an effective measurement function is designed to evaluate the quality of the filter response. A sophisticated strategy is employed to judge whether occlusion has occurred, and then to decide how to update the filter models. In addition, we take advantage of both a log-polar method and a pyramid-like approach to estimate the best scale of the target. We evaluate the proposed approach on the VOT2018 challenge and the OTB100 dataset; the experimental results show that the proposed tracker achieves promising performance compared with state-of-the-art trackers.
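
    The paper's measurement function is not reproduced here; as a stand-in, the sketch below scores a response map with the classic peak-to-sidelobe ratio (PSR) and freezes the model when the score suggests occlusion. The threshold, sidelobe exclusion window, and learning rate are illustrative assumptions.

```python
import numpy as np

def psr(response, exclude=5):
    """Peak-to-sidelobe ratio: (peak - mean) / std over the sidelobe region."""
    peak = response.max()
    py, px = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones_like(response, dtype=bool)
    mask[max(0, py - exclude):py + exclude + 1,
         max(0, px - exclude):px + exclude + 1] = False
    sidelobe = response[mask]
    return (peak - sidelobe.mean()) / (sidelobe.std() + 1e-8)

def update_model(model, new_model, response, psr_thresh=7.0, lr=0.02):
    """Linear-interpolation update, skipped when the response looks occluded."""
    if psr(response) < psr_thresh:
        return model            # likely occlusion: keep the previous filter
    return (1 - lr) * model + lr * new_model
```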

    Part-based Visual Tracking via Structural Support Correlation Filter

    Full text link
    Recently, part-based and support vector machine (SVM) based trackers have shown favorable performance. Nonetheless, their time-consuming online training and updating processes limit their real-time application. In order to better handle partial occlusion and improve efficiency, we propose a novel part-based structural support correlation filter tracking method, which combines the strong discriminative ability of SVMs with the insensitivity to partial occlusion of part-based tracking methods. Our model learns the support correlation filters of all parts jointly through a star structure model, which preserves the spatial layout among parts and tolerates outlier parts. In addition, to further mitigate drift away from the object, we introduce inter-frame consistency of local parts into our model. Finally, we accurately estimate the scale change of the object from the relative distance change among reliable parts. Extensive empirical evaluations on three benchmark datasets, OTB2015, TempleColor128 and VOT2015, demonstrate that the proposed method performs favorably against several state-of-the-art trackers in terms of tracking accuracy, speed and robustness.
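
    One concrete idea from this abstract, estimating scale from the relative distances among reliable part centres, can be sketched as below. Taking the median of pairwise distance ratios is an illustrative robust choice, not necessarily the paper's exact estimator.

```python
import numpy as np
from itertools import combinations

def scale_from_parts(prev_pts, curr_pts):
    """prev_pts, curr_pts: (N, 2) arrays of matched part centres."""
    ratios = []
    for i, j in combinations(range(len(prev_pts)), 2):
        d_prev = np.linalg.norm(prev_pts[i] - prev_pts[j])
        d_curr = np.linalg.norm(curr_pts[i] - curr_pts[j])
        if d_prev > 1e-6:
            ratios.append(d_curr / d_prev)
    return np.median(ratios) if ratios else 1.0

# Toy usage: parts move apart by 10% -> scale estimate of about 1.1.
prev = np.array([[0., 0.], [10., 0.], [0., 10.]])
print(scale_from_parts(prev, prev * 1.1))
```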

    Tracking for Half an Hour

    Full text link
    Long-term tracking requires extreme stability under the multitude of model updates and robustness to the disappearance and loss of the target, as these will inevitably happen. For motivation, we took 10 randomly selected OTB sequences, doubled each by attaching a reversed version, and repeated each doubled sequence 20 times. On most of these repetitive videos, the best current tracker performs worse on each loop, illustrating the difference between optimizing for short-term versus long-term tracking. In a long-term tracker, a combined global and local search strategy is beneficial, allowing recovery from failures and disappearance. Most importantly, the proposed tracker also employs cautious updating, guided by self-quality assessment. The proposed tracker is still among the best on the 20-sec OTB videos while achieving state-of-the-art results on the 100-sec UAV20L benchmark. On 10 new half-an-hour videos featuring city bicycling, sports games, etc., the proposed tracker outperforms the others by a large margin, with the 2010 TLD tracker coming second.
    Comment: tech report
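
    The control flow this abstract describes (local search, global re-detection on loss, and cautious self-assessed updating) could look roughly like the schematic below. The search, scoring, and update functions are hypothetical placeholders, and the two thresholds are assumed values; this shows structure only, not the paper's implementation.

```python
def track(frames, init_box, local_search, global_search, score, update,
          conf_thresh=0.5, update_thresh=0.8):
    """Generator yielding one box per frame; all callables are placeholders."""
    box, model = init_box, None
    for frame in frames:
        cand = local_search(frame, box, model)
        if score(frame, cand, model) < conf_thresh:
            # Target probably lost or gone: fall back to a global re-detection.
            cand = global_search(frame, model)
        box = cand
        # Cautious updating: adapt the model only on high-confidence frames.
        if score(frame, box, model) >= update_thresh:
            model = update(model, frame, box)
        yield box
```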

    Efficient Discriminative Nonorthogonal Binary Subspace with its Application to Visual Tracking

    Full text link
    One of the crucial problems in visual tracking is how the object is represented. Conventional appearance-based trackers use increasingly complex features in order to be robust. However, complex representations typically not only require more computation for feature extraction, but also complicate state inference. We show that, with a careful feature selection scheme, extremely simple yet discriminative features can be used for robust object tracking. The central component of the proposed method is a succinct and discriminative representation of the object using a discriminative non-orthogonal binary subspace (DNBS) spanned by Haar-like features. The DNBS representation inherits the merits of the original NBS in that it efficiently describes the object; it also incorporates discriminative information to distinguish foreground from background. However, finding the DNBS bases from an over-complete dictionary is NP-hard. We propose a greedy algorithm called discriminative optimized orthogonal matching pursuit (D-OOMP) to solve this problem. An iterative formulation named iterative D-OOMP is further developed to drastically reduce redundant computation between iterations, and a hierarchical selection strategy is integrated to reduce the feature search space. The proposed DNBS representation is applied to object tracking through SSD-based template matching. We validate the effectiveness of our method through extensive experiments on challenging videos, with comparisons against several state-of-the-art trackers, and demonstrate its capability to track objects in clutter and against moving backgrounds.
    Comment: 15 pages
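
    D-OOMP itself is not reproduced here; for background, this is a minimal standard orthogonal matching pursuit (OMP) over a column-normalised dictionary, i.e. the greedy template that D-OOMP extends with a discriminative term. Names and the fixed sparsity level are illustrative.

```python
import numpy as np

def omp(D, x, k):
    """Greedily select k atoms of dictionary D (d x n) to approximate signal x."""
    residual, idx, coef = x.copy(), [], None
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx.append(int(np.argmax(np.abs(D.T @ residual))))
        atoms = D[:, idx]
        # Re-fit coefficients over all chosen atoms (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(atoms, x, rcond=None)
        residual = x - atoms @ coef
    return idx, coef
```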

    Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks

    Full text link
    We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNNs). Our network aims to distinguish the target area from the background on the basis of pixel-level similarity between two object units. The proposed network represents a target object using features from different depth layers in order to take advantage of both spatial details and category-level semantic information. Furthermore, we propose a feature compression technique that drastically reduces memory requirements while maintaining the capability of the feature representation. Two-stage training (pre-training and fine-tuning) allows our network to handle any target object regardless of its category (even if the object's type does not belong to the pre-training data) or of variations in its appearance through a video sequence. Experiments on large datasets demonstrate the effectiveness of our model against related methods in terms of accuracy, speed, and stability. Finally, we demonstrate the transferability of our network to different domains, such as the infrared data domain.
    Comment: To appear at ICCV 2017
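
    A rough sketch of the pixel-level matching idea: score every location of a search feature map by its similarity to a pooled target descriptor. Cosine similarity and average pooling are illustrative choices here, not the paper's exact network design.

```python
import numpy as np

def pixel_matching(target_feat, search_feat):
    """target_feat: (C, Ht, Wt); search_feat: (C, Hs, Ws) -> (Hs, Ws) score map."""
    # Pool the target unit into a single C-dimensional descriptor.
    t = target_feat.reshape(target_feat.shape[0], -1).mean(axis=1)
    t = t / (np.linalg.norm(t) + 1e-8)
    # Normalise every pixel of the search features, then take cosine similarity.
    s = search_feat.reshape(search_feat.shape[0], -1)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + 1e-8)
    return (t @ s).reshape(search_feat.shape[1:])
```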

    Recurrent Filter Learning for Visual Tracking

    Full text link
    Recently, convolutional neural networks (CNNs) have gained popularity in visual tracking due to their robust feature representation of images. Recent methods perform online tracking by fine-tuning a pre-trained CNN model to the specific target object using stochastic gradient descent (SGD) back-propagation, which is usually time-consuming. In this paper, we propose a recurrent filter generation method for visual tracking. We directly feed the target's image patch to a recurrent neural network (RNN) to estimate an object-specific filter for tracking. As video sequences are spatiotemporal data, we extend the matrix multiplications of the RNN's fully-connected layers to convolution operations on feature maps, which preserves the target's spatial structure and is also memory-efficient. The tracked object in subsequent frames is fed into the RNN to adapt the generated filters to appearance variations of the target. Note that once the offline training of our network is finished, there is no need to fine-tune the network for specific objects, which makes our approach more efficient than methods that use iterative fine-tuning to learn the target online. Extensive experiments on widely used benchmarks, OTB and VOT, demonstrate encouraging results compared to other recent methods.
    Comment: ICCV 2017 Workshop on VOT
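
    The key structural idea, replacing the fully-connected recurrence with convolutions so the hidden state keeps the target's spatial layout, can be illustrated with a toy single-channel convolutional RNN step. The shapes, kernel sizes, and tanh activation are simplifying assumptions for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_rnn_step(x, h, w_xh, w_hh, b):
    """x, h: (H, W) spatial maps; w_xh, w_hh: small conv kernels; b: scalar bias."""
    return np.tanh(convolve2d(x, w_xh, mode="same")
                   + convolve2d(h, w_hh, mode="same") + b)

# Toy usage: feed two "frames" through the recurrence; the hidden state stays
# a spatial map rather than being flattened by a matrix multiplication.
h = np.zeros((32, 32))
w_xh, w_hh = np.random.randn(3, 3) * 0.1, np.random.randn(3, 3) * 0.1
for frame in [np.random.rand(32, 32), np.random.rand(32, 32)]:
    h = conv_rnn_step(frame, h, w_xh, w_hh, 0.0)
# h would then be mapped to an object-specific tracking filter.
```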

    Patch-based adaptive weighting with segmentation and scale (PAWSS) for visual tracking

    Full text link
    Tracking-by-detection algorithms are widely used for visual tracking. They treat tracking as a classification task in which an object model is updated over time using online learning techniques. In challenging conditions where an object undergoes deformation or scale variation, the update step is prone to including background information in the model appearance, or lacks the ability to estimate the scale change, which degrades the performance of the classifier. In this paper, we present a Patch-based Adaptive Weighting with Segmentation and Scale (PAWSS) tracking framework that tackles both the scale and background problems. A simple but effective colour-based segmentation model is used to suppress background information, and multi-scale samples are extracted to enrich the training pool, which allows the tracker to handle both incremental and abrupt scale variations between frames. Experimentally, we evaluate our approach on the online tracking benchmark (OTB) dataset and the Visual Object Tracking (VOT) challenge datasets. The results show that our approach outperforms recent state-of-the-art trackers; in particular, it improves the success rate score on the OTB dataset, while on the VOT datasets PAWSS ranks among the top trackers while operating at real-time frame rates.
    Comment: 10 pages, 8 figures. The paper is under consideration at Pattern Recognition Letters
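
    In the spirit of the abstract's colour-based segmentation step, the sketch below computes a per-pixel foreground probability from the ratio of foreground to background colour histograms. The bin count, Laplace smoothing, and the ratio form are illustrative assumptions, not PAWSS's exact model.

```python
import numpy as np

def colour_prob_map(image, fg_mask, bins=16):
    """image: (H, W, 3) uint8; fg_mask: (H, W) bool -> (H, W) foreground prob map."""
    q = (image // (256 // bins)).astype(int)               # quantised colours
    flat = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    hist_fg = np.bincount(flat[fg_mask], minlength=bins ** 3) + 1.0
    hist_bg = np.bincount(flat[~fg_mask], minlength=bins ** 3) + 1.0
    hist_fg /= hist_fg.sum()
    hist_bg /= hist_bg.sum()
    return (hist_fg / (hist_fg + hist_bg))[flat]           # P(fg | colour)
```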

    Fully distributed cooperation for networked uncertain mobile manipulators

    Full text link
    This paper investigates a fully distributed cooperation scheme for networked mobile manipulators. To achieve cooperative task allocation in a distributed way, an adaptation-based estimation law is established for each robotic agent to estimate the desired local trajectory. In addition, wrench synthesis is analyzed in detail to lay a solid foundation for tight cooperation tasks. Together with the estimated task, a set of distributed adaptive controllers is proposed to achieve motion synchronization of the mobile manipulator ensemble over a directed graph with a spanning tree, irrespective of the kinematic and dynamic uncertainties in both the mobile manipulators and the tightly grasped object. The controlled synchronization alleviates the performance degradation caused by the estimation/tracking discrepancy during the transient phase. The proposed scheme requires no persistent-excitation condition and avoids the use of noisy Cartesian-space velocities. Furthermore, it is independent of the object's center of mass, thanks to formation-based task allocation and a task-oriented strategy. These attractive attributes facilitate the practical application of the scheme. Convergence of the cooperative task tracking error is theoretically proven. Simulation results validate the efficacy and demonstrate the expected performance of the proposed scheme.
    Comment: 18 pages with 13 figures. Final version with experiment to appear in IEEE Transactions on Robotics
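
    As a toy illustration of distributed estimation over a directed graph with a spanning tree, each agent below nudges a scalar estimate toward its in-neighbours' values while one agent observes the reference directly. The gains, graph, and scalar state are drastic simplifications; the paper's adaptation-based law for full trajectories is not reproduced here.

```python
import numpy as np

A = np.array([[0, 0, 0],       # adjacency: A[i, j] = 1 if agent i receives from j
              [1, 0, 0],
              [0, 1, 0]])      # chain 0 -> 1 -> 2 (a directed spanning tree)
est = np.zeros(3)
ref, gain, dt = 5.0, 2.0, 0.05

for _ in range(2000):
    err = A @ est - A.sum(axis=1) * est   # consensus error with in-neighbours
    err[0] += ref - est[0]                # only agent 0 accesses the reference
    est = est + dt * gain * err

print(est)  # all estimates converge near the reference value 5.0
```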

    Detection, Recognition and Tracking of Moving Objects from Real-time Video via SP Theory of Intelligence and Species Inspired PSO

    Full text link
    In this paper, we address the basic problem of recognizing moving objects in video images using the SP Theory of Intelligence. The SP Theory of Intelligence, a framework for artificial intelligence first introduced by Gerard J. Wolff, takes its name from Simplicity and Power. Using the concept of multiple alignment, we detect and recognize the object of interest in video frames, with multilevel hierarchical parts and subparts, based on polythetic categories. We then track the recognized objects using species-based Particle Swarm Optimization (PSO). First, we extract the multiple alignment of our object of interest from training images. To recognize accurately and handle occlusion, we apply polythetic concepts on the raw data line to omit redundant noise, searching for the best alignment representing the features among the extracted alignments. We recognize the domain of interest from the video scenes in the form of a wide variety of multiple alignments to handle scene variability. Unsupervised learning is done in the SP model following the DONSVIC principle, and natural structures are discovered via information compression and pattern analysis. After successful recognition of objects, we use the species-based PSO algorithm, as the alignments of our object of interest are analogous to the observation likelihood and fitness ability of species. Subsequently, we analyze the competition and repulsion among species with annealed-Gaussian-based PSO. We have tested our algorithms on the David, Walking2, FaceOcc1, Jogging and Dudek sequences, obtaining very satisfactory and competitive results.
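
    The species-based PSO is not reproduced in full; below is the standard PSO velocity/position update it builds on, where each particle is attracted to its own best and to its species' best (a per-particle "species best" standing in for the usual global best). Inertia and acceleration parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_step(pos, vel, pbest, sbest, w=0.7, c1=1.5, c2=1.5):
    """pos, vel, pbest: (N, D) arrays; sbest: (N, D) per-particle species best."""
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    # Inertia term plus cognitive (own best) and social (species best) pulls.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (sbest - pos)
    return pos + vel, vel
```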