
    Deep Motion Features for Visual Tracking

    Robust visual tracking is a challenging computer vision problem with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied to tracking. Despite their success, these features only capture appearance information. Motion cues, on the other hand, provide discriminative and complementary information that can improve tracking performance. Outside of visual tracking, deep motion features have been successfully applied to action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone. Comment: ICPR 2016. Best paper award in the "Computer Vision and Robot Vision" trac

    Discriminative Scale Space Tracking

    Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when faced with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale-adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5% in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50% higher frame rate compared to the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014. Comment: To appear in TPAMI. This is the journal extension of the VOT2014-winning DSST tracking metho

    Evaluation of trackers for Pan-Tilt-Zoom Scenarios

    Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in computer vision for many years. Compared to tracking with a still camera, the images captured with a PTZ camera are highly dynamic in nature because the camera can perform large motions, resulting in quickly changing capture conditions. Furthermore, tracking with a PTZ camera involves camera control to position the camera on the target. For successful tracking and camera control, the tracker must be fast enough or must be able to accurately predict the next position of the target. Therefore, standard benchmarks do not allow the quality of a tracker to be properly assessed for the PTZ scenario. In this work, we use a virtual PTZ framework to evaluate different tracking algorithms and compare their performance. We also extend the framework to add target position prediction for the next frame, accounting for camera motion and processing delays. By doing this, we can assess whether prediction can make long-term tracking more robust, as it may help slower algorithms keep the target in the field of view of the camera. Results confirm that both speed and robustness are required for tracking under the PTZ scenario. Comment: 6 pages, 2 figures, International Conference on Pattern Recognition and Artificial Intelligence 201