
    Residual Transfer Learning for Multiple Object Tracking

    International audience. To address the Multiple Object Tracking (MOT) challenge, we propose to enhance the tracklet appearance features given by a Convolutional Neural Network (CNN) using the Residual Transfer Learning (RTL) method. Object classification and tracking are significantly different tasks at a high level, and traditional fine-tuning limits the possible variations across the layers of the network because it changes only the last convolutional layers. Our proposed method instead provides more flexibility in modelling the difference between these two tasks through a four-stage training procedure. This transfer approach improves feature performance compared to traditional CNN fine-tuning. Experiments on the MOT17 challenge show results competitive with current state-of-the-art methods.

    Fusion of Head and Full-Body Detectors for Multi-Object Tracking

    In order to track all persons in a scene, the tracking-by-detection paradigm has proven to be a very effective approach. Yet relying solely on a single detector is a major limitation, as useful image information might be ignored. Consequently, this work demonstrates how to fuse two detectors into a tracking system. To obtain the trajectories, we propose to formulate tracking as a weighted graph labeling problem, resulting in a binary quadratic program. As such problems are NP-hard, the solution can only be approximated. Based on the Frank-Wolfe algorithm, we present a new solver that is crucial for handling such difficult problems. Evaluation on pedestrian tracking is provided for multiple scenarios, showing superior results over single-detector tracking and standard QP solvers. Finally, our tracker ranks 2nd on the MOT16 benchmark and 1st on the new MOT17 benchmark, outperforming over 90 trackers.
    Comment: 10 pages, 4 figures; Winner of the MOT17 challenge; CVPRW 201
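
    The abstract above describes relaxing a binary quadratic program and approximating its solution with a Frank-Wolfe based solver. The paper's actual solver and constraints are not given here, so the following is only a minimal sketch of the generic Frank-Wolfe iteration under simplifying assumptions: the binary labels are relaxed to the box [0, 1]^n, the objective is x^T Q x + c^T x, and the result is rounded at 0.5. The names `frank_wolfe_box`, `objective`, `Q`, and `c` are hypothetical.

```python
def objective(Q, c, x):
    """Value of x^T Q x + c^T x."""
    n = len(c)
    quad = sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + sum(ci * xi for ci, xi in zip(c, x))


def frank_wolfe_box(Q, c, iters=100, tol=1e-9):
    """Frank-Wolfe on the box relaxation [0, 1]^n (assumed constraint set)."""
    n = len(c)
    x = [0.5] * n  # start at the centre of the box
    for _ in range(iters):
        # Gradient of x^T Q x + c^T x is (Q + Q^T) x + c.
        g = [sum((Q[i][j] + Q[j][i]) * x[j] for j in range(n)) + c[i]
             for i in range(n)]
        # Linear minimisation over the box: pick the vertex minimising g . s.
        s = [1.0 if gi < 0 else 0.0 for gi in g]
        d = [si - xi for si, xi in zip(s, x)]
        # Frank-Wolfe gap; zero means no descent direction towards any vertex.
        gap = -sum(gi * di for gi, di in zip(g, d))
        if gap <= tol:
            break
        # Exact line search for a quadratic along d, clipped to [0, 1].
        curv = sum(d[i] * Q[i][j] * d[j] for i in range(n) for j in range(n))
        gamma = 1.0 if curv <= 0 else min(1.0, gap / (2.0 * curv))
        x = [xi + gamma * di for xi, di in zip(x, d)]
    return x


# Toy problem: two conflicting hypotheses (off-diagonal penalty) with
# unary rewards 1.0 and 0.5; at most one of them should be selected.
Q = [[0.0, 1.0], [1.0, 0.0]]
c = [-1.0, -0.5]
relaxed = frank_wolfe_box(Q, c)
labels = [1 if xi >= 0.5 else 0 for xi in relaxed]
```

    In this toy instance the off-diagonal entries of `Q` penalise selecting both hypotheses together, so the iteration settles on the one with the larger reward. A real tracker would add assignment constraints that this box relaxation omits.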

    Novel data association methods for online multiple human tracking

    PhD Thesis. Video-based multiple human tracking has played a crucial role in many applications such as intelligent video surveillance, human behavior analysis, and health-care systems. The detection-based tracking framework has become the dominant paradigm in this research field, and the major task is to accurately perform the data association between detections across the frames. However, online multiple human tracking, which relies only on the detections given up to the present time for the data association, becomes more challenging with noisy detections, missed detections, and occlusions. To address these problems, three novel data association methods for online multiple human tracking are presented in this thesis: online group-structured dictionary learning, enhanced detection reliability, and multi-level cooperative fusion. The first proposed method aims to address noisy detections and occlusions. In this method, sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering is the core element for accomplishing the tracking task, where the measurements are produced by the detection-based tracking framework. To enhance the measurement model, a novel adaptive gating strategy is developed to aid the classification of measurements. In addition, online group-structured dictionary learning with a maximum voting method is proposed to robustly estimate the target birth intensity. It enables the new-born targets in the tracking process to be accurately initialized from noisy sensor measurements. To improve the adaptability of the group-structured dictionary to target appearance changes, the simultaneous codeword optimization (SimCO) algorithm is employed for the dictionary update. The second proposed method relates to accurate measurement selection of detections, which further refines the noisy detections prior to the tracking pipeline. In order to achieve more reliable measurements in the Gaussian mixture (GM)-PHD filtering process, a global-to-local enhanced confidence rescoring strategy is proposed by exploiting the classification power of a mask region-convolutional neural network (R-CNN). Then, an improved pruning algorithm, namely soft-aggregated non-maximal suppression (Soft-ANMS), is devised to further enhance the selection step. In addition, to avoid the misuse of ambiguous measurements in the tracking process, person re-identification (ReID) features driven by convolutional neural networks (CNNs) are integrated to model the target appearances. The third proposed method focuses on addressing the issues of missed detections and occlusions. This method integrates two human detectors with different characteristics (full-body and body-parts) in the GM-PHD filter, and investigates their complementary benefits for tracking multiple targets. For each detector domain, a novel discriminative correlation matching (DCM) model for integration in the feature-level fusion is proposed, and together with spatio-temporal information it is used to reduce the ambiguous identity associations in the GM-PHD filter. Moreover, a robust fusion center is proposed within the decision-level fusion to mitigate the sensitivity to missed detections in the fusion process, thereby improving the fusion performance and tracking consistency. The effectiveness of these proposed methods is investigated using the MOTChallenge benchmark, which is a framework for the standardized evaluation of multiple object tracking methods. Detailed evaluations on challenging video datasets, as well as comparisons with recent state-of-the-art techniques, confirm the improved multiple human tracking performance.
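
    The thesis abstract mentions a pruning step based on soft non-maximal suppression. The Soft-ANMS variant itself is not specified in this listing, so the sketch below shows only the widely used Gaussian soft-NMS idea it presumably builds on: overlapping detections have their scores decayed instead of being discarded outright. The function names, the `(x1, y1, x2, y2)` box layout, and the values of `sigma` and `score_threshold` are assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian soft-NMS: returns (box, score) pairs in selection order."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring remaining detection.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        if score < score_threshold:
            break  # everything left scores even lower
        kept.append((box, score))
        # Decay the scores of the others according to their overlap with it.
        remaining = [(b, s * math.exp(-(iou(box, b) ** 2) / sigma))
                     for b, s in remaining]
    return kept


# Two heavily overlapping detections and one isolated detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

    Here the second box overlaps the first, so its score is reduced and it is selected last, while the isolated third box keeps its original score.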