
    Real-time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification

    Online multi-object tracking is a fundamental problem in time-critical video analysis applications. A major challenge in the popular tracking-by-detection framework is how to associate unreliable detection results with existing tracks. In this paper, we propose to handle unreliable detection by collecting candidates from the outputs of both detection and tracking. The intuition behind generating redundant candidates is that detections and tracks can complement each other in different scenarios: high-confidence detection results prevent tracking drift in the long term, and track predictions can handle noisy detections caused by occlusion. To select optimally from a considerable number of candidates in real time, we present a novel scoring function based on a fully convolutional neural network, which shares most computations over the entire image. Moreover, we adopt a deeply learned appearance representation, trained on large-scale person re-identification datasets, to improve the identification ability of our tracker. Extensive experiments show that our tracker achieves real-time and state-of-the-art performance on a widely used people tracking benchmark. Comment: ICME 201
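
The candidate-collection idea in this abstract can be illustrated with a minimal sketch. It assumes every candidate already carries a score (in the paper the score comes from a fully convolutional network, which is not reproduced here) and uses plain greedy non-maximum suppression as a stand-in selection step. The names `Candidate`, `select_candidates`, `min_score`, and `iou_threshold` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Candidate:
    box: Box
    score: float   # assumed to be given; the paper learns this score
    source: str    # "detection" or "track"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_candidates(detections: List[Candidate],
                      track_predictions: List[Candidate],
                      min_score: float = 0.4,
                      iou_threshold: float = 0.3) -> List[Candidate]:
    """Pool both candidate sources, drop low scores, then apply greedy NMS."""
    pool = [c for c in detections + track_predictions if c.score >= min_score]
    pool.sort(key=lambda c: c.score, reverse=True)
    kept: List[Candidate] = []
    for cand in pool:
        # Keep a candidate only if it does not overlap a better one too much.
        if all(iou(cand.box, k.box) <= iou_threshold for k in kept):
            kept.append(cand)
    return kept


detections = [
    Candidate((0, 0, 10, 10), 0.9, "detection"),
    Candidate((50, 50, 60, 60), 0.2, "detection"),  # noisy, e.g. occluded
]
track_predictions = [
    Candidate((1, 1, 11, 11), 0.6, "track"),        # duplicate of first detection
    Candidate((50, 50, 60, 60), 0.7, "track"),      # fills in for weak detection
]
selected = select_candidates(detections, track_predictions)
print([(c.source, c.score) for c in selected])
```

In this toy input the track prediction replaces the weak detection of the second person, while the redundant track prediction for the first person is suppressed by the stronger detection.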

    Learning Non-Uniform Hypergraph for Multi-Object Tracking

    The majority of Multi-Object Tracking (MOT) algorithms based on the tracking-by-detection scheme do not use higher-order dependencies among objects or tracklets, which makes them less effective in handling complex scenarios. In this work, we present a new near-online MOT algorithm based on a non-uniform hypergraph, which can model different degrees of dependencies among tracklets in a unified objective. The nodes in the hypergraph correspond to the tracklets, and the hyperedges with different degrees encode various kinds of dependencies among them. Specifically, instead of being set empirically, the weights of hyperedges with different degrees are learned automatically using the structural support vector machine (SSVM) algorithm. Several experiments are carried out on various challenging datasets (PETS09, the ParkingLot sequence, SubwayFace, and the MOT16 benchmark) to demonstrate that our method achieves favorable performance against state-of-the-art MOT methods. Comment: 11 pages, 4 figures, accepted by AAAI 201

    Confidence-Aware Pedestrian Tracking Using a Stereo Camera

    Pedestrian tracking is a significant problem in autonomous driving. The majority of studies carry out tracking in the image domain, which is not sufficient for many realistic applications such as path planning, collision avoidance, and autonomous navigation. In this study, we address pedestrian tracking using stereo images and tracking-by-detection. Our framework has three primary phases: (1) people are detected in image space by the Mask R-CNN detector, and their positions in 3D space are computed using stereo information; (2) corresponding detections are assigned to each other across consecutive frames based on visual characteristics and 3D geometry; and (3) the current positions of pedestrians are corrected from their previous states using an extended Kalman filter. We use our tracking-to-confirm-detection method, in which detections are treated differently depending on their confidence metrics, to obtain a high recall value while keeping the number of false positives low. While existing methods assume that all target trajectories have equal accuracy, we estimate a confidence value for each trajectory at every epoch. Thus, depending on their confidence values, the targets can contribute differently to the whole tracking system. The performance of our approach is evaluated using the KITTI benchmark dataset. It shows promising results comparable to those of other state-of-the-art methods.
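
Phase (3) of this abstract, combined with the confidence-dependent treatment of detections, can be sketched as follows. This is only an illustration under stated assumptions: it uses a linear constant-velocity Kalman filter on a single coordinate in place of the extended Kalman filter named in the abstract, and the rule that scales measurement noise by detection confidence is a hypothetical choice, not the authors' method. The names `ConstantVelocityKF`, `base_noise`, and `process_noise` are invented for the example.

```python
class ConstantVelocityKF:
    """Linear Kalman filter for one coordinate with state (position, velocity).

    A simplified stand-in for the extended Kalman filter in the abstract.
    """

    def __init__(self, position: float, process_noise: float = 0.01):
        self.p = position          # estimated position
        self.v = 0.0               # estimated velocity
        # Covariance matrix entries [[p00, p01], [p10, p11]].
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 1.0
        self.q = process_noise

    def predict(self, dt: float) -> None:
        """Propagate the state with x <- F x and P <- F P F^T + Q."""
        self.p += self.v * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        self.p00, self.p01, self.p10 = p00 + self.q, p01, p10
        self.p11 += self.q

    def update(self, measured_position: float, confidence: float,
               base_noise: float = 0.1) -> None:
        """Correct the state with a position measurement.

        Assumption: low-confidence detections get larger measurement noise,
        so they pull the estimate less.
        """
        r = base_noise / max(confidence, 1e-3)
        s = self.p00 + r                       # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s    # Kalman gain
        residual = measured_position - self.p
        self.p += k0 * residual
        self.v += k1 * residual
        p00, p01 = self.p00, self.p01
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p10 = self.p10 - k1 * p00
        self.p11 = self.p11 - k1 * p01


def corrected_position(confidence: float) -> float:
    """Run one predict/update cycle from position 0 with a measurement at 1."""
    kf = ConstantVelocityKF(position=0.0)
    kf.predict(dt=1.0)
    kf.update(measured_position=1.0, confidence=confidence)
    return kf.p


high = corrected_position(confidence=1.0)
low = corrected_position(confidence=0.01)
print(high, low)
```

With identical measurements, the high-confidence detection moves the estimate most of the way to the measured position, while the low-confidence detection leaves it close to the prediction.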

    Improving Multi-Frame Data Association with Sparse Representations for Robust Near-Online Multi-Object Tracking

    Multiple Object Tracking remains a difficult problem due to appearance variations and occlusions of the targets, as well as detection failures. Using sophisticated appearance models and performing data association over multiple frames are two common approaches that lead to performance gains. Inspired by the success of sparse representations in Single Object Tracking, we propose to formulate the multi-frame data association step as an energy minimization problem, designing an energy that efficiently exploits sparse representations of all detections. Furthermore, we propose to use a structured sparsity-inducing norm to compute representations better suited to the tracking context. We perform extensive experiments to demonstrate the effectiveness of the proposed formulation, and evaluate our approach on two authoritative public benchmarks in order to compare it with several state-of-the-art methods.

    MOTChallenge: A Benchmark for Single-Camera Multiple Target Tracking

    Standardized benchmarks have been crucial in pushing the performance of computer vision algorithms, especially since the advent of deep learning. Although leaderboards should not be over-claimed, they often provide the most objective measure of performance and are therefore important guides for research. We present MOTChallenge, a benchmark for single-camera Multiple Object Tracking (MOT) launched in late 2014, to collect existing and new data and to create a framework for the standardized evaluation of multiple object tracking methods. The benchmark is focused on multiple people tracking, since pedestrians are by far the most studied object in the tracking community, with applications ranging from robot navigation to self-driving cars. This paper collects the first three releases of the benchmark: (i) MOT15, along with numerous state-of-the-art results that were submitted in recent years; (ii) MOT16, which contains new challenging videos; and (iii) MOT17, which extends the MOT16 sequences with more precise labels and evaluates tracking performance on three different object detectors. The second and third releases not only offer a significant increase in the number of labeled boxes but also provide labels for multiple object classes besides pedestrians, as well as the level of visibility for every single object of interest. We finally provide a categorization of state-of-the-art trackers and a broad error analysis. This will help newcomers understand the related work and research trends in the MOT community, and hopefully shed some light on potential future research directions. Comment: Accepted at IJC