thesis

Generic multiple object tracking

Abstract

Multiple object tracking is an important problem in the computer vision community because of its applications, which include visual surveillance, crowd behavior analysis and robotics. The problem is difficult because of challenges such as frequent occlusion, interaction between objects and high-degree articulation. In recent years, data association based approaches have been successful in tracking multiple pedestrians, but they are built on top of object detectors trained for a specific kind of object and are therefore type-specific. This may constrain their application in scenarios where type-specific object detectors are unavailable. In view of this, I investigate in this thesis tracking multiple objects without ready-to-use, type-specific object detectors. More specifically, the problem of multiple object tracking is generalized to tracking targets of a generic type; that is, the objects to be tracked are no longer constrained to a specific kind of object. This problem is termed Generic Multiple Object Tracking (GMOT), and it is handled by three approaches presented in this thesis. In the first approach, a generic object detector is learned from the manual annotation of only one initial bounding box. The detector is then employed to regularize the online learning procedure of multiple trackers, each specialized to one object. More specifically, the trackers are learned simultaneously with shared features and are guided to stay close to the detector. Experimental results show considerable improvement on this problem compared with state-of-the-art methods. The second approach treats detection and tracking of multiple generic objects as a bi-label propagation procedure, which consists of class label propagation (detection) and object label propagation (tracking). In particular, cluster Multiple Task Learning (cMTL) is employed along with spatio-temporal consistency to address the online detection problem. The tracking problem is addressed by associating existing trajectories with new detection responses, taking appearance, motion and context information into account. The advantages of this approach are verified by extensive experiments on several public data sets. The two aforementioned approaches handle GMOT in an online manner. In contrast, the third work proposes a batch method, which dynamically clusters given detection hypotheses into groups corresponding to individual objects. Inspired by the success of topic models in textual tasks, a Dirichlet Process Mixture Model (DPMM) is used to address the tracking problem in cooperation with so-called must-links and cannot-links, which are proposed to avoid physical collisions. Moreover, two kinds of representation, superpixels and the Deformable Part Model (DPM), are introduced to track both rigid and non-rigid objects. The effectiveness of the proposed method is demonstrated with experiments on public data sets.
