We describe a method for detecting and tracking humans. Unlike most previous work, we focus on humans with extensive pose articulations, in situations where there is typically only a single camera, multiple humans are present, and the image resolution is low. In our method, pose clusters are learned from an embedded silhouette manifold. A set of object detectors, one per pose cluster, is trained using a novel Object-Weighted Appearance Model. A probabilistic pose-based transition model uses the detection responses to track multiple objects within a sliding window buffer. The track segments from successive sliding windows are then connected sequentially into full trajectories. Experiments on a set of challenging surveillance videos show that our approach performs well compared with standard pedestrian detectors under difficult conditions.