59 research outputs found
A New Benchmark for Stereo-Based Pedestrian Detection
Pedestrian detection is a rapidly evolving area in the intelligent vehicles domain. Stereo vision is an attractive sensor for this purpose. But unlike for monocular vision, there are no realistic, large-scale benchmarks available for stereo-based pedestrian detection to provide a common point of reference for evaluation. This paper introduces the Daimler Stereo-Vision Pedestrian Detection benchmark, which consists of several thousand pedestrians in the training set, and a 27-min test drive through an urban environment with associated vehicle data. The data, including ground truth, is made publicly available for non-commercial purposes. The paper furthermore quantifies the benefit of stereo vision for ROI generation and localization; at equal detection rates, false positives are reduced by a factor of 4-5 with stereo over mono, using the same HOG/linSVM classification component.
The benefits of dense stereo for pedestrian detection
This paper presents a novel pedestrian detection system for intelligent vehicles. We propose the use of dense stereo for both the generation of regions of interest and pedestrian classification. Dense stereo allows the dynamic estimation of camera parameters and the road profile, which, in turn, provides strong scene constraints on possible pedestrian locations. For classification, we extract spatial features (gradient orientation histograms) directly from dense depth and intensity images. Both modalities are represented in terms of individual feature spaces, in which discriminative classifiers (linear support vector machines) are learned. We refrain from constructing a joint feature space and instead fuse depth and intensity at the classifier level. Our experiments involve challenging image data captured in complex urban environments (i.e., undulating roads and speed bumps). Our results show a performance improvement by up to a factor of 7.5 at the classification level and up to a factor of 5 at the tracking level (reduction in false alarms at constant detection rates) over a system with static scene constraints and intensity-only classification.
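The classifier-level fusion described in this abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted sum of decision values used here is an assumption, as are all function names, feature values, weights, and biases; the short vectors merely stand in for gradient orientation histograms.

```python
def linear_score(features, weights, bias):
    """Signed decision value of a linear classifier: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def fuse_scores(intensity_score, depth_score, w_intensity=0.5, w_depth=0.5):
    """Late fusion (assumed rule): weighted sum of per-modality decision values."""
    return w_intensity * intensity_score + w_depth * depth_score


# Hypothetical toy features; each modality keeps its own feature space,
# so the two vectors need not have the same length.
intensity_feat = [0.2, 0.8, 0.1]
depth_feat = [0.6, 0.3]

# Hypothetical parameters of two independently trained linear classifiers.
intensity_w, intensity_b = [1.0, 2.0, -1.0], -1.0
depth_w, depth_b = [2.0, -1.0], -0.5

s_int = linear_score(intensity_feat, intensity_w, intensity_b)
s_dep = linear_score(depth_feat, depth_w, depth_b)

# The features are never concatenated; only the two scores are combined.
fused = fuse_scores(s_int, s_dep)
is_pedestrian = fused > 0.0
```

Because each modality is scored in its own feature space, either classifier could be retrained or reweighted without touching the other, which is one practical reason to prefer this over a joint feature space.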
- …