    On Rendering Synthetic Images for Training an Object Detector

    We propose a novel approach to synthesizing images that are effective for training object detectors. Starting from a small set of real images, our algorithm estimates the rendering parameters required to synthesize similar images given a coarse 3D model of the target object. These parameters can then be reused to generate an unlimited number of training images of the object of interest in arbitrary 3D poses, improving detection performance. A key insight of our approach is that the synthetically generated images should be similar to the real images not in terms of image quality, but in terms of the features used during detector training. We show in the context of drone, plane, and car detection that using such synthetically generated images yields significantly better performance than simply perturbing real images, or even than synthesizing images so that they look very realistic, as is often done when only limited amounts of training data are available.
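    The sketch below illustrates the abstract's key idea under stated assumptions: estimate rendering parameters by matching synthetic and real images in feature space rather than pixel space. Everything here is a stand-in of our own devising, not the authors' pipeline: the "renderer" draws a bright square instead of projecting a coarse 3D model, the two parameters (blur_sigma, noise_std) and the HOG features are illustrative choices, and the "real" images are simulated with fixed hidden parameters.

```python
import numpy as np
from skimage.feature import hog
from skimage.filters import gaussian

rng = np.random.default_rng(0)

def hog_features(images):
    """Stack HOG descriptors: the kind of features a detector is trained on."""
    return np.stack([hog(im, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                     for im in images])

def render(blur_sigma, noise_std, n=8, size=64):
    """Placeholder renderer: draws a bright square (stand-in for rendering the
    coarse 3D model) at random positions, then applies blur and noise, the two
    rendering parameters we want to estimate."""
    out = []
    for _ in range(n):
        im = np.zeros((size, size))
        r, c = rng.integers(8, size - 24, size=2)
        im[r:r + 16, c:c + 16] = 1.0
        im = gaussian(im, sigma=blur_sigma)
        im += rng.normal(0.0, noise_std, im.shape)
        out.append(np.clip(im, 0.0, 1.0))
    return out

# A small set of "real" images (here: simulated with fixed, hidden parameters).
real = render(blur_sigma=1.5, noise_std=0.05)
real_mean = hog_features(real).mean(axis=0)

# Estimate rendering parameters by matching mean feature statistics,
# not pixel values or visual realism.
best = min(((sigma, noise) for sigma in (0.5, 1.0, 1.5, 2.0)
            for noise in (0.0, 0.05, 0.1)),
           key=lambda p: np.linalg.norm(
               hog_features(render(*p)).mean(axis=0) - real_mean))
print("estimated (blur_sigma, noise_std):", best)
```

    Once the grid search recovers parameters whose renders match the real images in feature space, the same renderer can, per the abstract, be run at arbitrary poses to produce an unlimited stream of training images.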

    Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features

    We propose a simple yet effective approach to the problem of pedestrian detection which outperforms the current state-of-the-art. Our new features are built on the basis of low-level visual features and spatial pooling. Incorporating spatial pooling improves the translational invariance, and thus the robustness, of the detection process. We then directly optimise the partial area under the ROC curve (pAUC) measure, which concentrates detection performance in the range of most practical importance. The combination of these factors leads to a pedestrian detector which outperforms all competitors on all of the standard benchmark datasets. We advance state-of-the-art results by lowering the average miss rate from 13% to 11% on the INRIA benchmark, 41% to 37% on the ETH benchmark, 51% to 42% on the TUD-Brussels benchmark, and 36% to 29% on the Caltech-USA benchmark.

    Comment: 16 pages. Appearing in Proc. European Conf. Computer Vision (ECCV) 2014.
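    The sketch below illustrates the two ingredients named in the abstract, under stated assumptions: max-pooling low-level feature maps over spatial cells, and scoring a detector by the partial AUC over the low-false-positive range. The feature maps, cell size, and mean-score "detector" are hypothetical stand-ins; the paper builds pooled channels on top of richer low-level features and learns the detector weights by optimising the pAUC directly, whereas sklearn's max_fpr option only evaluates a (McClish-standardised) partial AUC.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def spatially_pooled(feature_map, cell=4):
    """Max-pool a low-level feature map (H x W x D) over non-overlapping
    spatial cells: small shifts of the pedestrian leave the pooled
    descriptor nearly unchanged, which is the translational robustness
    the abstract refers to."""
    h, w, d = feature_map.shape
    h, w = h - h % cell, w - w % cell
    fm = feature_map[:h, :w].reshape(h // cell, cell, w // cell, cell, d)
    return fm.max(axis=(1, 3)).ravel()

# Toy data: noisy low-level feature maps whose mean shifts with the label
# (1 = pedestrian window, 0 = background window).
labels = rng.integers(0, 2, size=200)
maps = [rng.random((32, 16, 8)) + 0.4 * y for y in labels]
X = np.stack([spatially_pooled(fm) for fm in maps])

# Stand-in scorer: the mean of the pooled features. The paper instead learns
# weights over these features by optimising the pAUC directly.
detector_scores = X.mean(axis=1)

# Partial AUC over the low-false-positive range of most practical importance.
pauc = roc_auc_score(labels, detector_scores, max_fpr=0.1)
print(f"pAUC (FPR <= 0.1): {pauc:.3f}")
```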