Aggregated Channels Network for Real-Time Pedestrian Detection
Convolutional neural networks (CNNs) have demonstrated their superiority in numerous computer vision tasks, yet their computational cost is prohibitive for many real-time applications such as pedestrian detection, which is usually performed on low-power hardware. To alleviate this drawback, most strategies adopt a two-stage cascade approach: in the first stage, a fast method generates a small but significant set of high-quality proposals that are then, in the second stage, evaluated by the CNN. In this work, we propose a novel detection pipeline that takes further advantage of the two-stage cascade strategy. More concretely, the enriched and subsequently compressed features used in the first stage are reused as the CNN input. As a consequence, a simpler network architecture, adapted to such small input sizes, achieves real-time performance and obtains results close to the state of the art while running significantly faster without a GPU. In particular, considering that the proposed pipeline runs at frame rate, the achieved performance is highly competitive. We furthermore demonstrate that the proposed pipeline can itself serve as an effective proposal generator.
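The cascade idea the abstract describes can be sketched as follows. This is a minimal numpy illustration, not the paper's model: the window count, feature dimensionality, and the two linear scorers are all invented stand-ins for the fast first-stage detector and the small CNN, but the flow (score everything cheaply, then re-evaluate only the surviving proposals using the same compressed features) matches the described pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 1000 dense windows, each described by 64
# compressed channel features computed once in the first stage.
n_windows, n_features = 1000, 64
features = rng.normal(size=(n_windows, n_features))

# Stage 1: a cheap linear scorer keeps only a small set of proposals.
w_fast = rng.normal(size=n_features)
stage1_scores = features @ w_fast
top_k = 50
proposals = np.argsort(stage1_scores)[-top_k:]   # surviving window indices

# Stage 2: a stronger classifier (the CNN in the paper; linear here)
# evaluates ONLY the proposals, reusing the stage-1 features as input
# instead of recomputing anything from raw pixels.
w_strong = rng.normal(size=n_features)
stage2_scores = features[proposals] @ w_strong
detections = proposals[stage2_scores > 0]

print(len(proposals), len(detections))
```

The key saving is that the expensive second stage runs on 50 windows rather than 1000, and its input features come for free from stage 1.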
Pedestrian Segmentation from Uncalibrated Monocular Videos Using a Projection Map
We present a new method for segmenting the foreground regions of multiple pedestrians in monocular surveillance videos. This method requires neither camera calibration nor a planar-ground assumption. The size and orientation of a pedestrian projection are estimated at each image point and registered in a pedestrian projection map. Individual pedestrians are segmented from the foreground region of input images using an expectation maximization (EM) algorithm together with the constructed map.
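The EM step above can be illustrated with a deliberately simplified numpy sketch: two pedestrians' foreground pixels (x-coordinates only), with the projection map reduced to a single expected-width constant that fixes the component spread. The data, initialization, and one-dimensional setup are all assumptions for illustration, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical foreground pixels from two pedestrians (x-coordinates only).
xs = np.concatenate([rng.normal(100, 8, 300), rng.normal(160, 8, 300)])
expected_width = 8.0          # stand-in for the projection-map size estimate

# EM for a 2-component mixture whose variance is fixed by the map:
means = np.array([90.0, 170.0])
for _ in range(20):
    # E-step: responsibility of each pedestrian for each foreground pixel
    d = (xs[:, None] - means[None, :]) ** 2
    resp = np.exp(-d / (2 * expected_width**2))
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate pedestrian centers from soft assignments
    means = (resp * xs[:, None]).sum(axis=0) / resp.sum(axis=0)

labels = resp.argmax(axis=1)  # per-pixel pedestrian assignment
print(np.sort(means).round(1))
```

In the paper the mixture components are 2-D pedestrian projections whose size and orientation come from the map at each image point; the recursion (soft assignment, then re-estimation) is the same.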
Improving Object Localization Using Macrofeature Layout Selection
A macrofeature layout selection is proposed for object detection. Macrofeatures [2] are mid-level features that jointly encode a set of low-level features in a neighborhood. Our method employs line, triangle, and pyramid layouts, which are composed of several local blocks in a multi-scale feature pyramid. The method is integrated into boosting for detection, where the best layout is selected for a weak classifier at each iteration. The proposed algorithm is applied to pedestrian detection and compared with several state-of-the-art techniques on public datasets.
Macrofeature layout selection for pedestrian localization and its acceleration using GPU
Macrofeatures are mid-level features that jointly encode a set of low-level features in a neighborhood. We propose a macrofeature layout selection technique to improve localization performance in an object detection task. Our method employs line, triangle, and pyramid layouts, which are composed of several local blocks represented by Histograms of Oriented Gradients (HOG) features in a multi-scale feature pyramid. Such macrofeature layouts are integrated into a boosting framework for object detection, where the best layout is selected greedily to build a weak classifier at each iteration. The proposed algorithm is applied to pedestrian detection and implemented on GPU. Our pedestrian detection algorithm outperforms several state-of-the-art techniques on public datasets in detection and localization accuracy while remaining highly efficient.
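The greedy layout selection inside a boosting round can be sketched as below. Everything concrete here is a stand-in: the "layouts" are just tuples of block indices whose scores are pooled by summation, the data is synthetic, and the weak learner is a brute-force decision stump, not the paper's HOG-based implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 400 samples, each with 12 block-wise HOG-like scores.
n, n_blocks = 400, 12
X = rng.normal(size=(n, n_blocks))
y = np.where(X[:, [0, 3, 7]].sum(axis=1) > 0, 1, -1)   # synthetic labels

# Candidate macrofeature layouts: index sets standing in for the
# line / triangle / pyramid block arrangements.
layouts = [(0, 1), (0, 3, 7), (2, 5, 8, 11), (4, 9)]
weights = np.full(n, 1.0 / n)                          # boosting sample weights

def stump_error(score, y, w):
    """Weighted error of the best threshold/polarity decision stump."""
    best = 1.0
    for t in np.quantile(score, np.linspace(0.05, 0.95, 19)):
        for pol in (1, -1):
            pred = np.where(pol * (score - t) > 0, 1, -1)
            best = min(best, w[pred != y].sum())
    return best

# Greedy selection: the layout whose pooled score yields the lowest
# weighted stump error becomes this round's weak classifier.
errors = [stump_error(X[:, list(L)].sum(axis=1), y, weights)
          for L in layouts]
best_layout = layouts[int(np.argmin(errors))]
print(best_layout, round(min(errors), 3))
```

After each round, boosting would reweight the samples and the selection would repeat, so different layouts can win in different rounds.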
Learning Occlusion with Likelihoods for Visual Tracking
We propose a novel algorithm to detect occlusion for visual tracking through learning with observation likelihoods. In our technique, the target is divided into regular grid cells and the state of occlusion is determined for each cell using a classifier. Each cell in the target is associated with many small patches, and the patch likelihoods observed during tracking construct a feature vector, which is used for classification. Since occlusion is learned from patch likelihoods instead of the patches themselves, the classifier is universally applicable to any videos or objects for occlusion reasoning. Our occlusion detection algorithm is sufficiently accurate to improve tracking performance significantly. The proposed algorithm can be combined with many generic tracking methods, and we adopt the L1 minimization tracker to test the performance of our framework. The advantage of our algorithm is supported by quantitative and qualitative evaluation, and successful tracking and occlusion reasoning results are illustrated on many challenging video sequences.
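The per-cell representation can be illustrated with a toy numpy sketch. The grid size, likelihood ranges, and the thresholding "classifier" are invented for illustration (the paper trains a real classifier on these feature vectors); what the sketch preserves is that each cell's feature vector is built from patch likelihoods, which is why the decision rule transfers across videos and objects.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical setup: a target split into a 4x4 grid (16 cells); each
# cell collects observation likelihoods from 10 small patches per frame.
n_cells, n_patches = 16, 10
visible = np.ones(n_cells, dtype=bool)
visible[:4] = False                               # top row occluded

# Visible cells yield high patch likelihoods, occluded cells low ones.
lik = np.where(visible[:, None],
               rng.uniform(0.6, 1.0, (n_cells, n_patches)),
               rng.uniform(0.0, 0.3, (n_cells, n_patches)))

# Feature vector per cell = its patch likelihoods. A trivial stand-in
# classifier thresholds the mean; because the inputs are likelihoods
# rather than raw pixels, the same rule applies to any target.
pred_visible = lik.mean(axis=1) > 0.45
print(pred_visible.astype(int))
```

A tracker would then trust only the cells classified as visible when computing the target's overall observation score.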
Generalized Background Subtraction Based on Hybrid Inference by Belief Propagation and Bayesian Filtering
We propose a novel background subtraction algorithm for videos captured by a moving camera. In our technique, foreground and background appearance models in each frame are constructed and propagated sequentially by Bayesian filtering. We estimate the posterior of appearance, which is computed as the product of the image likelihood in the current frame and the prior appearance propagated from the previous frame. The motion, which transfers the previous appearance models to the current frame, is estimated by nonparametric belief propagation; the initial motion field is obtained by optical flow, and noisy and incomplete motions are corrected effectively through the inference procedure. Our framework is represented by a graphical model, where the sequential inference of motion and appearance is performed by the combination of belief propagation and Bayesian filtering. We compare our algorithm with the existing state-of-the-art technique and evaluate its performance quantitatively and qualitatively on several challenging videos.
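The Bayesian filtering step (posterior proportional to current likelihood times propagated prior) reduces, per pixel, to a standard recursive Bayes update. The sketch below assumes fixed per-pixel likelihoods and omits the motion-compensated propagation and belief-propagation inference that the paper couples with it; the example pixels and likelihood values are invented.

```python
import numpy as np

def bayes_update(prior_fg, lik_fg, lik_bg):
    """Foreground posterior ∝ foreground likelihood × propagated prior."""
    post = lik_fg * prior_fg
    return post / (post + lik_bg * (1.0 - prior_fg))

prior = np.full(4, 0.5)                    # 4 example pixels, flat prior
lik_fg = np.array([0.9, 0.9, 0.1, 0.1])    # likelihood under foreground model
lik_bg = np.array([0.1, 0.1, 0.9, 0.9])    # likelihood under background model

for _ in range(3):                         # three frames of consistent evidence
    prior = bayes_update(prior, lik_fg, lik_bg)
print(prior.round(3))
```

With 9:1 evidence per frame, three frames drive the posterior odds to 9³:1, so pixels consistently explained by one model converge quickly, while in the full method the motion field estimated by belief propagation decides where each prior is propagated to before the update.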