Search CORE

46 research outputs found

Unsupervised Network Pretraining via Encoding Human Design

Author: Chen Xi
Liu Ming-Yu
Mallya Arun
Tuzel Oncel C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/01/2016
Field of study

Over the years, computer vision researchers have spent an immense amount of effort on designing image features for the visual object recognition task. We propose to incorporate this valuable experience to guide the task of training deep neural networks. Our idea is to pretrain the network through the task of replicating the process of hand-designed feature extraction. By learning to replicate the process, the neural network integrates previous research knowledge and learns to model visual objects in a way similar to the hand-designed features. In the succeeding finetuning step, it further learns object-specific representations from labeled data and this boosts its classification power. We pretrain two convolutional neural networks where one replicates the process of histogram of oriented gradients feature extraction, and the other replicates the process of region covariance feature extraction. After finetuning, we achieve substantially better performance than the baseline methods.Comment: 9 pages, 11 figures, WACV 2016: IEEE Conference on Applications of Computer Visio

arXiv.org e-Print Archive

Crossref

Grid Loss: Detecting Occluded Faces

Author: C Dubout
C Garcia
D Chen
H Rowley
J Yan
M Everingham
M Mathias
N Srivastava
P Dollár
P Viola
PF Felzenszwalb
R Vaillant
S Zafeiriou
Publication venue
Publication date: 01/09/2016
Field of study

Detection of partially occluded objects is a challenging computer vision problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of the detection window are occluded, since not every sub-part of the window is discriminative on its own. To address this issue, we propose a novel loss layer for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a convolution layer independently rather than over the whole feature map. This results in parts being more discriminative on their own, enabling the detector to recover if the detection window is partially occluded. By mapping our loss layer back to a regular fully connected layer, no additional computational cost is incurred at runtime compared to standard CNNs. We demonstrate our method for face detection on several public face detection benchmarks and show that our method outperforms regular CNNs, is suitable for realtime applications and achieves state-of-the-art performance.Comment: accepted to ECCV 201

arXiv.org e-Print Archive

Crossref

Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features

Author: Hengel Anton van den
Paisitkriangkrai Sakrapee
Shen Chunhua
Publication venue
Publication date: 01/01/2014
Field of study

We propose a simple yet effective approach to the problem of pedestrian detection which outperforms the current state-of-the-art. Our new features are built on the basis of low-level visual features and spatial pooling. Incorporating spatial pooling improves the translational invariance and thus the robustness of the detection process. We then directly optimise the partial area under the ROC curve (\pAUC) measure, which concentrates detection performance in the range of most practical importance. The combination of these factors leads to a pedestrian detector which outperforms all competitors on all of the standard benchmark datasets. We advance state-of-the-art results by lowering the average miss rate from

13\%

11\%

on the INRIA benchmark,

41\%

37\%

on the ETH benchmark,

51\%

42\%

on the TUD-Brussels benchmark and

36\%

29\%

on the Caltech-USA benchmark.Comment: 16 pages. Appearing in Proc. European Conf. Computer Vision (ECCV) 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Adelaide Research & Scholarship

Learning Complexity-Aware Cascades for Deep Pedestrian Detection

Author: Cai Zhaowei
Saberian Mohammad
Vasconcelos Nuno
Publication venue
Publication date: 19/07/2015
Field of study

The design of complexity-aware cascaded detectors, combining features of very different complexities, is considered. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted as complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate patches remain to be classified. This enables the use of features of vastly different complexities in a single detector. In result, the feature pool can be expanded to features previously impractical for cascade design, such as the responses of a deep convolutional neural network (CNN). This is demonstrated through the design of a pedestrian detector with a pool of features whose complexities span orders of magnitude. The resulting cascade generalizes the combination of a CNN with an object proposal mechanism: rather than a pre-processing stage, CompACT cascades seamlessly integrate CNNs in their stages. This enables state of the art performance on the Caltech and KITTI datasets, at fairly fast speeds

arXiv.org e-Print Archive

Crossref