11 research outputs found

PASCAL VOC Challenge “Lifetime Achievement” Prize 2010

Author: Advisor Pedro F. Felzenszwalb
Mentor Jitendra Malik
Ross B. Girshick

Outstanding Reviewer Award CVPR 201

CiteSeerX

Speeding up Convolutional Neural Networks with Low Rank Expansions

Author: Jaderberg Max
Vedaldi Andrea
Zisserman Andrew
Publication date: 01/01/2014

The focus of this paper is speeding up the evaluation of convolutional neural networks. While delivering impressive results across a range of computer vision and machine learning tasks, these networks are computationally demanding, limiting their deployability. Convolutional layers generally consume the bulk of the processing time, and so in this work we present two simple schemes for drastically speeding up these layers. This is achieved by exploiting cross-channel or filter redundancy to construct a low rank basis of filters that are rank-1 in the spatial domain. Our methods are architecture agnostic, and can be easily applied to existing CPU and GPU convolutional frameworks for tuneable speedup performance. We demonstrate this with a real world network designed for scene text character recognition, showing a possible 2.5x speedup with no loss in accuracy, and 4.5x speedup with less than 1% drop in accuracy, still achieving state-of-the-art on standard benchmarks.
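The abstract's key property, filters that are rank-1 in the spatial domain, can be illustrated with a small sketch. It is not the paper's method (which constructs a low rank basis across channels or filters); it only shows why a rank-1 filter is cheap: a d×d filter that is an outer product of two 1D filters can be applied as two 1D passes, costing about 2d multiplications per output instead of d². All function and variable names below are hypothetical.

```python
# Sketch: a filter that is rank-1 in the spatial domain is the outer product
# of a vertical and a horizontal 1D filter, so one 2D pass can be replaced
# by two cheaper 1D passes. Names are illustrative, not from the paper.

def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation on lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def correlate_separable(image, vertical, horizontal):
    """Apply a vertical 1D filter, then a horizontal 1D filter."""
    column_kernel = [[v] for v in vertical]   # shape (kh, 1)
    row_kernel = [list(horizontal)]           # shape (1, kw)
    return correlate2d(correlate2d(image, column_kernel), row_kernel)

vertical = [1, 2, 1]
horizontal = [1, 0, -1]
# The equivalent full 3x3 kernel is the outer product of the two 1D filters.
rank1_kernel = [[v * h for h in horizontal] for v in vertical]

# Small integer test image, so both results can be compared exactly.
image = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]

full = correlate2d(image, rank1_kernel)
separable = correlate_separable(image, vertical, horizontal)
print(full == separable)
```

Integer inputs are used so that the two results match exactly; with floating-point data the comparison would need a tolerance.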

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Exemplar Based Deep Discriminative and Shareable Feature Learning for Scene Image Classification

Author: Shuai Bing
Wang Gang
Yang Qingxiong
Zhao Lifan
Zuo Zhen
Publication date: 01/01/2015

In order to encode the class correlation and class specific information in image representation, we propose a new local feature learning approach named Deep Discriminative and Shareable Feature Learning (DDSFL). DDSFL aims to hierarchically learn feature transformation filter banks to transform raw pixel image patches to features. The learned filter banks are expected to: (1) encode common visual patterns of a flexible number of categories; (2) encode discriminative information; and (3) hierarchically extract patterns at different visual levels. In particular, in each single layer of DDSFL, shareable filters are jointly learned for classes that share similar patterns. Discriminative power of the filters is achieved by enforcing features from the same category to be close and features from different categories to be far apart. Furthermore, we also propose two exemplar selection methods to iteratively select training data for more efficient and effective learning. Based on the experimental results, DDSFL can achieve very promising performance, and it also shows great complementary effect to the state-of-the-art Caffe features.

Comment: Pattern Recognition, Elsevier, 201
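The abstract describes the discriminative term only in words: same-category features close, different-category features far apart. The sketch below uses a generic contrastive-style pairwise penalty as a stand-in to make that idea concrete. It is not the DDSFL objective; the margin, the function names, and the toy features are all assumptions made for illustration.

```python
# Sketch of a generic pairwise penalty (a stand-in, NOT the DDSFL objective):
# same-class pairs are penalized for being far apart, different-class pairs
# are penalized for being closer than a margin.

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise_penalty(features, labels, margin=1.0):
    """Sum the penalty over all unordered pairs of features."""
    total = 0.0
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            d = squared_distance(features[i], features[j])
            if labels[i] == labels[j]:
                total += d                      # pull same-class pairs together
            else:
                total += max(0.0, margin - d)   # push different-class pairs apart
    return total

labels = ["a", "a", "b", "b"]
# Hypothetical 2D features: classes form tight, distant clusters.
well_separated = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.0, 3.1]]
# Same points, but assigned so that each class is split across both clusters.
mixed_up = [[0.0, 0.0], [3.0, 3.0], [0.1, 0.0], [3.0, 3.1]]

print(pairwise_penalty(well_separated, labels) < pairwise_penalty(mixed_up, labels))
```

A feature transformation that lowers such a penalty groups each category tightly while keeping categories apart, which is the behaviour the abstract asks of the learned filters.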

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

FPM: Fine Pose Parts-Based Model with 3D CAD Models

Author: B. Hariharan
D.F. Fouhey
N. Silberman
P.F. Felzenszwalb
V. Hedau
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

We introduce a novel approach to the problem of localizing objects in an image and estimating their fine pose. Given exact CAD models, and a few real training images with aligned models, we propose to leverage the geometric information from CAD models and appearance information from real images to learn a model that can accurately estimate fine pose in real images. Specifically, we propose FPM, a fine pose parts-based model, that combines geometric information in the form of shared 3D parts in deformable part based models, and appearance information in the form of objectness to achieve both fast and accurate fine pose estimation. Our method significantly outperforms current state-of-the-art algorithms in both accuracy and speed.

CiteSeerX

DSpace@MIT

Crossref

The Fastest Deformable Part Model for Object Detection

Author: Junjie Yan
Longyin Wen
Stan Z. Li
Zhen Lei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

This paper solves the speed bottleneck of deformable part model (DPM), while maintaining the accuracy in detection on challenging datasets. Three prohibitive steps in cascade version of DPM are accelerated, including 2D correlation between root filter and feature map, cascade part pruning and HOG feature extraction. For 2D correlation, the root filter is constrained to be low rank, so that 2D correlation can be calculated by more efficient linear combination of 1D correlations. A proximal gradient algorithm is adopted to progressively learn the low rank filter in a discriminative manner. For cascade part pruning, neighborhood aware cascade is proposed to capture the dependence in neighborhood regions for aggressive pruning. Instead of explicit computation of part scores, hypotheses can be pruned by scores of neighborhoods under the first order approximation. For HOG feature extraction, look-up tables are constructed to replace expensive calculations of orientation partition and magnitude with simpler matrix index operations. Extensive experiments show that (a) the proposed method is 4 times faster than the current fastest DPM method with similar accuracy on Pascal VOC, (b) the proposed method achieves state-of-the-art accuracy on pedestrian and face detection task with frame-rate speed.
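The look-up-table idea for HOG feature extraction can be sketched as follows. The abstract does not give the table layout, so this is only one plausible arrangement under stated assumptions: 8-bit pixels (so each gradient component is an integer in [-255, 255]), 9 unsigned orientation bins, and tables indexed directly by the gradient pair. The names `BIN_TABLE`, `MAG_TABLE`, and `lookup` are hypothetical.

```python
import math

NUM_BINS = 9      # assumed number of unsigned orientation bins
MAX_DIFF = 255    # assumed 8-bit pixels: gradient components lie in [-255, 255]

def orientation_bin(dx, dy):
    """Quantize the unsigned gradient orientation in [0, pi) into NUM_BINS bins."""
    angle = math.atan2(dy, dx) % math.pi
    return min(int(angle / math.pi * NUM_BINS), NUM_BINS - 1)

# Precompute once: every possible (dx, dy) pair maps to a bin and a magnitude.
GRADIENT_RANGE = range(-MAX_DIFF, MAX_DIFF + 1)
BIN_TABLE = [[orientation_bin(dx, dy) for dy in GRADIENT_RANGE]
             for dx in GRADIENT_RANGE]
MAG_TABLE = [[math.hypot(dx, dy) for dy in GRADIENT_RANGE]
             for dx in GRADIENT_RANGE]

def lookup(dx, dy):
    """Per-pixel work at run time: two table index operations, no atan2 or sqrt."""
    row, col = dx + MAX_DIFF, dy + MAX_DIFF
    return BIN_TABLE[row][col], MAG_TABLE[row][col]

# The table must agree with the direct computation it replaces.
for dx, dy in [(3, 4), (-3, -4), (0, 1), (1, 0), (-255, 255)]:
    assert lookup(dx, dy) == (orientation_bin(dx, dy), math.hypot(dx, dy))
```

The tables trade memory (511 × 511 entries each under these assumptions) for the per-pixel trigonometric and square-root calls, which is the kind of trade the abstract describes.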

CiteSeerX

Crossref

Learning Everything about Anything: Webly-Supervised Visual Concept Learning

Author: Ali Farhadi
Carlos Guestrin
Santosh K. Divvala
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

Figure 1: We introduce a fully-automated method that, given any concept, discovers an exhaustive vocabulary explaining all its appearance variations (i.e., actions, interactions, attributes, etc.), and trains full-fledged detection models for it. This figure shows a few of the many variations that our method has learned for four different classes of concepts: object (horse), scene (kitchen), event (Christmas), and action (walking).

Recognition is graduating from labs to real-world applications. While it is encouraging to see its potential being tapped, it brings forth a fundamental challenge to the vision researcher: scalability. How can we learn a model for any concept that exhaustively covers all its appearance variations, while requiring minimal or no human supervision for compiling the vocabulary of visual variance, gathering the training images and annotations, and learning the models? In this paper, we introduce a fully-automated approach for learning extensive models for a wide range of variations (e.g. actions, interactions, attributes and beyond) within any concept. Our approach leverages vast resources of online books to discover the vocabulary of variance, and intertwines the data collection and modeling steps to alleviate the need for explicit human supervision in training the models. Our approach organizes the visual knowledge about a concept in a convenient and useful way, enabling a variety of applications across vision and NLP. Our online system has been queried by users to learn models for several interesting concepts including breakfast, Gandhi, beautiful, etc. To date, our system has models available for over 50,000 variations within 150 concepts, and has annotated more than 10 million images with bounding boxes.

CiteSeerX

Crossref