Oriented Response Networks
Deep Convolutional Neural Networks (DCNNs) are capable of learning
unprecedentedly effective image representations. However, their ability to
handle significant local and global image rotations remains limited. In this
paper, we propose Active Rotating Filters (ARFs) that actively rotate during
convolution and produce feature maps with location and orientation explicitly
encoded. An ARF acts as a virtual filter bank containing the filter itself and
its multiple unmaterialised rotated versions. During back-propagation, an ARF
is collectively updated using errors from all its rotated versions. DCNNs using
ARFs, referred to as Oriented Response Networks (ORNs), can produce
within-class rotation-invariant deep features while maintaining inter-class
discrimination for classification tasks. The oriented response produced by ORNs
can also be used for image and object orientation estimation tasks. Over
multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we
consistently observe that replacing regular filters with the proposed ARFs
leads to a significant reduction in the number of network parameters and an
improvement in classification performance. We report the best results on
several commonly used benchmarks.
Comment: Accepted in CVPR 2017. Source code available at http://yzhou.work/OR
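The idea of one filter standing in for a bank of its own rotated copies can be illustrated with a small NumPy sketch. This is not the authors' implementation: it materialises the rotated copies explicitly, restricts them to four 90-degree rotations, and pools with a global maximum. The function names (`rotated_bank`, `oriented_response`, `invariant_descriptor`, `dominant_orientation`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def rotated_bank(base_filter, n_orient=4):
    """Return the filter plus its 90-degree rotated copies.

    Assumption: copies are materialised and limited to multiples of
    90 degrees, which keeps the sketch exact and dependency-free.
    """
    return [np.rot90(base_filter, k) for k in range(n_orient)]

def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation using only NumPy."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def oriented_response(image, base_filter):
    """Stack one response map per orientation: shape (4, H', W')."""
    return np.stack(
        [correlate2d_valid(image, f) for f in rotated_bank(base_filter)]
    )

def invariant_descriptor(image, base_filter):
    """Max over orientations and positions; unchanged by 90-degree rotations."""
    return float(oriented_response(image, base_filter).max())

def dominant_orientation(image, base_filter):
    """Index of the orientation channel with the strongest response."""
    per_orientation = oriented_response(image, base_filter).max(axis=(1, 2))
    return int(np.argmax(per_orientation))
```

Rotating the input by 90 degrees permutes the orientation channels, so the pooled descriptor stays the same while the dominant orientation index shifts by one. That shift is the kind of signal an orientation-estimation task could read off.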
Compensating for Large In-Plane Rotations in Natural Images
Rotation invariance has been studied in the computer vision community
primarily in the context of small in-plane rotations. This is usually achieved
by building invariant image features. However, the problem of achieving
invariance for large rotation angles remains largely unexplored. In this work,
we tackle this problem by directly compensating for large rotations, as opposed
to building invariant features. This is inspired by the neuro-scientific
concept of mental rotation, which humans use to compare pairs of rotated
objects. Our contributions here are three-fold. First, we train a Convolutional
Neural Network (CNN) to detect image rotations. We find that generic CNN
architectures are not suitable for this purpose. To address this, we introduce a
convolutional template layer, which learns representations for canonical
'unrotated' images. Second, we use Bayesian Optimization to quickly sift
through a large number of candidate images to find the canonical 'unrotated'
image. Third, we use this method to achieve robustness to large angles in an
image retrieval scenario. Our method is task-agnostic, and can be used as a
pre-processing step in any computer vision system.
Comment: Accepted at Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) 201
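Compensating for a rotation, rather than building invariant features, can be sketched as a search over candidate rotations for the one that looks most "unrotated". The sketch below makes two substitutions that are assumptions, not the paper's method: a fixed reference template with normalised correlation stands in for the learned convolutional template layer, and an exhaustive search over four 90-degree candidates stands in for Bayesian Optimization. The names `upright_score` and `compensate_rotation` are hypothetical.

```python
import numpy as np

def upright_score(image, template):
    """Hypothetical 'unrotated-ness' score: normalised correlation
    between a candidate image and a canonical template of equal shape."""
    a = image - image.mean()
    b = template - template.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
    return float((a * b).sum() / denom)

def compensate_rotation(image, template):
    """Score every candidate rotation and return (best_k, derotated_image).

    Assumption: square images and candidates limited to multiples of
    90 degrees, searched exhaustively instead of with Bayesian Optimization.
    """
    candidates = [np.rot90(image, k) for k in range(4)]
    scores = [upright_score(c, template) for c in candidates]
    best_k = int(np.argmax(scores))
    return best_k, candidates[best_k]
```

With only four candidates exhaustive search is trivial. A smarter search strategy such as the one the abstract mentions becomes relevant when the candidate set is large, for example with fine-grained angles.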