
    Polar Transformation on Image Features for Orientation-Invariant Representations

    The choice of image feature representation plays a crucial role in the analysis of visual information. Although a vast number of robust feature representation models have been proposed to improve performance on different visual tasks, most existing feature representations (e.g. handcrafted features or Convolutional Neural Networks (CNNs)) have a relatively limited capacity to capture orientation-invariant (rotation/reversal) features. The net consequence is suboptimal visual performance. To address this problem, this study adopts a novel transformational approach that investigates the potential of polar feature representations. Our low-level representation consists of histograms of oriented gradients, binned using annular spatial cells applied to the polar gradient. This makes the gradient binning rotation-invariant, so the descriptors have significantly enhanced orientation-invariant capabilities. The proposed feature representation, termed orientation-invariant histograms of oriented gradients (Oi-HOG), is capable of accurate facial expression recognition (FER). In the context of the CNN architecture, we propose two polar convolution operations, referred to as Full Polar Convolution (FPolarConv) and Local Polar Convolution (LPolarConv), and use these to develop polar architectures for orientation-invariant CNN representations. Experimental results show that the proposed orientation-invariant image representation, based on polar models for both handcrafted and deep learning features, is competitive with state-of-the-art methods while maintaining a compact representation on a set of challenging benchmark image datasets.
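
    Below is a minimal Python sketch of the core idea described in the abstract: gradient orientations measured relative to the radial direction and accumulated into annular (ring-shaped) spatial bins, which makes the descriptor stable under in-plane rotation. The bin counts, patch size, and normalization here are illustrative assumptions, not the exact Oi-HOG formulation.

```python
# A sketch of annular-bin, rotation-invariant gradient histograms in the
# spirit of Oi-HOG; parameters are illustrative, not the authors' settings.
import numpy as np

def annular_orientation_histogram(patch, n_rings=4, n_orient_bins=9):
    """Histogram of gradient orientations, spatially binned into concentric rings.

    Orientations are measured relative to the radial direction from the patch
    center, so rotating the whole patch leaves each pixel's ring index and
    relative angle (approximately) unchanged.
    """
    h, w = patch.shape
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)
    grad_angle = np.arctan2(gy, gx)              # absolute gradient direction

    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = np.hypot(xs - cx, ys - cy)
    radial_angle = np.arctan2(ys - cy, xs - cx)  # direction of the pixel itself

    # The relative angle is invariant under rotation of the whole patch.
    rel_angle = np.mod(grad_angle - radial_angle, 2 * np.pi)

    ring = np.minimum((radius / (radius.max() + 1e-9) * n_rings).astype(int),
                      n_rings - 1)
    obin = np.minimum((rel_angle / (2 * np.pi) * n_orient_bins).astype(int),
                      n_orient_bins - 1)

    hist = np.zeros((n_rings, n_orient_bins))
    np.add.at(hist, (ring, obin), mag)           # magnitude-weighted votes
    return hist.ravel() / (np.linalg.norm(hist) + 1e-9)

# Example: a patch and its 90-degree rotation give nearly identical descriptors.
rng = np.random.default_rng(0)
p = rng.random((32, 32))
d1 = annular_orientation_histogram(p)
d2 = annular_orientation_histogram(np.rot90(p))
print(np.linalg.norm(d1 - d2))  # small, up to pixel-grid discretization
```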

    A survey of visual preprocessing and shape representation techniques

    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and, most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

    View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention Are Coordinated Using Surface-Based Attentional Shrouds

    Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

    Visual Representations: Defining Properties and Deep Approximations

    Visual representations are defined as minimal sufficient statistics of visual data, for a class of tasks, that are also invariant to nuisance variability. Minimal sufficiency guarantees that we can store a representation in lieu of the raw data with the smallest complexity and no performance loss on the task at hand. Invariance guarantees that the statistic is constant with respect to uninformative transformations of the data. We derive analytical expressions for such representations and show that they are related to feature descriptors commonly used in computer vision, as well as to convolutional neural networks. This link highlights the assumptions and approximations tacitly made by these methods and explains empirical practices such as clamping, pooling and joint normalization. Comment: UCLA CSD TR140023, Nov. 12, 2014, revised April 13, 2015, November 13, 2015, February 28, 201
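
    The three defining properties can be stated compactly. The notation below (data x, task variable y, nuisance group G, representation phi, mutual information I, entropy H) is one common formalization assumed for illustration, not quoted from the report.

```latex
% One common formalization of the three defining properties (notation
% assumed for illustration; requires amsmath).
\begin{align*}
  \text{sufficiency:}\quad & I(\phi(x);\, y) = I(x;\, y)\\
  \text{minimality:}\quad  & \phi \in \arg\min_{\psi\ \text{sufficient}} H(\psi(x))\\
  \text{invariance:}\quad  & \phi(g \cdot x) = \phi(x) \quad \forall\, g \in G
\end{align*}
```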

    Fingerprint Verification Using Spectral Minutiae Representations

    Most fingerprint recognition systems are based on a minutiae set: an unordered collection of minutiae locations and orientations that suffers from various deformations such as translation, rotation, and scaling. The spectral minutiae representation introduced in this paper is a novel method to represent a minutiae set as a fixed-length feature vector which is invariant to translation, and in which rotation and scaling become translations, so that they can be easily compensated for. These characteristics enable the combination of fingerprint recognition systems with template protection schemes that require a fixed-length feature vector. This paper introduces algorithms for two representation methods: the location-based spectral minutiae representation and the orientation-based spectral minutiae representation. Both are evaluated using two correlation-based spectral minutiae matching algorithms. We present the performance of our algorithms on three fingerprint databases and show how performance can be improved by using a fusion scheme and singular points.
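
    The sketch below illustrates the location-based flavour of this idea: take Fourier magnitudes of the minutiae point set (magnitude discards translation) and sample them on a log-polar frequency grid, so rotation and scaling of the minutiae become shifts of the output. The grid sizes, frequency range, and Gaussian smoothing parameter are illustrative assumptions, not the paper's exact configuration.

```python
# A sketch of a location-based spectral minutiae transform, assuming
# Gaussian-attenuated Fourier magnitudes on a log-polar frequency grid.
import numpy as np

def spectral_minutiae(minutiae_xy, n_radial=128, n_angular=256,
                      lam_lo=2.0, lam_hi=20.0, sigma=1.0):
    """Fixed-length spectrum of an unordered set of minutiae locations.

    The Fourier magnitude discards translation; sampling frequencies on a
    log-polar grid turns rotation and scaling of the minutiae set into
    circular/linear shifts of the output, which a matcher can search over.
    """
    # Log-spaced radial frequencies between wavelengths lam_hi and lam_lo,
    # and uniformly spaced angles over a half-plane (the spectrum of a real
    # point set is conjugate-symmetric).
    radii = np.exp(np.linspace(np.log(2 * np.pi / lam_hi),
                               np.log(2 * np.pi / lam_lo), n_radial))
    thetas = np.linspace(0, np.pi, n_angular, endpoint=False)
    wx = radii[:, None] * np.cos(thetas[None, :])
    wy = radii[:, None] * np.sin(thetas[None, :])

    # |sum_j exp(-i w . x_j)|, attenuated by a Gaussian in |w|.
    phase = (wx[..., None] * minutiae_xy[:, 0] +
             wy[..., None] * minutiae_xy[:, 1])
    spectrum = np.abs(np.exp(-1j * phase).sum(axis=-1))
    spectrum *= np.exp(-0.5 * sigma**2 * (wx**2 + wy**2))
    return spectrum / (np.linalg.norm(spectrum) + 1e-9)

# A translated copy of the same minutiae set yields an identical spectrum:
# translation only multiplies each frequency term by a unit-modulus phase.
rng = np.random.default_rng(1)
m = rng.uniform(0, 300, size=(30, 2))
s1 = spectral_minutiae(m)
s2 = spectral_minutiae(m + np.array([17.0, -9.0]))
print(np.linalg.norm(s1 - s2))  # ~0
```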

    Oriented Response Networks

    Deep Convolutional Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability to handle significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a virtual filter bank containing the filter itself and its multiple unmaterialised rotated versions. During back-propagation, an ARF is collectively updated using errors from all its rotated versions. DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks. The oriented response produced by ORNs can also be used for image and object orientation estimation. Over multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we consistently observe that replacing regular filters with the proposed ARFs leads to a significant reduction in the number of network parameters and an improvement in classification performance. We report the best results on several commonly used benchmarks. Comment: Accepted at CVPR 2017. Source code available at http://yzhou.work/OR
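
    A minimal forward-pass sketch of the rotating-filter idea follows: one materialised filter is convolved in several rotated versions, producing an orientation-indexed response stack. Rotation by image resampling, the 8-orientation setting, and max-pooling over orientations are illustrative assumptions; the paper encodes rotation more efficiently and also updates the canonical filter collectively from all rotated versions during back-propagation.

```python
# A numpy sketch of an Active-Rotating-Filter-style forward pass;
# assumptions noted in the lead-in, not the paper's implementation.
import numpy as np
from scipy.ndimage import rotate
from scipy.signal import convolve2d

def arf_forward(image, filt, n_orientations=8):
    """Convolve `image` with `n_orientations` rotated copies of one filter.

    Returns an (n_orientations, H, W) response stack in which in-plane
    rotation of the input approximately becomes a circular shift along
    axis 0 -- the property ORNs exploit for rotation-invariant features.
    """
    responses = []
    for k in range(n_orientations):
        angle = 360.0 * k / n_orientations
        rk = rotate(filt, angle, reshape=False, order=1)  # rotated clone
        responses.append(convolve2d(image, rk, mode='same'))
    return np.stack(responses)

# Max-pooling over the orientation axis yields a rotation-invariant map;
# aligning to the dominant orientation, SIFT-style, is an alternative.
rng = np.random.default_rng(2)
img = rng.random((64, 64))
f = rng.random((7, 7)) - 0.5
invariant = arf_forward(img, f).max(axis=0)
```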
