9,292 research outputs found

    Analysis of a biologically-inspired system for real-time object recognition

    Get PDF
    We present a biologically-inspired system for real-time, feed-forward object recognition in cluttered scenes. Our system utilizes a vocabulary of very sparse features that are shared between and within different object models. To detect objects in a novel scene, these features are located in the image, and each detected feature votes for all objects that are consistent with its presence. Due to the sharing of features between object models our approach is more scalable to large object databases than traditional methods. To demonstrate the utility of this approach, we train our system to recognize any of 50 objects in everyday cluttered scenes with substantial occlusion. Without further optimization we also demonstrate near-perfect recognition on a standard 3-D recognition problem. Our system has an interpretation as a sparsely connected feed-forward neural network, making it a viable model for fast, feed-forward object recognition in the primate visual system

    Improvements to context based self-supervised learning

    Full text link
    We develop a set of methods to improve on the results of self-supervised learning using context. We start with a baseline of patch based arrangement context learning and go from there. Our methods address some overt problems such as chromatic aberration as well as other potential problems such as spatial skew and mid-level feature neglect. We prevent problems with testing generalization on common self-supervised benchmark tests by using different datasets during our development. The results of our methods combined yield top scores on all standard self-supervised benchmarks, including classification and detection on PASCAL VOC 2007, segmentation on PASCAL VOC 2012, and "linear tests" on the ImageNet and CSAIL Places datasets. We obtain an improvement over our baseline method of between 4.0 to 7.1 percentage points on transfer learning classification tests. We also show results on different standard network architectures to demonstrate generalization as well as portability. All data, models and programs are available at: https://gdo-datasci.llnl.gov/selfsupervised/.Comment: Accepted paper at CVPR 201

    Registration and Fusion of Multi-Spectral Images Using a Novel Edge Descriptor

    Full text link
    In this paper we introduce a fully end-to-end approach for multi-spectral image registration and fusion. Our method for fusion combines images from different spectral channels into a single fused image by different approaches for low and high frequency signals. A prerequisite of fusion is a stage of geometric alignment between the spectral bands, commonly referred to as registration. Unfortunately, common methods for image registration of a single spectral channel do not yield reasonable results on images from different modalities. For that end, we introduce a new algorithm for multi-spectral image registration, based on a novel edge descriptor of feature points. Our method achieves an accurate alignment of a level that allows us to further fuse the images. As our experiments show, we produce a high quality of multi-spectral image registration and fusion under many challenging scenarios

    Low-rank SIFT: An Affine Invariant Feature for Place Recognition

    Full text link
    In this paper, we present a novel affine-invariant feature based on SIFT, leveraging the regular appearance of man-made objects. The feature achieves full affine invariance without needing to simulate over affine parameter space. Low-rank SIFT, as we name the feature, is based on our observation that local tilt, which are caused by changes of camera axis orientation, could be normalized by converting local patches to standard low-rank forms. Rotation, translation and scaling invariance could be achieved in ways similar to SIFT. As an extension of SIFT, our method seeks to add prior to solve the ill-posed affine parameter estimation problem and normalizes them directly, and is applicable to objects with regular structures. Furthermore, owing to recent breakthrough in convex optimization, such parameter could be computed efficiently. We will demonstrate its effectiveness in place recognition as our major application. As extra contributions, we also describe our pipeline of constructing geotagged building database from the ground up, as well as an efficient scheme for automatic feature selection

    Towards loophole-free Bell inequality test with preselected unsymmetrical singlet states of light

    Full text link
    Can a Bell test with no detection loophole be demonstrated for multi-photon entangled states of light within the current technology? We examine the possibility of a postselection-free CHSH-Bell inequality test wih an unsymmetrical polarization singlet. To that end we employ a preselection procedure which is performed prior to the test. It allows using imperfect (coarse-grained) binary photodetection in the test. We show an example of preselection scheme which improves violation of the CHSH inequality with the micro-macro polarization singlet produced by the optimal quantum cloning. The preselection is realized by a quantum filter which is believed to be not useful for this purpose.Comment: 12 pages, 8 figures, accepted to Phys. Rev.

    Note: An object detection method for active camera

    Get PDF
    To solve the problems caused by a changing background during object detection in active camera, this paper proposes a new method based on SURF (speeded up robust features) and data clustering. The SURF feature points of each image are extracted, and each cluster center is calculated by processing the data clustering of k adjacent frames. Templates for each class are obtained by calculating the histograms within the regions around the center points of the clustering classes. The window of the moving object can be located by finding the region that satisfies the histogram matching result between adjacent frames. Experimental results demonstrate that the proposed method can improve the effectiveness of object detection.Yong Chen, Ronghua Zhang, Lei Shang, and Eric H

    Multi-scale Orderless Pooling of Deep Convolutional Activation Features

    Full text link
    Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of highly variable scenes. To improve the invariance of CNN activations without degrading their discriminative power, this paper presents a simple but effective scheme called multi-scale orderless pooling (MOP-CNN). This scheme extracts CNN activations for local patches at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the result. The resulting MOP-CNN representation can be used as a generic feature for either supervised or unsupervised recognition tasks, from image classification to instance-level retrieval; it consistently outperforms global CNN activations without requiring any joint training of prediction layers for a particular target dataset. In absolute terms, it achieves state-of-the-art results on the challenging SUN397 and MIT Indoor Scenes classification datasets, and competitive results on ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets
    • …
    corecore