8,019 research outputs found

    Local Feature Detectors, Descriptors, and Image Representations: A Survey

    With advances in both stable interest region detectors and robust, distinctive descriptors, local feature-based image or object retrieval has become a popular research topic. All local feature-based image retrieval systems involve two important processes: local feature extraction and image representation. The other key technology for image retrieval systems is image representation, such as the bag-of-visual-words (BoVW), Fisher vector, or Vector of Locally Aggregated Descriptors (VLAD) framework. In this paper, we review local features and image representations for image retrieval. Because a great many methods have been proposed in this area, they are grouped into several classes and summarized. In addition, recent deep learning-based approaches for image retrieval are briefly reviewed. Comment: 20 pages

    From handcrafted to deep local features

    This paper presents an overview of the evolution of local features from handcrafted to deep learning-based methods, followed by a discussion of several benchmarks and papers evaluating such local features. Our investigations are motivated by 3D reconstruction problems, where the precise location of the features is important. As we describe these methods, we highlight and explain the challenges of feature extraction and potential ways to overcome them. We first present handcrafted methods, followed by methods based on classical machine learning, and finally we discuss methods based on deep learning. This largely chronological presentation will help the reader to fully understand the topic of image and region description in order to make the best use of it in modern computer vision applications. In particular, understanding handcrafted methods and their motivation can help to understand modern approaches and how machine learning is used to improve the results. We also provide references to most of the relevant literature and code. Comment: Preprint

    Ray-tracing for coordinate knowledge in the JWST Integrated Science Instrument Module

    Optical alignment and testing of the Integrated Science Instrument Module of the James Webb Space Telescope is underway. We describe the Optical Telescope Element Simulator used to feed the science instruments with point images of precisely known location and chief ray pointing, at appropriate wavelengths and flux levels, in vacuum and at operating temperature. The simulator's capabilities include a number of devices for in situ monitoring of source flux, wavefront error, pupil illumination, image position and chief ray angle. Taken together, these functions become a fascinating example of how the first-order properties and constructs of an optical design (coordinate systems, image surface and pupil location) acquire measurable meaning in a real system. We illustrate these functions with experimental data, and describe the ray tracing system used to provide both pointing control during operation and analysis support subsequently. Prescription management takes the form of optimization and fitting. Our core tools employ a matrix/vector ray tracing model which proves broadly useful in optical engineering problems. We spell out its mathematical basis, and illustrate its use in ray tracing plane mirror systems relevant to optical metrology such as a pentaprism and corner cube. Comment: paper IM3A.1 of the International Optical Design Conference (IODC), Fairmont Orchid, Kohala Coast, Hawaii, USA (2014)
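
The matrix/vector treatment of plane mirrors mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook reflection matrix R = I - 2 n nᵀ for a mirror with unit normal n; this is not taken from the paper, and all function names are hypothetical. Composing three mutually perpendicular mirrors, as in a corner cube, reverses the ray direction.

```python
# Plane-mirror ray tracing with 3x3 reflection matrices (pure-Python sketch).
# Assumption: the standard reflection matrix R = I - 2 n n^T, not the paper's own code.

def reflection_matrix(normal):
    """Return R = I - 2 n n^T for a plane mirror with the given (unnormalized) normal."""
    length = sum(c * c for c in normal) ** 0.5
    n = [c / length for c in normal]
    return [[(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def mat_mul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def corner_cube_matrix():
    """Compose reflections off three mutually perpendicular mirrors."""
    rx = reflection_matrix([1.0, 0.0, 0.0])
    ry = reflection_matrix([0.0, 1.0, 0.0])
    rz = reflection_matrix([0.0, 0.0, 1.0])
    return mat_mul(rz, mat_mul(ry, rx))

ray = [0.3, -0.5, 0.8]  # arbitrary incoming ray direction
reflected = mat_vec(corner_cube_matrix(), ray)  # points back along -ray
```

Because each reflection is a single matrix, a multi-mirror system reduces to one matrix product, which is what makes this representation convenient for metrology components.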

    Friction from Reflectance: Deep Reflectance Codes for Predicting Physical Surface Properties from One-Shot In-Field Reflectance

    Images are the standard input for vision algorithms, but one-shot in-field reflectance measurements are creating new opportunities for recognition and scene understanding. In this work, we address the question of what reflectance can reveal about materials in an efficient manner. We go beyond recognition and labeling and ask: What intrinsic physical properties of the surface can be estimated using reflectance? We introduce a framework that enables prediction of actual friction values for surfaces using one-shot reflectance measurements. This work is the first of its kind in vision-based friction estimation. We develop a novel representation for reflectance disks that capture partial BRDF measurements instantaneously. Our method of deep reflectance codes combines CNN features and Fisher vector pooling with optimal binary embedding to create codes that have sufficient discriminatory power and the important properties of illumination and spatial invariance. The experimental results demonstrate that reflectance can play a new role in deciphering the underlying physical properties of real-world scenes.

    Recent Advance in Content-based Image Retrieval: A Literature Survey

    The explosive increase and ubiquitous accessibility of visual data on the Web have led to a surge of research activity in image search or retrieval. Because they ignore visual content as a ranking cue, methods that apply text search techniques to visual retrieval may suffer from inconsistency between the text words and the visual content. Content-based image retrieval (CBIR), which makes use of the representation of visual content to identify relevant images, has attracted sustained attention over the past two decades. The problem is challenging due to the intention gap and the semantic gap. Numerous techniques have been developed for content-based image retrieval in the last decade. The purpose of this paper is to categorize and evaluate the algorithms proposed from 2003 to 2016. We conclude with several promising directions for future research. Comment: 22 pages

    Riemannian Dictionary Learning and Sparse Coding for Positive Definite Matrices

    Data encoded as symmetric positive definite (SPD) matrices frequently arise in many areas of computer vision and machine learning. While these matrices form an open subset of the Euclidean space of symmetric matrices, viewing them through the lens of non-Euclidean Riemannian geometry often turns out to be better suited to capturing several desirable data properties. However, formulating classical machine learning algorithms within such a geometry is often non-trivial and computationally expensive. Inspired by the great success of dictionary learning and sparse coding for vector-valued data, our goal in this paper is to represent data in the form of SPD matrices as sparse conic combinations of SPD atoms from a learned dictionary via a Riemannian geometric approach. To that end, we formulate a novel Riemannian optimization objective for dictionary learning and sparse coding in which the representation loss is characterized via the affine invariant Riemannian metric. We also present a computationally simple algorithm for optimizing our model. Experiments on several computer vision datasets demonstrate superior classification and retrieval performance using our approach when compared to sparse coding via alternative non-Riemannian formulations.
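
For reference, the geodesic distance induced by the affine invariant Riemannian metric named in this abstract is usually written as follows. The symbols X and Y (two SPD matrices) are notation assumed here rather than taken from the paper, and the paper's exact objective may differ in detail.

```latex
d(X, Y) = \left\lVert \log\!\left( X^{-1/2} \, Y \, X^{-1/2} \right) \right\rVert_F
```

Here log is the matrix logarithm and the norm is the Frobenius norm. The distance is unchanged when both arguments are replaced by A X Aᵀ and A Y Aᵀ for any invertible matrix A, which is the affine invariance the name refers to.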

    Mesh Interest Point Detection Based on Geometric Measures and Sparse Refinement

    Three-dimensional (3D) interest point detection plays a fundamental role in 3D computer vision and graphics. In this paper, we introduce a new method for detecting mesh interest points based on geometric measures and sparse refinement (GMSR). The key idea of our approach is to calculate the 3D interest point response function using two intuitive and effective geometric properties of the local surface on a 3D mesh model, namely the Euclidean distances from the neighborhood vertices to the tangent plane of a vertex and the angles of their normal vectors. The response function is defined in multi-scale space and can be used to effectively distinguish 3D interest points from edges and flat areas. Points with locally maximal 3D interest point response values are selected as candidate 3D interest points. Finally, we use an ℓ0 norm-based optimization method to refine the candidates by constraining their quality and quantity. Numerical experiments demonstrate that our proposed GMSR-based 3D interest point detector outperforms several current state-of-the-art methods for different kinds of 3D mesh models. Comment: 17 pages

    Profile Based Sub-Image Search in Image Databases

    Sub-image search with high accuracy in natural images remains a challenging problem. This paper proposes a new feature vector, called a profile, for a keypoint in a bag-of-visual-words model of an image. The profile of a keypoint captures the spatial geometry of all the other keypoints in an image with respect to itself, and is very effective in discriminating true matches from false matches. Sub-image search using profiles is a single-phase process requiring no geometric validation, yields high precision on natural images, and works well with a small visual codebook. The proposed search technique differs from traditional methods, which first generate a set of candidates disregarding spatial information and then verify them geometrically. Conventional methods also use large codebooks. We achieve a precision of 81% on a combined data set of synthetic and real natural images using a codebook size of 500 for top-10 queries; that is 31% higher than the conventional candidate generation approach. Comment: Sub-Image Retrieval, New Feature Vector, Similarity

    Enhancing the retrieval performance by combing the texture and edge features

    In this paper, a new algorithm based on geometrical moments and local binary patterns (LBP) for content-based image retrieval (CBIR) is proposed. In geometrical moments, each vector is compared with all the other vectors for edge map generation. The same concept is applied to the LBP calculation, which generates nine LBP patterns from a given 3x3 pattern. Finally, nine LBP histograms are calculated and used as a feature vector for image retrieval. Moments are important features used in the recognition of different types of images. Two experiments have been carried out to demonstrate the value of our algorithm. The results show a significant improvement in terms of the evaluation measures compared to LBP and other existing transform domain techniques. Comment: 7 pages, 8 figures, one table
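
As background for the abstract above, here is a minimal sketch of the standard 8-neighbour LBP operator on a 3x3 patch, which is the baseline the paper compares against. It is not the paper's nine-pattern variant; the function names and the clockwise bit ordering are assumptions made for illustration.

```python
# Standard local binary pattern (LBP) sketch in pure Python.
# Assumption: baseline LBP only; the nine-pattern variant in the abstract is not reproduced.

def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch given as a list of 3 rows."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left pixel (a convention chosen here).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:  # neighbour at least as bright as the centre
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels of a 2D list."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

In a retrieval setting, the histogram serves as the feature vector, and two images are compared by a distance between their histograms.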

    What is the right way to represent document images?

    In this article we study the problem of document image representation based on visual features. We propose a comprehensive experimental study that compares three types of visual document image representations: (1) traditional so-called shallow features, such as the RunLength and the Fisher-Vector descriptors, (2) deep features based on Convolutional Neural Networks, and (3) features extracted from hybrid architectures that take inspiration from the two previous ones. We evaluate these features in several tasks (i.e. classification, clustering, and retrieval) and in different setups (e.g. domain transfer) using several public and in-house datasets. Our results show that deep features generally outperform other types of features when there is no domain shift and the new task is closely related to the one used to train the model. However, when a large domain or task shift is present, the Fisher-Vector shallow features generalize better and often obtain the best results.