Local Feature Detectors, Descriptors, and Image Representations: A Survey
With the advances in both stable interest region detectors and robust and
distinctive descriptors, local feature-based image or object retrieval has
become a popular research topic. A local feature-based image retrieval system
involves two important processes: local feature extraction and image
representation. For the latter, frameworks such as the bag-of-visual-words
(BoVW), Fisher vector, and Vector of Locally Aggregated Descriptors (VLAD) are
widely used. In this paper, we
review local features and image representations for image retrieval. Because
more and more methods are proposed in this area, these methods are grouped into
several classes and summarized. In addition, recent deep learning-based
approaches for image retrieval are briefly reviewed.
Comment: 20 pages
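As a concrete illustration of the BoVW framework mentioned above, the following sketch quantizes local descriptors against a small visual codebook and builds a normalized histogram. This is a minimal toy example; the function name, codebook, and descriptors are our own, not from the survey.

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors against a visual codebook and
    return an L1-normalized bag-of-visual-words histogram."""
    # Squared Euclidean distance from every descriptor to every visual word
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                  # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                   # L1 normalization

# Toy example: four 2-D descriptors, codebook of three visual words
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
desc = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [5.2, 4.8]])
print(bovw_histogram(desc, codebook))  # -> [0.25 0.5 0.25]
```

In practice the codebook is learned by k-means over millions of descriptors, and the histogram is often TF-IDF weighted before indexing.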
From handcrafted to deep local features
This paper presents an overview of the evolution of local features from
handcrafted to deep-learning-based methods, followed by a discussion of several
benchmarks and papers evaluating such local features. Our investigations are
motivated by 3D reconstruction problems, where the precise location of the
features is important. As we describe these methods, we highlight and explain
the challenges of feature extraction and potential ways to overcome them. We
first present handcrafted methods, followed by methods based on classical
machine learning and finally we discuss methods based on deep-learning. This
largely chronologically-ordered presentation will help the reader to fully
understand the topic of image and region description in order to make best use
of it in modern computer vision applications. In particular, understanding
handcrafted methods and their motivation can help to understand modern
approaches and how machine learning is used to improve the results. We also
provide references to most of the relevant literature and code.
Comment: Preprint
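As one example of the handcrafted detectors this survey covers, the classic Harris corner response can be sketched in a few lines. This is illustrative only: a real detector would smooth the structure tensor with a Gaussian window and apply non-maximum suppression.

```python
import numpy as np

def harris_response(patch, k=0.05):
    """Harris corner response for a small grayscale patch:
    R = det(M) - k * trace(M)^2, where M is the structure tensor
    accumulated from image gradients over the patch."""
    gy, gx = np.gradient(patch.astype(float))
    # Structure tensor entries, summed over the patch (uniform window)
    sxx, syy, sxy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace ** 2

# A corner-like patch scores higher than a flat one
corner = np.zeros((7, 7)); corner[3:, 3:] = 1.0
flat = np.zeros((7, 7))
print(harris_response(corner) > harris_response(flat))  # -> True
```

Understanding this handcrafted response function makes it easier to see what learned detectors later replaced: the fixed gradient-based scoring is swapped for a trained score map.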
Ray-tracing for coordinate knowledge in the JWST Integrated Science Instrument Module
Optical alignment and testing of the Integrated Science Instrument Module of
the James Webb Space Telescope is underway. We describe the Optical Telescope
Element Simulator used to feed the science instruments with point images of
precisely known location and chief ray pointing, at appropriate wavelengths and
flux levels, in vacuum and at operating temperature. The simulator's
capabilities include a number of devices for in situ monitoring of source flux,
wavefront error, pupil illumination, image position and chief ray angle. Taken
together, these functions become a fascinating example of how the first order
properties and constructs of an optical design (coordinate systems, image
surface and pupil location) acquire measurable meaning in a real system. We
illustrate these functions with experimental data, and describe the ray tracing
system used to provide both pointing control during operation and analysis
support subsequently. Prescription management takes the form of optimization
and fitting. Our core tools employ a matrix/vector ray tracing model which
proves broadly useful in optical engineering problems. We spell out its
mathematical basis, and illustrate its use in ray tracing plane mirror systems
relevant to optical metrology, such as a pentaprism and a corner cube.
Comment: paper IM3A.1 of the International Optical Design Conference (IODC),
Fairmont Orchid, Kohala Coast, Hawaii, USA (2014)
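The matrix/vector treatment of plane mirror systems can be sketched with Householder reflection matrices: a mirror with unit normal n maps a ray direction d to (I - 2 n nᵀ) d. The snippet below (our illustration, not the authors' code) composes three perpendicular mirrors to show corner-cube retroreflection.

```python
import numpy as np

def mirror_matrix(n):
    """Householder reflection matrix R = I - 2 n n^T for a plane
    mirror with unit normal n; applying R to a ray direction
    reflects it off the mirror."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return np.eye(3) - 2.0 * np.outer(n, n)

# Corner cube: three mutually perpendicular mirrors retroreflect any ray.
R = mirror_matrix([1, 0, 0]) @ mirror_matrix([0, 1, 0]) @ mirror_matrix([0, 0, 1])
d = np.array([0.3, -0.5, 0.81])    # arbitrary incoming direction
print(R @ d)                       # exactly reversed: [-0.3, 0.5, -0.81]
```

The product of the three reflections is -I, which is why a corner cube returns every ray antiparallel to its incoming direction regardless of orientation.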
Friction from Reflectance: Deep Reflectance Codes for Predicting Physical Surface Properties from One-Shot In-Field Reflectance
Images are the standard input for vision algorithms, but one-shot in-field
reflectance measurements are creating new opportunities for recognition and
scene understanding. In this work, we address the question of what reflectance
can reveal about materials in an efficient manner. We go beyond the question of
recognition and labeling and ask the question: What intrinsic physical
properties of the surface can be estimated using reflectance? We introduce a
framework that enables prediction of actual friction values for surfaces using
one-shot reflectance measurements. This work is a first-of-its-kind
vision-based friction estimation. We develop a novel representation for
reflectance disks, which capture partial BRDF measurements instantaneously. Our
method of deep reflectance codes combines CNN features and Fisher vector
pooling with optimal binary embedding to create codes that have sufficient
discriminatory power and have important properties of illumination and spatial
invariance. The experimental results demonstrate that reflectance can play a
new role in deciphering the underlying physical properties of real-world
scenes.
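The binary embedding step can be illustrated with sign-of-random-projection hashing, a generic stand-in for the optimal binary embedding used in the paper. The CNN/Fisher-vector features are assumed precomputed, and all names here are hypothetical.

```python
import numpy as np

def binary_code(features, projection):
    """Hash real-valued feature vectors to compact binary codes by
    taking the sign of random projections (LSH for cosine similarity).
    This stands in for the paper's learned optimal binary embedding."""
    return (features @ projection > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
P = rng.standard_normal((128, 32))        # 128-D features -> 32-bit codes
x = rng.standard_normal(128)
y = x + 0.05 * rng.standard_normal(128)   # a near-duplicate feature
z = rng.standard_normal(128)              # an unrelated feature
cx, cy, cz = (binary_code(v, P) for v in (x, y, z))
print(hamming(cx, cy), hamming(cx, cz))   # similar features map to closer codes
```

Compact codes like these make nearest-neighbor search over large reflectance databases cheap, since Hamming distance is a few machine instructions per comparison.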
Recent Advance in Content-based Image Retrieval: A Literature Survey
The explosive increase and ubiquitous accessibility of visual data on the Web
have led to the prosperity of research activity in image search or retrieval.
Because they ignore visual content as a ranking clue, methods that apply text
search techniques to visual retrieval may suffer from inconsistency between
the text words and the visual content. Content-based image retrieval (CBIR), which
makes use of the representation of visual content to identify relevant images,
has attracted sustained attention over the past two decades. Such a problem is
challenging due to the intention gap and the semantic gap problems. Numerous
techniques have been developed for content-based image retrieval in the last
decade. The purpose of this paper is to categorize and evaluate those
algorithms proposed during the period of 2003 to 2016. We conclude with several
promising directions for future research.
Comment: 22 pages
Riemannian Dictionary Learning and Sparse Coding for Positive Definite Matrices
Data encoded as symmetric positive definite (SPD) matrices frequently arise
in many areas of computer vision and machine learning. While these matrices
form an open subset of the Euclidean space of symmetric matrices, viewing them
through the lens of non-Euclidean Riemannian geometry often turns out to be
better suited in capturing several desirable data properties. However,
formulating classical machine learning algorithms within such a geometry is
often non-trivial and computationally expensive. Inspired by the great success
of dictionary learning and sparse coding for vector-valued data, our goal in
this paper is to represent data in the form of SPD matrices as sparse conic
combinations of SPD atoms from a learned dictionary via a Riemannian geometric
approach. To that end, we formulate a novel Riemannian optimization objective
for dictionary learning and sparse coding in which the representation loss is
characterized via the affine invariant Riemannian metric. We also present a
computationally simple algorithm for optimizing our model. Experiments on
several computer vision datasets demonstrate superior classification and
retrieval performance using our approach when compared to sparse coding via
alternative non-Riemannian formulations.
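The affine-invariant Riemannian metric (AIRM) that characterizes the representation loss has the closed-form distance d(X, Y) = ||log(X^{-1/2} Y X^{-1/2})||_F between SPD matrices. Below is a minimal sketch of this distance and its defining invariance (our illustration, not the authors' implementation).

```python
import numpy as np

def spd_logm(S):
    """Matrix logarithm of a symmetric positive definite matrix
    via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def airm_distance(X, Y):
    """Affine-invariant Riemannian distance between SPD matrices:
    d(X, Y) = || log(X^{-1/2} Y X^{-1/2}) ||_F."""
    w, V = np.linalg.eigh(X)
    X_inv_sqrt = (V / np.sqrt(w)) @ V.T        # X^{-1/2}
    M = X_inv_sqrt @ Y @ X_inv_sqrt
    return np.linalg.norm(spd_logm(M), ord="fro")

X = np.array([[2.0, 0.5], [0.5, 1.0]])
Y = np.array([[1.0, 0.0], [0.0, 3.0]])
A = np.array([[1.0, 2.0], [0.0, 1.0]])         # any invertible congruence
d1 = airm_distance(X, Y)
d2 = airm_distance(A @ X @ A.T, A @ Y @ A.T)
print(np.isclose(d1, d2))                      # affine invariance -> True
```

The invariance d(AXAᵀ, AYAᵀ) = d(X, Y) for any invertible A is exactly what makes this metric attractive for covariance-descriptor data, and it is the property the abstract's "affine invariant" refers to.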
Mesh Interest Point Detection Based on Geometric Measures and Sparse Refinement
Three dimensional (3D) interest point detection plays a fundamental role in
3D computer vision and graphics. In this paper, we introduce a new method for
detecting mesh interest points based on geometric measures and sparse
refinement (GMSR). The key point of our approach is to calculate the 3D
interest point response function using two intuitive and effective geometric
properties of the local surface on a 3D mesh model, namely Euclidean distances
between the neighborhood vertices to the tangent plane of a vertex and the
angles of normal vectors of them. The response function is defined in
multi-scale space and can be utilized to effectively distinguish 3D interest
points from edges and flat areas. Those points with local maximal 3D interest
point response value are selected as the candidates of 3D interest points.
Finally, we utilize a norm-based optimization method to refine the
candidates of 3D interest points by constraining their quality and quantity.
Numerical experiments demonstrate that our proposed GMSR-based 3D interest
point detector outperforms several current state-of-the-art methods on
different kinds of 3D mesh models.
Comment: 17 pages
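The two geometric measures described above, distances from neighboring vertices to a vertex's tangent plane and angles between normal vectors, can be sketched as follows (hypothetical helper functions, not the authors' code):

```python
import numpy as np

def tangent_plane_distances(vertex, normal, neighbors):
    """Signed Euclidean distances from neighboring vertices to the
    tangent plane at `vertex` (the plane through `vertex` with unit
    normal `normal`): d_i = (p_i - v) . n."""
    n = np.asarray(normal, float)
    n = n / np.linalg.norm(n)
    return (np.asarray(neighbors, float) - np.asarray(vertex, float)) @ n

def normal_angles(normal, neighbor_normals):
    """Angles (radians) between the vertex normal and its neighbors'
    normals, the second geometric measure."""
    n = np.asarray(normal, float)
    n = n / np.linalg.norm(n)
    N = np.asarray(neighbor_normals, float)
    N = N / np.linalg.norm(N, axis=1, keepdims=True)
    return np.arccos(np.clip(N @ n, -1.0, 1.0))

v = np.array([0.0, 0.0, 0.0])
n = np.array([0.0, 0.0, 1.0])
nbrs = np.array([[1, 0, 0.0], [0, 1, 0.2], [1, 1, -0.1]])
print(tangent_plane_distances(v, n, nbrs))   # -> [ 0.   0.2 -0.1]
print(normal_angles(n, np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])))  # 0 and pi/2
```

Flat regions yield near-zero distances and angles, while edges and corners produce large values in one or both measures, which is what lets the response function separate true interest points from edges and flat areas.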
Profile Based Sub-Image Search in Image Databases
Sub-image search with high accuracy in natural images still remains a
challenging problem. This paper proposes a new feature vector called profile
for a keypoint in a bag of visual words model of an image. The profile of a
keypoint captures the spatial geometry of all the other keypoints in an image
with respect to itself, and is very effective in discriminating true matches
from false matches. Sub-image search using profiles is a single-phase process
requiring no geometric validation, yields high precision on natural images, and
works well with a small visual codebook. The proposed search technique differs from
traditional methods that first generate a set of candidates disregarding
spatial information and then verify them geometrically. Conventional methods
also use large codebooks. We achieve a precision of 81% on a combined data set
of synthetic and real natural images using a codebook size of 500 for top-10
queries; that is 31% higher than the conventional candidate generation
approach.
Comment: Sub-Image Retrieval, New Feature Vector, Similarity
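One plausible reading of a keypoint profile, the spatial geometry of all other keypoints relative to a given one, is a histogram over their relative polar angles. The sketch below is an illustrative reconstruction along those lines, not the paper's exact feature.

```python
import numpy as np

def keypoint_profile(idx, keypoints, n_bins=8):
    """Illustrative 'profile' of keypoint `idx`: a normalized histogram
    of the polar angles of every other keypoint relative to it.
    (A reconstruction for illustration, not the paper's definition.)"""
    kp = np.asarray(keypoints, float)
    rel = np.delete(kp, idx, axis=0) - kp[idx]     # offsets to other keypoints
    angles = np.arctan2(rel[:, 1], rel[:, 0])      # polar angles in (-pi, pi]
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    return hist / hist.sum()

pts = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
p = keypoint_profile(0, pts)
print(p)   # a normalized spatial-layout descriptor over 8 angular bins
```

A descriptor of this kind encodes spatial layout directly in the feature, which is why profile matching can skip the separate geometric verification phase used by conventional candidate-then-verify pipelines.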
Enhancing the retrieval performance by combining the texture and edge features
In this paper, a new algorithm based on geometrical moments and local
binary patterns (LBP) for content-based image retrieval (CBIR) is proposed. In
geometrical moments, each vector is compared with all the other vectors for
edge map generation. The same concept is utilized in the LBP calculation,
which generates nine LBP patterns from a given 3x3 pattern. Finally, nine LBP
histograms are calculated which are used as a feature vector for image
retrieval. Moments are important features used in recognition of different
types of images. Two experiments have been carried out to demonstrate the
worth of our algorithm. The results show a significant improvement in terms of
the evaluation measures compared to LBP and other existing transform-domain
techniques.
Comment: 7 pages, 8 figures, one table
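The basic LBP code, on which the paper's nine-pattern variant builds, compares each of the eight neighbors of a 3x3 patch to its center pixel and packs the comparison bits into one byte. A minimal sketch (the neighbor ordering is a convention chosen here for illustration):

```python
import numpy as np

def lbp_code(patch3x3):
    """Basic local binary pattern of a 3x3 patch: threshold the 8
    neighbors at the center value and pack the bits clockwise
    from the top-left pixel into one byte."""
    p = np.asarray(patch3x3)
    center = p[1, 1]
    # Clockwise neighbor order starting at the top-left pixel
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [int(p[r, c] >= center) for r, c in order]
    return sum(b << i for i, b in enumerate(bits))

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(lbp_code(patch))  # -> 241
```

An image-level LBP feature is then the histogram of these codes over all 3x3 windows; the retrieval feature in the paper concatenates nine such histograms.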
What is the right way to represent document images?
In this article we study the problem of document image representation based
on visual features. We propose a comprehensive experimental study that compares
three types of visual document image representations: (1) traditional so-called
shallow features, such as the RunLength and the Fisher-Vector descriptors, (2)
deep features based on Convolutional Neural Networks, and (3) features
extracted from hybrid architectures that take inspiration from the two previous
ones.
We evaluate these features in several tasks (i.e. classification, clustering,
and retrieval) and in different setups (e.g. domain transfer) using several
public and in-house datasets. Our results show that deep features generally
outperform other types of features when there is no domain shift and the new
task is closely related to the one used to train the model. However, when a
large domain or task shift is present, the Fisher-Vector shallow features
generalize better and often obtain the best results.
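The RunLength descriptor mentioned above summarizes a binarized document image through the lengths of consecutive pixel runs, which captures layout statistics such as text-line and whitespace structure. A minimal horizontal-run sketch (our simplified illustration, not the exact descriptor used in the study):

```python
import numpy as np

def runlength_histogram(binary_img, max_len=8):
    """Histogram of horizontal run lengths of foreground (1) pixels in
    a binarized document image; runs longer than max_len share the
    last bin. A simplified version of run-length features."""
    hist = np.zeros(max_len, dtype=int)
    for row in np.asarray(binary_img):
        run = 0
        for px in row:
            if px:
                run += 1
            elif run:
                hist[min(run, max_len) - 1] += 1
                run = 0
        if run:                                  # run ending at the row edge
            hist[min(run, max_len) - 1] += 1
    return hist

img = np.array([[1, 1, 0, 1, 0, 0],
                [1, 1, 1, 1, 0, 1]])
print(runlength_histogram(img))  # runs of length 2,1,4,1 -> [2 1 0 1 0 0 0 0]
```

Full run-length features also accumulate vertical and background runs; such shallow statistics need no training data, which is consistent with the finding that they transfer better across domains than features tied to a training task.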