A biologically inspired spiking model of visual processing for image feature detection
To enable fast, reliable feature matching or tracking in scenes, features need to be discrete and meaningful; hence edge or corner features, commonly called interest points, are often used for this purpose. Experimental research has shown that biological vision systems use neuronal circuits to extract particular features such as edges or corners from visual scenes. Inspired by this biological behaviour, this paper proposes a biologically inspired spiking neural network for image feature extraction. Standard digital images are processed and converted to spikes in a manner similar to the processing that transforms light into spikes in the retina. Using a hierarchical spiking network, various types of biologically inspired receptive fields are used to extract progressively complex image features. The performance of the network is assessed by examining the repeatability of extracted features, with visual results presented for both synthetic and real images.
Making Affine Correspondences Work in Camera Geometry Computation
Local features, e.g. SIFT and its affine and learned variants, provide region-to-region rather than point-to-point correspondences. This has recently been exploited to create new minimal solvers for classical problems such as homography, essential and fundamental matrix estimation. The main advantage of such solvers is that their sample size is smaller, e.g., only two instead of four matches are required to estimate a homography. Works proposing such solvers often claim a significant improvement in run-time thanks to fewer RANSAC iterations. We show that this argument is not valid in practice if the solvers are used naively. To overcome this, we propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline. We propose a method for refining the local feature geometries by symmetric intensity-based matching, combine uncertainty propagation inside RANSAC with preemptive model verification, show a general scheme for computing uncertainty of minimal solvers' results, and adapt the sample cheirality check for homography estimation. Our experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times when following our guidelines. We make code available at https://github.com/danini/affine-correspondences-for-camera-geometry
Efficient and Scalable 4-th order Match Propagation
We propose a robust method to match image feature points taking into account geometric consistency. It is a careful adaptation of the match propagation principle to 4th-order geometric constraints (match quadruple consistency). With our method, a set of matches is explained by a network of locally-similar affinities. This approach is useful when simple descriptor-based matching strategies fail, in particular for highly ambiguous data, e.g., with repetitive patterns or where texture is lacking. As it scales easily to hundreds of thousands of matches, it is also useful when denser point distributions are sought, e.g., for high-precision rigid model estimation. Experiments show that our method is competitive (efficient, scalable, accurate, robust) against state-of-the-art methods in deformable object matching, camera calibration and pattern detection.
Image Retrieval based on Bag-of-Words model
This article surveys the bag-of-words (BoW), or bag-of-features, model in image retrieval systems. In recent years, large-scale image retrieval has shown significant potential in both industrial applications and research problems. As local descriptors like SIFT demonstrate great discriminative power in solving vision problems like object recognition, image classification and annotation, more and more state-of-the-art large-scale image retrieval systems rely on them. A common way to achieve this is to first quantize local descriptors into visual words, and then apply scalable textual indexing and retrieval schemes; this model is called the bag-of-words or bag-of-features model. The goal of this survey is to give an overview of this model and to introduce different strategies for building systems based on it.
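The quantize-then-index pipeline described in the survey can be sketched in a few lines. The following is an illustrative toy implementation (not code from the surveyed systems): a small k-means clustering builds the visual vocabulary, and each image is then represented as a normalized histogram of visual-word occurrences. All function names are my own.

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Toy k-means over local descriptors to obtain k visual words."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest visual word
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Quantize one image's descriptors and return an L1-normalized word histogram."""
    dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    labels = dists.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

Once images are reduced to such histograms, standard inverted-file text indexing applies directly, which is what makes the model scale.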
View point robust visual search technique
In this thesis, we have explored visual search techniques for images taken from different viewpoints and have tried to enhance matching capability under viewpoint changes. We have proposed homography-based back-projection as a post-processing stage of Compact Descriptors for Visual Search (CDVS), the new MPEG standard; moreover, we have defined affine-adapted scale-space-based affine detection, which steers the Gaussian scale space to capture features from affine-transformed images; we have also developed the corresponding gradient-based affine descriptor. Using these proposed techniques, the robustness of image retrieval to affine transformations has been significantly improved.
The first chapter of this thesis introduces the background on visual search.
In the second chapter, we propose a homography-based back-projection used as the post-processing stage of CDVS to improve resilience to viewpoint changes. The theory behind this proposal is that each perspective projection of the image of a 2D object can be simulated as an affine transformation. Each pair of affine transformations is mathematically related by a homography matrix. Given that matrix, the image can be back-projected to simulate the image from another viewpoint. In this way, truly matching images can be declared as matching, because the perspective distortion has been reduced by the back-projection. An accurate homography estimation from images of different viewpoints requires at least 4 correspondences, which can be offered by the CDVS pipeline. The homography-based back-projection can thus be used to scrutinize images without enough matched keypoints: if they contain a homography relation, the perspective distortion can be reduced by exploiting the few provided correspondences. In experiments, this technique has proved to be quite effective, especially for images of 2D objects.
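The 4-correspondence estimation mentioned above is the classical Direct Linear Transform (DLT). The sketch below is a generic, unnormalized DLT and a point back-projection helper, given for illustration only; it is not the thesis code, and the function names are my own.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src from >= 4 point pairs (DLT).

    Each correspondence contributes two rows of the linear system A h = 0;
    the solution is the right singular vector of A with smallest singular value.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def back_project(pts, H):
    """Warp 2D points through H (homogeneous multiply, then dehomogenize)."""
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:3]
```

With the estimated H, one image can be warped to simulate the other viewpoint before re-matching, which is the essence of the back-projection stage described above. A production pipeline would add coordinate normalization and RANSAC around this core.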
The third chapter introduces the scale space, which is also the kernel of feature detection in scale-invariant visual search techniques. The scale space, made of a series of Gaussian-blurred images, represents image structures at different levels of detail. The Gaussian-smoothed images in the scale space result in feature detection that is not invariant to affine transformations; that is the reason why scale-invariant visual search techniques are sensitive to affine transformations. Thus, in this chapter, we propose an affine-adapted scale space, which employs affine-steered Gaussian filters to smooth the images. This scale space is flexible to different affine transformations and represents well the image structures from different viewpoints. With the help of this structure, features from different viewpoints can be well captured.
In practice, scale-invariant visual search techniques employ a pyramid structure to speed up construction. Employing the affine Gaussian scale space principles, we also propose two structures to build the affine Gaussian scale space. The structure of the affine Gaussian scale space is similar to the pyramid structure because of similar sampling and cascading properties. Conversely, the affine Laplacian of Gaussian (LoG) structure is completely different: the Laplacian operator is hard to deform affinely under an affine transformation. Differently from a simple Laplacian operation on the scale space, which yields the general LoG construction, the affine LoG can only be obtained by affine LoG convolution and cascade implementations on the affine scale space. Using our proposed structures, both the affine Gaussian scale space and the affine LoG can be constructed.
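The "affine-steered Gaussian filter" idea can be made concrete with an anisotropic Gaussian kernel whose covariance is shaped by an affine map A, i.e. Σ = σ²AAᵀ. The snippet below is a minimal illustrative sketch under that assumption, not the thesis implementation; the function name is my own.

```python
import numpy as np

def affine_gaussian_kernel(A, sigma=1.0, radius=8):
    """2D Gaussian kernel steered by affine map A: covariance Sigma = sigma^2 * A @ A.T.

    For A = identity this reduces to the ordinary isotropic Gaussian;
    a shear or anisotropic scaling in A elongates and rotates the kernel.
    """
    Sigma = (sigma ** 2) * (A @ A.T)
    P = np.linalg.inv(Sigma)  # precision matrix of the Gaussian
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    # quadratic form [x y] P [x y]^T evaluated on the grid
    q = P[0, 0] * xs * xs + 2 * P[0, 1] * xs * ys + P[1, 1] * ys * ys
    k = np.exp(-0.5 * q)
    return k / k.sum()  # normalize so the filter preserves mean intensity
```

Convolving an image with such kernels for a family of affine maps A gives one level of an affine-adapted scale space; varying σ as usual then builds the full stack.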
We have also explored the affine scale space implementation in the frequency domain. In the second chapter, we also explore the spectrum of Gaussian image smoothing under affine transformation, and propose two structures. Generally speaking, the implementation in the frequency domain is more robust to affine transformations, at the expense of a higher computational complexity.
It makes sense to adopt an affine descriptor for affine-invariant visual search. In the fourth chapter, we propose an affine-invariant feature descriptor based on the affine gradient. Currently, state-of-the-art feature descriptors, including SIFT and the Gradient Location and Orientation Histogram (GLOH), are based on histograms of the image gradient around the detected features. If the image gradient is calculated as the difference of adjacent pixels, it is not affine invariant. Thus, in that chapter, we first propose an affine gradient, which contributes affine invariance to the descriptor. This affine gradient is calculated directly as the derivative of the affine-Gaussian-blurred images. To simplify the processing, we also create the corresponding affine Gaussian derivative filters for the different detected scales, to quickly generate the affine gradient. With this affine gradient, we can apply the same scheme as the SIFT descriptor to generate the gradient histogram. By normalizing the histogram, the affine descriptor can then be formed. This affine descriptor is not only affine invariant but also rotation invariant, because the direction of the area used to form the histogram is determined by the main direction of the gradient around the features. In practice, this affine descriptor is fully affine invariant and its performance for image matching is very good.
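The SIFT-style histogram scheme referenced above amounts to binning gradient orientations weighted by gradient magnitude, then normalizing. The following is a generic illustrative sketch of that step on a single patch (using plain finite-difference gradients, not the thesis's affine gradient); the function name is my own.

```python
import numpy as np

def gradient_orientation_histogram(patch, bins=8):
    """SIFT-style orientation histogram: bin gradient angles, weighted by magnitude."""
    gy, gx = np.gradient(patch.astype(float))  # derivatives along rows (y) and cols (x)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # orientation in [0, 2*pi)
    idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist  # L2-normalize, as in SIFT
```

In the thesis's descriptor, `gx` and `gy` would instead come from the affine Gaussian derivative filters, and the patch orientation would first be rotated to the dominant gradient direction to obtain rotation invariance.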
In the conclusions chapter, we draw some conclusions and describe some future work.