Comparing Feature Detectors: A bias in the repeatability criteria, and how to correct it
Most computer vision applications rely on algorithms that find local
correspondences between different images. These algorithms detect and compare
stable local invariant descriptors centered at scale-invariant keypoints.
Because of the importance of the problem, new keypoint detectors and
descriptors are constantly being proposed, each one claiming to perform better
than (or to be complementary to) the preceding ones. This raises the question of a
fair comparison between very diverse methods. This evaluation has been mainly
based on a repeatability criterion of the keypoints under a series of image
perturbations (blur, illumination, noise, rotations, homotheties, homographies,
etc). In this paper, we argue that the classic repeatability criterion is
biased towards algorithms producing redundant overlapped detections. To
compensate for this bias, we propose a variant of the repeatability rate taking
into account the descriptors overlap. We apply this variant to revisit the
popular benchmark by Mikolajczyk et al., on classic and new feature detectors.
Experimental evidence shows that the hierarchy of these feature detectors is
severely disrupted by the amended comparator.
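As a toy illustration of the bias (not the paper's exact protocol, which uses elliptical regions projected through the ground-truth homography, nor its exact descriptor-overlap correction), consider circular keypoint regions and a pairwise overlap criterion. Counting every overlapping pair lets a cluster of redundant detections generate many correspondences; a greedy one-to-one matching is one simple way to discount them. All names below are illustrative:

```python
import math

def circle_iou(k1, k2):
    """Intersection-over-union of two circular keypoint regions (x, y, r)."""
    (x1, y1, r1), (x2, y2, r2) = k1, k2
    d = math.hypot(x2 - x1, y2 - y1)
    if d >= r1 + r2:
        return 0.0
    if d <= abs(r1 - r2):                      # one circle inside the other
        inter = math.pi * min(r1, r2) ** 2
    else:                                      # lens area of two intersecting circles
        a1 = r1 * r1 * math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
        a2 = r2 * r2 * math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
        tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                              * (d - r1 + r2) * (d + r1 + r2))
        inter = a1 + a2 - tri
    union = math.pi * (r1 * r1 + r2 * r2) - inter
    return inter / union

def classic_repeatability(kps_a, kps_b, min_iou=0.6):
    """Simplified classic criterion: count every pair (a, b) that overlaps
    enough.  Redundant clusters generate many pairs and inflate the score."""
    corresp = sum(1 for ka in kps_a for kb in kps_b
                  if circle_iou(ka, kb) >= min_iou)
    return corresp / min(len(kps_a), len(kps_b))

def nonredundant_repeatability(kps_a, kps_b, min_iou=0.6):
    """Greedy one-to-one matching: each detection in B can explain at most
    one keypoint in A, so a redundant cluster no longer counts repeatedly."""
    free = list(kps_b)
    matched = 0
    for ka in kps_a:
        best = max(free, key=lambda kb: circle_iou(ka, kb), default=None)
        if best is not None and circle_iou(ka, best) >= min_iou:
            matched += 1
            free.remove(best)
    return matched / min(len(kps_a), len(kps_b))
```

With a cluster of three nearly coincident detections in both images, the pairwise count yields nine correspondences against a denominator of three, while the one-to-one variant caps the score at 1.0.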
Rotationally invariant 3D shape contexts using asymmetry patterns
This paper presents an approach to resolve the azimuth ambiguity of 3D Shape Contexts (3DSC) based on asymmetry patterns. We show that it is possible to provide rotational invariance to 3DSC at the expense of a marginal increase in computational load, outperforming previous algorithms dealing with the azimuth ambiguity. We build on a recently presented measure of approximate rotational symmetry in 2D, defined as the overlapping area between a shape and rotated versions of itself, to extract asymmetry patterns from a 3DSC in a variety of ways, depending on the spatial relationships that need to be highlighted or disabled. Thus, we define Asymmetry Patterns Shape Contexts (APSC) from a subset of the possible spatial relations present in the spherical grid of 3DSC; hence they can be thought of as a family of descriptors that depend on the subset that is selected. This provides great flexibility to derive different descriptors. We show that choosing the appropriate spatial patterns can considerably reduce the errors obtained with 3DSC when targeting specific types of points.
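The 2D measure the approach builds on, the overlapping area between a shape and rotated copies of itself, can be sketched on a discrete grid. The function names and the cell-snapping discretisation below are illustrative assumptions; the paper's actual construction operates on the spherical grid of a 3DSC, which is not reproduced here:

```python
import math

def rotate_cell(x, y, theta, cx, cy):
    """Rotate an integer grid cell about (cx, cy) and snap back to the grid."""
    dx, dy = x - cx, y - cy
    rx = dx * math.cos(theta) - dy * math.sin(theta)
    ry = dx * math.sin(theta) + dy * math.cos(theta)
    return round(cx + rx), round(cy + ry)

def asymmetry(shape, theta):
    """1 - overlap(shape, shape rotated by theta) / |shape|.
    'shape' is a set of (x, y) grid cells; 0 means perfectly symmetric
    under this rotation."""
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    rotated = {rotate_cell(x, y, theta, cx, cy) for (x, y) in shape}
    return 1.0 - len(shape & rotated) / len(shape)

def asymmetry_pattern(shape, n=8):
    """Vector of asymmetries at n evenly spaced rotations: the kind of
    per-angle signature that APSC-style descriptors collect."""
    return [asymmetry(shape, 2 * math.pi * k / n) for k in range(1, n)]
```

A square scores 0 at every quarter-turn, while an L-shaped region scores strictly positive, which is exactly the directional information an asymmetry pattern is meant to capture.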
Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Designing discriminative powerful texture features robust to realistic
imaging conditions is a challenging computer vision problem with many
applications, including material recognition and analysis of satellite or
aerial imagery. In the past, most texture description approaches were based on
dense orderless statistical distribution of local features. However, most
recent approaches to texture recognition and remote sensing scene
classification are based on Convolutional Neural Networks (CNNs). The de facto
practice when learning these CNN models is to use RGB patches as input with
training performed on large amounts of labeled data (ImageNet). In this paper,
we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained
using mapped coded images with explicit texture information provide
complementary information to the standard RGB deep models. Additionally, two
deep architectures, namely early and late fusion, are investigated to combine
the texture and color information. To the best of our knowledge, we are the
first to investigate Binary Patterns encoded CNNs and different deep network
fusion architectures for texture recognition and remote sensing scene
classification. We perform comprehensive experiments on four texture
recognition datasets and four remote sensing scene classification benchmarks:
UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with
7 categories and the recently introduced large scale aerial image dataset (AID)
with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary
information to the standard RGB deep model of the same network architecture. Our
late fusion TEX-Net architecture always improves the overall performance
compared to the standard RGB network on both recognition problems. Our final
combination outperforms the state-of-the-art without employing fine-tuning or
ensemble of RGB network architectures.
Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensing
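The "Binary Patterns" coding underlying TEX-Nets is the standard Local Binary Pattern operator, which thresholds each pixel's 8 neighbours against the centre to form an 8-bit code. The sketch below shows only this coding step on a nested-list grayscale image; how the paper maps the coded image into CNN input channels is not reproduced here, and the function name is an assumption:

```python
def lbp_image(gray):
    """8-neighbour Local Binary Pattern code for each interior pixel.
    'gray' is a list of rows of intensities; codes lie in [0, 255].
    Border pixels, which lack a full neighbourhood, are left as 0."""
    h, w = len(gray), len(gray[0])
    # Neighbours visited in a fixed circular order; bit i is set when
    # neighbour i is at least as bright as the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if gray[y + dy][x + dx] >= c:
                    code |= 1 << bit
            codes[y][x] = code
    return codes
```

On a flat patch every neighbour ties with the centre, so all interior codes are 255; on a horizontal intensity ramp the code marks exactly the neighbours on the brighter side.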