Clique descriptor of affine invariant regions for robust wide baseline image matching
Assuming that the image distortion between corresponding regions of a wide-baseline stereo pair can be approximated by an affine transformation when the regions are reasonably small, recent image matching algorithms have focused on affine invariant region (IR) detection and description to increase matching robustness. However, the distinctiveness of an intensity-based region descriptor tends to deteriorate when an image contains homogeneous texture or repetitive patterns. To address this problem, we investigate the geometry of a local IR cluster (also called a clique) and propose a new clique-based image matching method. In the proposed method, the clique of an IR is estimated by Delaunay triangulation in a local affine frame, and the Hausdorff distance is adopted for matching an inexact number of multiple descriptor vectors. We also introduce two adaptively weighted clique distances, where the neighbour distance in a clique is weighted according to characteristics of the local feature distribution. Experimental results show that the clique-based matching method produces more tentative correspondences than variants of the SIFT-based method.
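The Hausdorff distance mentioned in the abstract above compares two point sets of unequal size, which is why it suits cliques with differing numbers of descriptor vectors. The following is a minimal sketch of the standard, unweighted Hausdorff distance under the Euclidean metric; it does not reproduce the paper's adaptively weighted clique distances, and the function names and toy 2-D "descriptors" are illustrative assumptions.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def directed_hausdorff(A, B):
    """Largest distance from any vector in A to its nearest vector in B."""
    return max(min(euclidean(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two non-empty sets of vectors."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

# Toy cliques (hypothetical data): the second has one extra member.
clique_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
clique_b = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 0.0)]
print(hausdorff(clique_a, clique_b))
```

The sets need not have the same number of elements, but a single outlying vector dominates the result, which is one motivation for weighted variants.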
Semantic Cross-View Matching
Matching cross-view images is challenging because the appearance and
viewpoints are significantly different. While low-level features based on
gradient orientations or filter responses can drastically vary with such
changes in viewpoint, the semantic information of an image remains invariant
in this respect. Consequently, semantically labeled regions can
be used for performing cross-view matching. In this paper, we therefore explore
this idea and propose an automatic method for detecting and representing the
semantic information of an RGB image with the goal of performing cross-view
matching with a (non-RGB) geographic information system (GIS). A segmented
image forms the input to our system with segments assigned to semantic concepts
such as traffic signs, lakes, roads, foliage, etc. We design a descriptor to
robustly capture both the presence of semantic concepts and the spatial layout
of those segments. Pairwise distances between the descriptors extracted from
the GIS map and the query image are then used to generate a shortlist of the
most promising locations with similar semantic concepts in a consistent spatial
layout. An experimental evaluation with challenging query images and a large
urban area shows promising results.
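The shortlist step described in the abstract above (descriptors from map locations compared pairwise with the query descriptor) can be illustrated with a toy example. This is a sketch under stated assumptions, not the authors' descriptor: it assumes a descriptor made of per-cell class histograms on a regular grid and an L1 distance, and every name and label map below is hypothetical.

```python
def layout_descriptor(label_map, num_classes, grid=2):
    """Concatenate normalised class histograms over grid x grid cells.

    label_map is a 2-D list of integer class ids (one per pixel)."""
    rows, cols = len(label_map), len(label_map[0])
    desc = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            hist = [0.0] * num_classes
            for r in range(r0, r1):
                for c in range(c0, c1):
                    hist[label_map[r][c]] += 1.0
            total = sum(hist) or 1.0  # avoid dividing by zero for empty cells
            desc.extend(h / total for h in hist)
    return desc

def l1(u, v):
    """L1 distance between two descriptors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))

def shortlist(query_desc, candidates, k=2):
    """Return the k candidate names whose descriptors are closest to the query."""
    ranked = sorted(candidates, key=lambda name: l1(query_desc, candidates[name]))
    return ranked[:k]

# Hypothetical classes: 0 = road, 1 = foliage, 2 = lake.
query = layout_descriptor([[0, 1], [0, 2]], num_classes=3)
candidates = {
    "loc_a": layout_descriptor([[0, 1], [0, 2]], num_classes=3),
    "loc_b": layout_descriptor([[1, 1], [0, 2]], num_classes=3),
    "loc_c": layout_descriptor([[2, 2], [1, 1]], num_classes=3),
}
print(shortlist(query, candidates, k=2))
```

Keeping the histograms per cell, rather than pooling them over the whole image, is what lets the distance penalise the right concepts in the wrong places.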
Wide baseline stereo matching with convex bounded-distortion constraints
Finding correspondences in wide baseline setups is a challenging problem.
Existing approaches have focused largely on developing better feature
descriptors for correspondence and on accurate recovery of epipolar line
constraints. This paper focuses on the challenging problem of finding
correspondences once approximate epipolar constraints are given. We introduce a
novel method that integrates a deformation model. Specifically, we formulate
the problem as finding the largest number of corresponding points related by a
bounded distortion map that obeys the given epipolar constraints. We show that,
while the set of bounded distortion maps is not convex, the subset of maps that
obey the epipolar line constraints is convex, allowing us to introduce an
efficient algorithm for matching. We further utilize a robust cost function for
matching and employ majorization-minimization for its optimization. Our
experiments indicate that our method finds significantly more accurate maps
than existing approaches.
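To make the term "bounded distortion" concrete, the sketch below measures the distortion of a single 2x2 linear map (for example, the Jacobian of a piecewise-affine map on one triangle). It assumes a common definition, which the abstract itself does not spell out: the map must preserve orientation (positive determinant) and the ratio of its largest to smallest singular value must not exceed a bound K. The function names are illustrative, and the epipolar constraints and the optimisation are not modelled here.

```python
import math

def singular_values_2x2(m):
    """Closed-form singular values (largest, smallest) of [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    s = math.hypot(a + d, b - c)
    t = math.hypot(a - d, b + c)
    return (s + t) / 2.0, abs(s - t) / 2.0

def distortion(m):
    """Ratio of largest to smallest singular value (infinite if singular)."""
    s_max, s_min = singular_values_2x2(m)
    return math.inf if s_min == 0.0 else s_max / s_min

def is_bounded_distortion(m, K):
    """True if m preserves orientation and its distortion is at most K."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return det > 0.0 and distortion(m) <= K

stretch = [[2.0, 0.0], [0.0, 1.0]]   # stretches x by 2: distortion 2
flip = [[1.0, 0.0], [0.0, -1.0]]     # reflection: orientation-reversing
print(distortion(stretch), is_bounded_distortion(stretch, K=3.0))
```

A pure rotation or uniform scaling has distortion 1, so K controls how far each local map may depart from a similarity transform.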
Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation
How do computers and intelligent agents view the world around them? Feature
extraction and representation constitute one of the basic building blocks towards
answering this question. Traditionally, this has been done with carefully
engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is
no ``one size fits all'' approach that satisfies all requirements. In recent
years, the rising popularity of deep learning has resulted in a myriad of
end-to-end solutions to many computer vision problems. These approaches, while
successful, tend to lack scalability and can't easily exploit information
learned by other systems. Instead, we propose SAND features, a dedicated deep
learning solution to feature extraction capable of providing hierarchical
context information. This is achieved by employing sparse relative labels
indicating relationships of similarity/dissimilarity between image locations.
The nature of these labels results in an almost infinite set of dissimilar
examples to choose from. We demonstrate how the selection of negative examples
during training can be used to modify the feature space and vary its
properties. To demonstrate the generality of this approach, we apply the
proposed features to a multitude of tasks, each requiring different properties.
This includes disparity estimation, semantic segmentation, self-localisation
and SLAM. In all cases, we show that incorporating SAND features yields
results better than or comparable to the baseline, whilst requiring little to no
additional training. Code can be found at:
https://github.com/jspenmar/SAND_features
Comment: CVPR201
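Training from relative labels of similarity/dissimilarity, as the abstract above describes, is often done with a contrastive loss. The sketch below shows the standard margin-based form for a single pair of feature vectors; that SAND uses exactly this form is an assumption, since the abstract does not name the loss, and the function names and margin value are illustrative.

```python
import math

def feature_distance(f1, f2):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def contrastive_pair_loss(f1, f2, similar, margin=1.0):
    """Margin-based contrastive loss for one labelled pair.

    Similar pairs are pulled together; dissimilar pairs are pushed
    apart until they are at least `margin` away from each other."""
    d = feature_distance(f1, f2)
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

# Hypothetical 2-D features at three image locations.
anchor = (0.0, 0.0)
near_negative = (0.5, 0.0)   # inside the margin: contributes to the loss
far_negative = (2.0, 0.0)    # beyond the margin: contributes nothing
print(contrastive_pair_loss(anchor, near_negative, similar=False))
```

Because negatives beyond the margin contribute nothing, which negatives are sampled determines where the loss acts, consistent with the abstract's remark that negative selection shapes the feature space.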