4,476 research outputs found
Coding local and global binary visual features extracted from video sequences
Binary local features represent an effective alternative to real-valued
descriptors, leading to comparable results for many visual analysis tasks,
while being characterized by significantly lower computational complexity and
memory requirements. When dealing with large collections, a more compact
representation based on global features is often preferred, which can be
obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW)
model. Several applications, including for example visual sensor networks and
mobile augmented reality, require visual features to be transmitted over a
bandwidth-limited network, thus calling for coding techniques that aim at
reducing the required bit budget, while attaining a target level of efficiency.
In this paper we investigate a coding scheme tailored to both local and global
binary features, which aims at exploiting both spatial and temporal redundancy
by means of intra- and inter-frame coding. In this respect, the proposed coding
scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC)
paradigm. That is, visual features are extracted from the acquired content,
encoded at remote nodes, and finally transmitted to a central controller that
performs visual analysis. This is in contrast with the traditional approach, in
which visual content is acquired at a node, compressed and then sent to a
central unit for further processing, according to the Compress-Then-Analyze
(CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of
rate-efficiency curves in the context of two different visual analysis tasks:
homography estimation and content-based retrieval. Our results show that the
novel ATC paradigm based on the proposed coding primitives can be competitive
with CTA, especially in bandwidth limited scenarios.Comment: submitted to IEEE Transactions on Image Processin
Adding Cues to Binary Feature Descriptors for Visual Place Recognition
In this paper we propose an approach to embed continuous and selector cues in
binary feature descriptors used for visual place recognition. The embedding is
achieved by extending each feature descriptor with a binary string that encodes
a cue and supports the Hamming distance metric. Augmenting the descriptors in
such a way has the advantage of being transparent to the procedure used to
compare them. We present two concrete applications of our methodology,
demonstrating the two considered types of cues. In addition to that, we
conducted on these applications a broad quantitative and comparative evaluation
covering five benchmark datasets and several state-of-the-art image retrieval
approaches in combination with various binary descriptor types.Comment: 8 pages, 8 figures, source: www.gitlab.com/srrg-software/srrg_bench,
submitted to ICRA 201
High-Precision Localization Using Ground Texture
Location-aware applications play an increasingly critical role in everyday
life. However, satellite-based localization (e.g., GPS) has limited accuracy
and can be unusable in dense urban areas and indoors. We introduce an
image-based global localization system that is accurate to a few millimeters
and performs reliable localization both indoors and outside. The key idea is to
capture and index distinctive local keypoints in ground textures. This is based
on the observation that ground textures including wood, carpet, tile, concrete,
and asphalt may look random and homogeneous, but all contain cracks, scratches,
or unique arrangements of fibers. These imperfections are persistent, and can
serve as local features. Our system incorporates a downward-facing camera to
capture the fine texture of the ground, together with an image processing
pipeline that locates the captured texture patch in a compact database
constructed offline. We demonstrate the capability of our system to robustly,
accurately, and quickly locate test images on various types of outdoor and
indoor ground surfaces
- …