7,136 research outputs found
Aggregated Deep Local Features for Remote Sensing Image Retrieval
Remote Sensing Image Retrieval remains a challenging topic due to the special
nature of Remote Sensing Imagery. Such images contain a variety of
semantic objects, which complicates the retrieval task. In this paper,
we present an image retrieval pipeline that uses attentive, local convolutional
features and aggregates them using the Vector of Locally Aggregated Descriptors
(VLAD) to produce a global descriptor. We study various system parameters such
as the multiplicative and additive attention mechanisms and descriptor
dimensionality. We propose a query expansion method that requires no external
inputs. Experiments demonstrate that even without training, the local
convolutional features and global representation outperform other systems.
After system tuning, we can achieve state-of-the-art or competitive results.
Furthermore, we observe that our query expansion method increases overall
system performance by about 3%, using only the top-three retrieved images.
Finally, we show how dimensionality reduction produces compact descriptors with
increased retrieval performance and fast retrieval computation times, e.g. 50%
faster than current systems.
Comment: Published in Remote Sensing. The first two authors have equal
contribution.
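The aggregation step named in this abstract, VLAD, can be illustrated with a minimal sketch. It assumes plain hard assignment to the nearest centroid and a single global L2 normalisation. The attention weighting, the tuning, and the dimensionality reduction described in the abstract are not reproduced here, and the function name `vlad` and the toy data are illustrative only.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalised VLAD vector.

    Assumption: hard assignment to the nearest centroid, no attention weights.
    """
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Index of the centroid closest to x (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])),
        )
        # Accumulate the residual x - c for the assigned centroid.
        for i in range(d):
            residuals[nearest][i] += x[i] - centroids[nearest][i]
    # Concatenate the per-centroid residual sums and normalise globally.
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy data (hypothetical): two centroids, three 2-D local descriptors.
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
toy_vlad = vlad(toy_descriptors, toy_centroids)
```

The resulting global descriptor has length k × d (here 4), so two images can be compared with a single dot product regardless of how many local features each one produced.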
Coding local and global binary visual features extracted from video sequences
Binary local features represent an effective alternative to real-valued
descriptors, leading to comparable results for many visual analysis tasks,
while being characterized by significantly lower computational complexity and
memory requirements. When dealing with large collections, a more compact
representation based on global features is often preferred, which can be
obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW)
model. Several applications, including for example visual sensor networks and
mobile augmented reality, require visual features to be transmitted over a
bandwidth-limited network, thus calling for coding techniques that aim at
reducing the required bit budget, while attaining a target level of efficiency.
In this paper we investigate a coding scheme tailored to both local and global
binary features, which aims at exploiting both spatial and temporal redundancy
by means of intra- and inter-frame coding. In this respect, the proposed coding
scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC)
paradigm. That is, visual features are extracted from the acquired content,
encoded at remote nodes, and finally transmitted to a central controller that
performs visual analysis. This is in contrast with the traditional approach, in
which visual content is acquired at a node, compressed and then sent to a
central unit for further processing, according to the Compress-Then-Analyze
(CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of
rate-efficiency curves in the context of two different visual analysis tasks:
homography estimation and content-based retrieval. Our results show that the
novel ATC paradigm based on the proposed coding primitives can be competitive
with CTA, especially in bandwidth-limited scenarios.
Comment: submitted to IEEE Transactions on Image Processing
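The idea of exploiting temporal redundancy between binary descriptors can be sketched as follows. This is not the coding scheme of the paper, whose mode decision and entropy coding are not given in the abstract. It only assumes that each descriptor of the current frame is predicted from its nearest descriptor (in Hamming distance) in the previous frame, and that the XOR residual is what would be entropy-coded. Descriptors are modelled as Python integers, and all names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def encode_inter_frame(current, previous):
    """Pair each current descriptor with (reference index, XOR residual)."""
    encoded = []
    for c in current:
        # Reference = closest descriptor of the previous frame.
        ref = min(range(len(previous)), key=lambda i: hamming(c, previous[i]))
        encoded.append((ref, c ^ previous[ref]))
    return encoded

def decode_inter_frame(encoded, previous):
    """Invert the prediction: XOR each residual with its reference descriptor."""
    return [previous[ref] ^ residual for ref, residual in encoded]

# Toy 8-bit descriptors (hypothetical); real binary descriptors are far longer.
prev_frame = [0b10110010, 0b01001101]
curr_frame = [0b10110011, 0b01101101]
coded = encode_inter_frame(curr_frame, prev_frame)
restored = decode_inter_frame(coded, prev_frame)
```

When consecutive frames are similar, the residuals contain few set bits, which is what would make them cheaper to entropy-code than the raw descriptors.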
Multi-Layer Local Graph Words for Object Recognition
In this paper, we propose a new multi-layer structural approach for the task
of object-based image retrieval. In our work we tackle the problem of
structural organization of local features. The structural features we propose
are nested multi-layered local graphs built upon sets of SURF feature points
with Delaunay triangulation. A Bag-of-Visual-Words (BoVW) framework is applied
to these graphs, yielding a Bag-of-Graph-Words representation. The
multi-layer nature of the descriptors consists in scaling from trivial Delaunay
graphs - isolated feature points - by increasing the number of nodes layer by
layer up to graphs with the maximal number of nodes. A separate visual
dictionary is built for each layer of graphs. The experiments conducted on the SIVAL and
Caltech-101 data sets reveal that the graph features at different layers
exhibit complementary performances on the same content and perform better than
the baseline BoVW approach. The combination of all existing layers yields a
significant improvement in object recognition performance compared to
single-level approaches.
Comment: International Conference on MultiMedia Modeling, Klagenfurt, Austria (2012)
Bag-of-Features Image Indexing and Classification in Microsoft SQL Server Relational Database
This paper presents a novel relational database architecture aimed at visual
object classification and retrieval. The framework is based on the
bag-of-features image representation model combined with Support Vector
Machine classification and is integrated in a Microsoft SQL Server database.
Comment: 2015 IEEE 2nd International Conference on Cybernetics (CYBCONF),
Gdynia, Poland, 24-26 June 2015
Image Reconstruction from Bag-of-Visual-Words
The objective of this work is to reconstruct an original image from
Bag-of-Visual-Words (BoVW). Image reconstruction from features can be a means
of identifying the characteristics of features. Additionally, it enables us to
generate novel images via features. Although BoVW is the de facto standard
feature for image recognition and retrieval, successful image reconstruction
from BoVW has not been reported yet. What complicates this task is that BoVW
lacks spatial information about the visual words it includes. As described in this
paper, to estimate an original arrangement, we propose an evaluation function
that incorporates the naturalness of local adjacency and the global position,
with a method to obtain related parameters using an external image database. To
evaluate the performance of our method, we reconstruct images of 101 kinds of
objects. Additionally, we apply our method to analyze object classifiers and to
generate novel images via BoVW.
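The difficulty noted in this abstract, that a BoVW histogram carries no spatial layout, can be seen in a small sketch. It assumes an image is modelled as a grid of visual-word indices. The grids and the function name `bovw_histogram` are hypothetical, and the reconstruction method of the paper is not implemented here.

```python
from collections import Counter

def bovw_histogram(word_grid, vocab_size):
    """Count how often each visual word occurs, discarding where it occurs."""
    counts = Counter(word for row in word_grid for word in row)
    return [counts.get(word, 0) for word in range(vocab_size)]

# Two different spatial arrangements of the same four visual words.
grid_a = [[0, 1],
          [2, 2]]
grid_b = [[2, 0],
          [2, 1]]

hist_a = bovw_histogram(grid_a, vocab_size=3)
hist_b = bovw_histogram(grid_b, vocab_size=3)
```

Both grids map to the same histogram, so recovering an arrangement from BoVW alone is ambiguous. An additional criterion, such as the adjacency and position terms mentioned in the abstract, is needed to choose among the candidates.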
- …