Unsupervised Graph-based Rank Aggregation for Improved Retrieval
This paper presents a robust and comprehensive graph-based rank aggregation
approach, used to combine results of isolated ranker models in retrieval tasks.
The method follows an unsupervised scheme, which is independent of how the
isolated ranks are formulated. Our approach is able to combine arbitrary
models, defined in terms of different ranking criteria, such as those based on
textual, image or hybrid content representations.
We reformulate the ad-hoc retrieval problem as document retrieval based on
fusion graphs, which we propose as a new unified representation model capable
of merging multiple ranks and expressing inter-relationships of retrieval
results automatically. By doing so, we claim that the retrieval system can
benefit from learning the manifold structure of datasets, thus leading to more
effective results. Another contribution is that our graph-based aggregation
formulation, unlike existing approaches, allows for encapsulating contextual
information encoded from multiple ranks, which can be directly used for
ranking, without further computations and post-processing steps over the
graphs. Based on the graphs, a novel similarity retrieval score is formulated
using an efficient computation of minimum common subgraphs. Finally, another
benefit over existing approaches is the absence of hyperparameters.
A comprehensive experimental evaluation was conducted considering diverse
well-known public datasets, composed of textual, image, and multimodal
documents. The experiments demonstrate that our method reaches top
performance, yielding better effectiveness scores than state-of-the-art
baseline methods and promoting large gains over the rankers being fused, thus
demonstrating the capability of the proposal to represent queries based on a
unified graph-based model of rank fusions.
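The fusion-graph idea above can be sketched as follows. This is a heavily simplified illustration, not the authors' exact formulation: each ranker contributes edges between its top-k results, weighted by reciprocal ranks, and two fusion graphs are compared through their weighted common edges, echoing the minimum-common-subgraph score.

```python
from collections import defaultdict

def fusion_graph(rank_lists, k=5):
    """Build a toy fusion graph from several rank lists.

    Illustrative simplification: each ranker adds edges between its
    top-k results, weighted by the product of reciprocal ranks.
    """
    graph = defaultdict(float)
    for ranks in rank_lists:
        top = ranks[:k]
        for i, u in enumerate(top):
            for j, v in enumerate(top):
                if u != v:
                    graph[(u, v)] += 1.0 / ((i + 1) * (j + 1))
    return graph

def graph_similarity(g1, g2):
    """Score two fusion graphs by their weighted common edges,
    a rough stand-in for the minimum-common-subgraph computation."""
    common = set(g1) & set(g2)
    return sum(min(g1[e], g2[e]) for e in common)
```

Note that, as in the paper, the scheme needs no training labels and no hyperparameters beyond the neighborhood size.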
Evidence combination for multi-point query learning in content-based image retrieval
In multi-point query learning, a number of query representatives are selected based on the positive feedback samples. The similarity score for a multi-point query is obtained by merging the individual scores. In this paper, we investigate three different combination strategies and present a comparative evaluation of their performance. Results show that the performance of multi-point queries relies heavily on the right choice of settings for the fusion. Unlike previous results, which suggest that multi-point queries generally perform better than a single query representation, our evaluation results do not allow such an overall conclusion. Instead, our study points to the type of queries for which query expansion is better suited than a single query, and vice versa.
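The score-merging step can be sketched as below. The three strategies shown (max, average, min) are common fusion choices and are assumptions here, not necessarily the exact variants evaluated in the paper.

```python
def fuse_scores(scores, strategy="max"):
    """Merge per-representative similarity scores for a multi-point query.

    'max' keeps the best-matching representative, 'avg' averages over
    all representatives, and 'min' requires agreement with every
    representative. Which variant wins depends on the query type,
    which is the paper's central observation.
    """
    if strategy == "max":
        return max(scores)
    if strategy == "avg":
        return sum(scores) / len(scores)
    if strategy == "min":
        return min(scores)
    raise ValueError(f"unknown strategy: {strategy}")
```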
A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking
Person re-identification is a challenging retrieval task that requires
matching a person's acquired image across non-overlapping camera views. In this
paper we propose an effective approach that incorporates both the fine and
coarse pose information of the person to learn a discriminative embedding. In
contrast to the recent direction of explicitly modeling body parts or
correcting for misalignment based on these, we show that a rather
straightforward inclusion of acquired camera view and/or the detected joint
locations into a convolutional neural network helps to learn a very effective
representation. To increase retrieval performance, re-ranking techniques based
on computed distances have recently gained much attention. We propose a new
unsupervised and automatic re-ranking framework that achieves state-of-the-art
re-ranking performance. We show that, in contrast to the current
state-of-the-art re-ranking methods, our approach does not require computing
new rank lists for each image pair (e.g., based on reciprocal neighbors) and
performs well using a simple direct rank-list comparison, or even just the
already computed Euclidean distances between the images. We show that
both our learned representation and our re-ranking method achieve
state-of-the-art performance on a number of challenging surveillance image and
video datasets.
The code is available online at:
https://github.com/pse-ecn/pose-sensitive-embedding
Comment: CVPR 2018; v2 (fixes, added new results on PRW dataset)
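The re-ranking idea of aggregating distances over each image's neighborhood can be sketched as follows. This is a loose illustration of an expanded-neighborhood distance, not the paper's exact ECN formulation, which differs in how neighbors are expanded and aggregated.

```python
import numpy as np

def ecn_distance(dist, i, j, k=3):
    """Illustrative expanded-neighborhood re-ranking distance.

    Averages the original distances from each image to the other
    image's top-k neighbors, using only a precomputed pairwise
    distance matrix `dist` (no new rank lists per image pair).
    """
    ni = np.argsort(dist[i])[1:k + 1]   # top-k neighbors of i (skip self)
    nj = np.argsort(dist[j])[1:k + 1]   # top-k neighbors of j (skip self)
    return (dist[j, ni].mean() + dist[i, nj].mean()) / 2.0
```

The appeal, as the abstract notes, is that everything is derived from already computed distances, so no supervision or extra pairwise rank-list construction is needed.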
Exploiting feature representations through similarity learning, post-ranking and ranking aggregation for person re-identification
Person re-identification has received special attention by the human analysis
community in the last few years. To address the challenges in this field, many
researchers have proposed different strategies, which basically exploit either
cross-view invariant features or cross-view robust metrics. In this work, we
propose to exploit a post-ranking approach and combine different feature
representations through ranking aggregation. Spatial information, which
potentially benefits the person matching, is represented using a 2D body model,
from which color and texture information are extracted and combined. We also
consider background/foreground information, automatically extracted via Deep
Decompositional Network, and the usage of Convolutional Neural Network (CNN)
features. To describe the matching between images we use the polynomial feature
map, also taking into account local and global information. The Discriminant
Context Information Analysis based post-ranking approach is used to improve
initial ranking lists. Finally, the Stuart ranking aggregation method is
employed to combine complementary ranking lists obtained from different feature
representations. Experimental results demonstrate that we improve the
state-of-the-art on the VIPeR and PRID450s datasets, achieving 67.21% and
75.64% rank-1 recognition rates, respectively, as well as obtaining
competitive results on the CUHK01 dataset.
Comment: Preprint submitted to Image and Vision Computing
Using Apache Lucene to Search Vector of Locally Aggregated Descriptors
Surrogate Text Representation (STR) is a profitable solution for efficient
similarity search in metric spaces using conventional text search engines, such
as Apache Lucene. This technique is based on comparing the permutations of some
reference objects in place of the original metric distance. However, the
Achilles heel of the STR approach is the need to reorder the result set of the
search according to the metric distance. This forces the use of a support
database to store the original objects, which requires efficient random I/O on
fast secondary memory (such as flash-based storage). In this paper, we propose to
extend the Surrogate Text Representation to specifically address a class of
visual metric objects known as Vector of Locally Aggregated Descriptors (VLAD).
This approach is based on representing the individual sub-vectors forming the
VLAD vector with the STR, providing a finer representation of the vector and
enabling us to get rid of the reordering phase. The experiments on a publicly
available dataset show that the extended STR outperforms the baseline STR,
achieving performance close to that obtained with the original VLAD vectors.
Comment: In Proceedings of the 11th Joint Conference on Computer Vision,
Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) -
Volume 4: VISAPP, p. 383-39
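The core permutation-encoding trick behind STR can be sketched as follows. This is an illustrative variant under assumed conventions (the `refN` term names and the repeat-count weighting are inventions for the example): a vector is described by its nearest reference objects, each emitted as a repeated text "term" so that a text engine's term-frequency matching approximates metric similarity.

```python
import numpy as np

def str_encode(vec, refs, k=3):
    """Encode a vector as surrogate text (illustrative sketch).

    Ranks the reference objects `refs` by Euclidean distance to `vec`
    and emits the k nearest as repeated terms, with nearer references
    repeated more often. Indexing these strings in a text engine such
    as Lucene approximates the metric ordering via term overlap.
    """
    d = np.linalg.norm(refs - vec, axis=1)
    order = np.argsort(d)[:k]
    terms = []
    for pos, r in enumerate(order):
        terms.extend([f"ref{r}"] * (k - pos))  # nearer => more repeats
    return " ".join(terms)
```

The paper's extension applies this kind of encoding per sub-vector of a VLAD descriptor rather than to the whole vector, which is what removes the metric-distance reordering phase.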