211 research outputs found

Cross-dimensional Weighting for Aggregated Deep Convolutional Features

Author: A Babenko
A Mikulík
F Perronnin
G Tolias
H Jegou
PH Gosselin
Y Gong
Yannis Avrithis
Publication date: 29/07/2016

We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs. We first present a generalized framework that encompasses a broad family of approaches and includes cross-dimensional pooling and weighting steps. We then propose specific non-parametric schemes for both spatial- and channel-wise weighting that boost the effect of highly active spatial responses and at the same time regulate burstiness effects. We experiment on different public datasets for image search and show that our approach outperforms the current state-of-the-art for approaches based on pre-trained networks. We also provide an easy-to-use, open source implementation that reproduces our results.
Comment: Accepted for publication at the 4th Workshop on Web-scale Vision and Social Media (VSM), ECCV 201
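
The spatial- and channel-wise weighting described in this abstract can be illustrated with a small sketch. The abstract does not give the weighting formulas, so the specific choices below are assumptions made for illustration only: spatial weights are taken as the total activation at each location, and channel weights as the log of an inverse firing frequency, so that channels active almost everywhere (bursty ones) count less. The function name `weighted_aggregate` and the parameter `eps` are hypothetical and do not come from the authors' implementation.

```python
import math

def weighted_aggregate(features, eps=1e-6):
    """Aggregate features[k][y][x] (K channels of H x W non-negative
    activations) into one L2-normalised K-dimensional vector."""
    k_ch, h, w = len(features), len(features[0]), len(features[0][0])
    cells = [(y, x) for y in range(h) for x in range(w)]

    # Spatial weights (assumed form): total activation per location,
    # scaled so the most active location has weight 1.
    total_at = {(y, x): sum(ch[y][x] for ch in features) for y, x in cells}
    peak = max(total_at.values()) or 1.0
    spatial = {c: v / peak for c, v in total_at.items()}

    # Weighted sum-pooling over locations, one value per channel.
    pooled = [sum(spatial[(y, x)] * ch[y][x] for y, x in cells)
              for ch in features]

    # Channel weights (assumed form): channels that fire at many
    # locations are down-weighted, rarely firing ones are boosted.
    fired = [sum(1 for y, x in cells if ch[y][x] > 0) / len(cells)
             for ch in features]
    channel = [math.log((eps * k_ch + sum(fired)) / (eps + q)) for q in fired]

    vec = [cw * p for cw, p in zip(channel, pooled)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Toy input: channel 0 fires everywhere, channel 1 fires once but strongly.
toy = [[[1.0, 1.0], [1.0, 1.0]],
       [[0.0, 0.0], [0.0, 4.0]]]
descriptor = weighted_aggregate(toy)
```

On the toy input the rarely firing channel ends up with the larger component, which is the intended effect of regulating burstiness while boosting highly active responses.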

arXiv.org e-Print Archive

Crossref

Generalized Max Pooling

Author: Murray Naila
Perronnin Florent
Publication date: 01/01/2014

State-of-the-art patch-based image representations involve a pooling operation that aggregates statistics computed from local descriptors. Standard pooling operations include sum- and max-pooling. Sum-pooling lacks discriminability because the resulting representation is strongly influenced by frequent yet often uninformative descriptors, but only weakly influenced by rare yet potentially highly-informative ones. Max-pooling equalizes the influence of frequent and rare descriptors but is only applicable to representations that rely on count statistics, such as the bag-of-visual-words (BOV) and its soft- and sparse-coding extensions. We propose a novel pooling mechanism that achieves the same effect as max-pooling but is applicable beyond the BOV and especially to the state-of-the-art Fisher Vector -- hence the name Generalized Max Pooling (GMP). It involves equalizing the similarity between each patch and the pooled representation, which is shown to be equivalent to re-weighting the per-patch statistics. We show on five public image classification benchmarks that the proposed GMP can lead to significant performance gains with respect to heuristic alternatives.
Comment: (to appear) CVPR 2014 - IEEE Conference on Computer Vision & Pattern Recognition (2014)
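
The idea of equalizing the similarity between each patch and the pooled representation can be sketched as a small regularised least-squares problem: find a pooled vector whose dot product with every patch descriptor is close to 1. The ridge formulation, the dual solve through the Gram matrix, and the names `generalized_max_pool`, `solve_linear` and `lam` are assumptions made for this illustration; the abstract itself does not spell out the optimisation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def sum_pool(patches):
    """Plain sum-pooling: add up the patch descriptors."""
    return [sum(col) for col in zip(*patches)]

def generalized_max_pool(patches, lam=1e-3):
    """Pooled vector xi with dot(p, xi) close to 1 for every patch p.

    Assumed dual form: solve (K + lam * I) alpha = 1, where K is the
    Gram matrix of the patches, then xi = sum_i alpha_i * patch_i,
    i.e. a re-weighting of the per-patch descriptors.
    """
    n = len(patches)
    gram = [[dot(p, q) + (lam if i == j else 0.0)
             for j, q in enumerate(patches)]
            for i, p in enumerate(patches)]
    alpha = solve_linear(gram, [1.0] * n)
    dim = len(patches[0])
    return [sum(alpha[i] * patches[i][d] for i in range(n))
            for d in range(dim)]

# Three copies of a frequent descriptor and one rare descriptor.
patches = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
summed = sum_pool(patches)             # frequent direction dominates
pooled = generalized_max_pool(patches)  # both directions weigh about the same
```

In this toy case sum-pooling gives the frequent descriptor three times the influence of the rare one, while the re-weighted pooling gives both a similarity of roughly 1 to the pooled vector.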

arXiv.org e-Print Archive

CiteSeerX

Crossref

Selective Deep Convolutional Features for Image Retrieval

Author: Cheung Ngai-Man
Do Thanh-Toan
Hoang Tuan
Tan Dang-Khoa Le
Publication date: 27/11/2017

Convolutional Neural Networks (CNNs) are a very powerful approach for extracting discriminative local descriptors for effective image search. Recent work adopts fine-tuning strategies to further improve the discriminative power of the descriptors. Taking a different approach, in this paper, we propose a novel framework to achieve competitive retrieval performance. Firstly, we propose various masking schemes, namely SIFT-mask, SUM-mask, and MAX-mask, to select a representative subset of local convolutional features and remove a large number of redundant features. We demonstrate that this can effectively address the burstiness issue and improve retrieval accuracy. Secondly, we propose to employ recent embedding and aggregating methods to further enhance feature discriminability. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art retrieval accuracy.
Comment: Accepted to ACM MM 201
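
The abstract names the masking schemes without defining them. The sketch below shows one plausible reading of two of them, purely as an assumption: a SUM-style mask keeps locations whose channel-summed activation is at least the median over all locations, and a MAX-style mask keeps, for each channel, the location where that channel peaks. The function names `sum_mask` and `max_mask` are hypothetical, and the SIFT-mask is not covered.

```python
from statistics import median

def sum_mask(features):
    """Keep locations whose activation summed over channels is at least
    the median over all locations (assumed rule, for illustration)."""
    h, w = len(features[0]), len(features[0][0])
    sums = [[sum(ch[y][x] for ch in features) for x in range(w)]
            for y in range(h)]
    threshold = median(v for row in sums for v in row)
    return {(y, x) for y in range(h) for x in range(w)
            if sums[y][x] >= threshold}

def max_mask(features):
    """Keep, for every channel, the location of its largest activation
    (assumed rule, for illustration). Ties go to the first location."""
    kept = set()
    for ch in features:
        cells = ((y, x) for y in range(len(ch)) for x in range(len(ch[0])))
        kept.add(max(cells, key=lambda c: ch[c[0]][c[1]]))
    return kept

# features[k][y][x]: two channels on a 2 x 2 grid.
features = [[[3, 0], [1, 0]],
            [[0, 0], [2, 5]]]
selected_by_sum = sum_mask(features)
selected_by_max = max_mask(features)
```

Either mask returns a set of `(y, x)` locations; only the local descriptors at those locations would then be passed on to embedding and aggregation.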

arXiv.org e-Print Archive

Crossref

Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks

Author: Albatal Rami
Giró-i-Nieto Xavier
Gurrin Cathal
Hu Feiyan
McGuinness Kevin
Mohedano Eva
O'Connor Noel E.
Salvador Amaia
Smeaton Alan F.
Ventura Carles
Zhang Zhenxing
Publication date: 01/01/2014

Insight-DCU participated in the instance search (INS) and semantic indexing (SIN) tasks in 2014. Two very different approaches were submitted for instance search: one based on features extracted using pre-trained deep convolutional neural networks (CNNs), and another based on local SIFT features, large-vocabulary visual bag-of-words aggregation, inverted index-based lookup, and geometric verification on the top-N retrieved results. Two interactive runs and two automatic runs were submitted; the best interactive run achieved a mAP of 0.135 and the best automatic run 0.12. Our semantic indexing runs were also based on convolutional neural network features, and on Support Vector Machine classifiers with linear and RBF kernels. One run was submitted to the main task, two to the no-annotation task, and one to the progress task. Data for the no-annotation task was gathered from Google Images and ImageNet. The main task run achieved a mAP of 0.086; the best no-annotation run came close to the main run with a mAP of 0.080, while the progress run achieved 0.043.
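
The mAP figures quoted above are mean average precision scores. As a reminder of what the metric measures, here is a minimal sketch of the textbook definition over ranked result lists with binary relevance labels. It is not the official TRECVid evaluation procedure, which may differ in detail; the function names and the `num_relevant` parameter are illustrative assumptions.

```python
def average_precision(ranked_relevance, num_relevant=None):
    """Average precision for one query.

    ranked_relevance: booleans in rank order, True for a relevant result.
    num_relevant: total relevant items for the query; defaults to the
    number of relevant items present in the ranked list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denom = num_relevant if num_relevant is not None else hits
    return precision_sum / denom if denom else 0.0

def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Two hypothetical queries with three and two ranked results.
score = mean_average_precision([[True, False, True], [False, True]])
```

A score of 1.0 means every relevant item is ranked ahead of every irrelevant one for all queries.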

CiteSeerX

UPCommons. Portal del coneixement obert de la UPC

Irish Universities

DCU Online Research Access Service

Approaches to better context modeling and categorization

Author: Madsen Rasmus Elsborg
Publication venue: Technical University of Denmark
Publication date: 01/03/2006

Online Research Database In Technology

Correlation-Based Burstiness for Logo Retrieval

Author: Douze Matthijs
Revaud Jérôme
Schmid Cordelia
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2012

Detecting logos in photos is challenging. A reason is that logos locally resemble patterns frequently seen in random images. We propose to learn a statistical model for the distribution of incorrect detections output by an image matching algorithm. It results in a novel scoring criterion in which the weight of correlated keypoint matches is reduced, penalizing irrelevant logo detections. In experiments on two very different logo retrieval benchmarks, our approach largely improves over the standard matching criterion as well as other state-of-the-art approaches.

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Semantic Visual Localization

Author: Geiger Andreas
Pollefeys Marc
Sattler Torsten
Schönberger Johannes L.
Publication date: 01/01/2018

Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes.

arXiv.org e-Print Archive

MPG.PuRe