Multi-scale Orderless Pooling of Deep Convolutional Activation Features
Deep convolutional neural networks (CNN) have shown their promise as a
universal representation for recognition. However, global CNN activations lack
geometric invariance, which limits their robustness for classification and
matching of highly variable scenes. To improve the invariance of CNN
activations without degrading their discriminative power, this paper presents a
simple but effective scheme called multi-scale orderless pooling (MOP-CNN).
This scheme extracts CNN activations for local patches at multiple scale
levels, performs orderless VLAD pooling of these activations at each level
separately, and concatenates the result. The resulting MOP-CNN representation
can be used as a generic feature for either supervised or unsupervised
recognition tasks, from image classification to instance-level retrieval; it
consistently outperforms global CNN activations without requiring any joint
training of prediction layers for a particular target dataset. In absolute
terms, it achieves state-of-the-art results on the challenging SUN397 and MIT
Indoor Scenes classification datasets, and competitive results on
ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
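The pooling step described in this abstract can be illustrated with a minimal sketch. It assumes the local patch activations and a codebook of centers per scale level are already available as plain lists of floats; the function names `vlad` and `mop_pool` are hypothetical, and details such as codebook learning, dimensionality reduction, and the exact normalization used in the paper are not reproduced here.

```python
import math

def vlad(descriptors, centers):
    """Orderless VLAD pooling: sum residuals to the nearest center, then L2-normalize."""
    dim = len(centers[0])
    acc = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest codebook center.
        k = min(range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(d, centers[i])))
        # Accumulate the residual (descriptor minus center) for that center.
        for j in range(dim):
            acc[k][j] += d[j] - centers[k][j]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

def mop_pool(levels, codebooks):
    """Pool each scale level separately and concatenate the results."""
    out = []
    for descs, centers in zip(levels, codebooks):
        out.extend(vlad(descs, centers))
    return out
```

Because the residuals are summed, the pooled vector does not depend on the order or position of the patches within a level, which is the sense in which the pooling is orderless.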
Efficient On-the-fly Category Retrieval using ConvNets and GPUs
We investigate the gains in precision and speed that can be obtained by
using Convolutional Networks (ConvNets) for on-the-fly retrieval - where
classifiers are learnt at run time for a textual query from downloaded images,
and used to rank large image or video datasets.
We make three contributions: (i) we present an evaluation of state-of-the-art
image representations for object category retrieval over standard benchmark
datasets containing 1M+ images; (ii) we show that ConvNets can be used to
obtain features which are incredibly performant, and yet much lower dimensional
than previous state-of-the-art image representations, and that their
dimensionality can be reduced further without loss in performance by
compression using product quantization or binarization. Consequently, features
with the state-of-the-art performance on large-scale datasets of millions of
images can fit in the memory of even a commodity GPU card; (iii) we show that
an SVM classifier can be learnt within a ConvNet framework on a GPU in parallel
with downloading the new training images, allowing for a continuous refinement
of the model as more images become available, and simultaneous training and
ranking. The outcome is an on-the-fly system that significantly outperforms its
predecessors in terms of: precision of retrieval, memory requirements, and
speed, facilitating accurate on-the-fly learning and ranking in under a second
on a single GPU.
Comment: Published in proceedings of ACCV 201
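To make the binarization idea in contribution (ii) concrete, here is a rough sketch under stated assumptions. It uses simple sign thresholding and Hamming distance, which may differ from the scheme evaluated in the paper, and it ranks by distance to a single query vector instead of by a learnt SVM score. The names `binarize`, `hamming`, and `rank` are hypothetical.

```python
def binarize(vec):
    """Pack the sign pattern of a real-valued feature into an int (1 bit per dimension)."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")

def rank(query, database):
    """Return database indices sorted by Hamming distance to the query code."""
    q = binarize(query)
    codes = [binarize(v) for v in database]
    return sorted(range(len(database)), key=lambda i: hamming(q, codes[i]))
```

At one bit per dimension, a feature with D dimensions occupies D/8 bytes, which is what makes it plausible to hold codes for millions of images in GPU memory.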
PlaNet - Photo Geolocation with Convolutional Neural Networks
Is it possible to build a system to determine the location where a photo was
taken using just its pixels? In general, the problem seems exceptionally
difficult: it is trivial to construct situations where no location can be
inferred. Yet images often contain informative cues such as landmarks, weather
patterns, vegetation, road markings, and architectural details, which in
combination may allow one to determine an approximate location and occasionally
an exact location. Websites such as GeoGuessr and View from your Window suggest
that humans are relatively good at integrating these cues to geolocate images,
especially en masse. In computer vision, the photo geolocation problem is
usually approached using image retrieval methods. In contrast, we pose the
problem as one of classification by subdividing the surface of the earth into
thousands of multi-scale geographic cells, and train a deep network using
millions of geotagged images. While previous approaches only recognize
landmarks or perform approximate matching using global image descriptors, our
model is able to use and integrate multiple visible cues. We show that the
resulting model, called PlaNet, outperforms previous approaches and even
attains superhuman levels of accuracy in some cases. Moreover, we extend our
model to photo albums by combining it with a long short-term memory (LSTM)
architecture. By learning to exploit temporal coherence to geolocate uncertain
photos, we demonstrate that this model achieves a 50% performance improvement
over the single-image model.
Fatigue modelling for gas nitriding
The present study aims to develop an algorithm able to predict the fatigue lifetime of nitrided steels. Linear multi-axial fatigue criteria are used to take into account the gradients of mechanical properties provided by the nitriding process. Simulations on rotating bending fatigue specimens are made in order to test the nitrided surfaces. The fatigue model is applied to the cyclic loading of a gear from a simulation using the finite element software Ansys. Results show the positive contributions of nitriding on the fatigue strength. 
Compact Deep Aggregation for Set Retrieval
The objective of this work is to learn a compact embedding of a set of
descriptors that is suitable for efficient retrieval and ranking, whilst
maintaining discriminability of the individual descriptors. We focus on a
specific example of this general problem -- that of retrieving images
containing multiple faces from a large scale dataset of images. Here the set
consists of the face descriptors in each image, and given a query for multiple
identities, the goal is then to retrieve, in order, images which contain all
the identities, all but one, etc.
To this end, we make the following contributions: first, we propose a CNN
architecture -- SetNet -- to achieve the objective: it learns face
descriptors and their aggregation over a set to produce a compact fixed length
descriptor designed for set retrieval, and the score of an image is a count of
the number of identities that match the query; second, we show that this
compact descriptor has minimal loss of discriminability up to two faces per
image, and degrades slowly after that -- far exceeding a number of baselines;
third, we explore the speed vs. retrieval quality trade-off for set retrieval
using this compact descriptor; and, finally, we collect and annotate a large
dataset of images containing varying numbers of celebrities, which we use for
evaluation and release publicly.
Comment: 20 page
A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
Recent advances in deep learning have made it possible to improve
performance in image retrieval tasks. Through the many convolutional layers,
available in a Convolutional Neural Network (CNN), it is possible to obtain a
hierarchy of features from the evaluated image. At every step, the patches
extracted are smaller than those of the previous levels and more representative.
Following this idea, this paper introduces a new detector applied to the
feature maps extracted from a pre-trained CNN. Specifically, this approach
increases the number of features in order to improve the performance of
aggregation algorithms such as the widely used VLAD embedding. The
proposed approach is tested on different public datasets: Holidays, Oxford5k,
Paris6k and UKB.
The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search
This paper reconsiders common benchmarking approaches to nearest neighbor
search. It is shown that the concept of local intrinsic dimensionality (LID)
allows one to choose query sets of a wide range of difficulty for real-world
datasets. Moreover, the effect of different LID distributions on the running
time performance of implementations is empirically studied. To this end,
different visualization concepts are introduced that give a more
fine-grained overview of the inner workings of nearest neighbor search
principles. The paper closes with remarks about the diversity of datasets
commonly used for nearest neighbor search benchmarking. It is shown that such
real-world datasets are not diverse: results on a single dataset predict
results on all other datasets well.
Comment: Preprint of the paper accepted at SISAP 201
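The abstract does not say how LID is estimated. As an illustration only, the sketch below uses the common maximum-likelihood estimator computed from the distances to a query's k nearest neighbors, LID = -k / sum(ln(r_i / r_k)), where r_k is the largest of the k distances. The function name `lid_mle` is hypothetical, and whether the paper uses this estimator is an assumption.

```python
import math

def lid_mle(distances):
    """Maximum-likelihood LID estimate from the distances to the k nearest neighbors.

    Assumes every distance is strictly positive.
    """
    r = sorted(distances)
    r_k = r[-1]  # distance to the k-th (farthest) neighbor
    s = sum(math.log(d / r_k) for d in r)
    if s == 0:
        # All neighbors are equidistant; the estimate is unbounded.
        return float("inf")
    return -len(r) / s
```

A query whose neighbors all lie at nearly the same distance gets a high estimate, which matches the intuition that such queries are harder for nearest neighbor search.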
Re-ranking for Writer Identification and Writer Retrieval
Automatic writer identification is a common problem in document analysis.
State-of-the-art methods typically focus on the feature extraction step with
traditional or deep-learning-based techniques. In retrieval problems,
re-ranking is a commonly used technique to improve the results. Re-ranking
refines an initial ranking result by using the knowledge contained in the
ranked result, e.g., by exploiting nearest neighbor relations. To the best of
our knowledge, re-ranking has not been used for writer
identification/retrieval. A possible reason might be that publicly available
benchmark datasets contain only a few samples per writer, which makes re-ranking
less promising. We show that a re-ranking step based on k-reciprocal nearest
neighbor relationships is advantageous for writer identification, even if only
a few samples per writer are available. We use these reciprocal relationships
in two ways: encode them into new vectors, as originally proposed, or integrate
them in terms of query expansion. We show that both techniques outperform the
baseline results in terms of mAP on three writer identification datasets.
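The k-reciprocal relationship that this re-ranking builds on can be sketched as follows. This is a minimal illustration that assumes a precomputed pairwise distance matrix `dist`; the function names are hypothetical, and neither the vector encoding nor the query-expansion step from the paper is shown.

```python
def knn(dist, i, k):
    """Indices of the k nearest neighbors of item i (excluding i itself)."""
    order = sorted((j for j in range(len(dist)) if j != i),
                   key=lambda j: dist[i][j])
    return order[:k]

def k_reciprocal(dist, i, k):
    """Neighbors j of i such that i is also among the k nearest neighbors of j."""
    return [j for j in knn(dist, i, k) if i in knn(dist, j, k)]
```

For four samples at positions 0, 1, 3 and 10 on a line with k = 1, samples 0 and 1 are reciprocal neighbors of each other, while the outlier at 10 has no reciprocal neighbor even though it has a nearest neighbor.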
Cr cluster characterization in Cu-Cr-Zr alloy after ECAP processing and aging using SANS and HAADF-STEM
The precipitation of nano-sized Cr clusters was investigated in a commercial Cu-1Cr-0.1Zr (wt.%) alloy processed by Equal-Channel Angular Pressing (ECAP) and subsequent aging at 550 °C for 4 hours, using small angle neutron scattering (SANS) measurements and high-angle annular dark-field scanning transmission electron microscopy (HAADF-STEM). The size and volume fraction of nano-sized Cr clusters were estimated using both techniques. These parameters assessed from SANS (d ~ 3.2 nm, Fv ~ 1.1%) agreed reasonably with those from HAADF-STEM (d ~ 2.5 nm, Fv ~ 2.3%). Besides nano-sized Cr clusters, the HAADF-STEM technique revealed the presence of rare cuboid and spheroid sub-micronic Cr particles with a mean size of about 380-620 nm. Neither technique revealed the presence of intermetallic CuxZry phases under these aging conditions.