REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval
This paper addresses the problem of very large-scale image retrieval,
focusing on improving its accuracy and robustness. We target enhanced
robustness of search to factors such as variations in illumination, object
appearance and scale, partial occlusions, and cluttered backgrounds -
particularly important when search is performed across very large datasets with
significant variability. We propose a novel CNN-based global descriptor, called
REMAP, which learns and aggregates a hierarchy of deep features from multiple
CNN layers, and is trained end-to-end with a triplet loss. REMAP explicitly
learns discriminative features which are mutually-supportive and complementary
at various semantic levels of visual abstraction. These dense local features
are max-pooled spatially at each layer, within multi-scale overlapping regions,
before aggregation into a single image-level descriptor. To identify the
semantically useful regions and layers for retrieval, we propose to measure the
information gain of each region and layer using KL-divergence. Our system
effectively learns during training how useful various regions and layers are
and weights them accordingly. We show that such relative entropy-guided
aggregation outperforms classical CNN-based aggregation controlled by SGD. The
entire framework is trained in an end-to-end fashion, outperforming the latest
state-of-the-art results. On image retrieval datasets Holidays, Oxford and
MPEG, the REMAP descriptor achieves mAP of 95.5%, 91.5%, and 80.1%
respectively, outperforming any results published to date. REMAP also formed
the core of the winning submission to the Google Landmark Retrieval Challenge
on Kaggle.
Comment: Submitted to IEEE Trans. Image Processing on 24 May 2018, published
22 May 201
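The entropy-guided weighting described in the REMAP abstract can be illustrated with a minimal sketch. The abstract states only that KL-divergence measures the information gain of each region and layer. Which distributions are compared is an assumption here: the sketch compares histograms of descriptor distances for matching and non-matching image pairs. The names kl_divergence and region_weights are hypothetical and do not come from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as unnormalised histograms."""
    p_sum, q_sum = sum(p), sum(q)
    total = 0.0
    for p_count, q_count in zip(p, q):
        p_i, q_i = p_count / p_sum, q_count / q_sum
        if p_i > 0:  # bins with zero mass in p contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


def region_weights(match_hists, nonmatch_hists):
    """Hypothetical weighting: one KL score per region, normalised to sum to 1.

    ASSUMPTION: each region is described by two distance histograms, one for
    matching pairs and one for non-matching pairs. A region whose histograms
    differ more is treated as more informative.
    """
    scores = [kl_divergence(m, n) for m, n in zip(match_hists, nonmatch_hists)]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


# Region 0 separates matches from non-matches well; region 1 barely does.
weights = region_weights([[8, 1, 1], [4, 3, 3]], [[1, 1, 8], [3, 3, 4]])
```

In this sketch the weights are fixed by the histograms. The abstract says the system learns during training how useful the regions and layers are, so scores of this kind would at most play the role of a starting point or guide.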
On aggregation of local binary descriptors
This paper addresses the problem of aggregating local binary descriptors for large-scale image retrieval in mobile scenarios. Binary descriptors are becoming increasingly popular, especially in mobile applications, as they deliver high matching speed, have a small memory footprint and are fast to extract. However, little research has been done on how to efficiently aggregate binary descriptors. Direct application of methods developed for conventional descriptors, such as SIFT, results in unsatisfactory performance. In this paper we introduce and evaluate several algorithms to compress high-dimensional binary local descriptors for efficient retrieval in large databases. In addition, we propose a robust global image representation, the Binary Robust Visual Descriptor (B-RVD), with rank-based multi-assignment of local descriptors and direction-based aggregation, achieved by applying the L1-norm to residual vectors. The performance of the B-RVD is further improved by balancing the variances of residual vector directions in order to maximize the discriminatory power of the aggregated vectors. Standard datasets and measures have been used for evaluation, showing a significant improvement of around 4% in mean Average Precision compared to the state of the art.
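The two ingredients named in the B-RVD abstract, rank-based multi-assignment and direction-based aggregation via L1-normalised residuals, can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the codebook, the number of assigned centroids, and the rank_weights values are all hypothetical, and the variance-balancing step from the abstract is omitted.

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary vectors."""
    return sum(x != y for x, y in zip(a, b))


def l1_normalise(v):
    """Scale a vector to unit L1 norm, keeping only its direction."""
    norm = sum(abs(x) for x in v)
    return [x / norm for x in v] if norm else list(v)


def aggregate(descriptors, centroids, rank_weights=(1.0, 0.5)):
    """Aggregate binary descriptors into one residual vector per centroid.

    ASSUMPTION: each descriptor is assigned to its len(rank_weights) nearest
    centroids, and its contribution is scaled by a weight that depends only
    on the rank (1st nearest, 2nd nearest, ...). The weights are illustrative.
    """
    dim = len(centroids[0])
    agg = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Rank centroids by Hamming distance; ties keep the lower index first.
        order = sorted(range(len(centroids)), key=lambda k: hamming(d, centroids[k]))
        for rank, k in enumerate(order[:len(rank_weights)]):
            residual = [x - c for x, c in zip(d, centroids[k])]
            direction = l1_normalise(residual)
            for i, r in enumerate(direction):
                agg[k][i] += rank_weights[rank] * r
    return agg


centroids = [[0, 0, 0, 0], [1, 1, 1, 1]]
agg = aggregate([[1, 0, 0, 0]], centroids)
```

Normalising each residual before summing means a descriptor far from its centroid contributes no more than a close one, so the aggregate reflects residual directions and not their magnitudes.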
Single-cell Subcellular Protein Localisation Using Novel Ensembles of Diverse Deep Architectures
Unravelling protein distributions within individual cells is key to
understanding their function and state and indispensable to developing new
treatments. Here we present the Hybrid subCellular Protein Localiser (HCPL),
which learns from weakly labelled data to robustly localise single-cell
subcellular protein patterns. It comprises innovative DNN architectures
exploiting wavelet filters and learnt parametric activations that successfully
tackle drastic cell variability. HCPL features correlation-based ensembling of
novel architectures that boosts performance and aids generalisation.
Large-scale data annotation is made feasible by our "AI-trains-AI" approach,
which determines the visual integrity of cells and emphasises reliable labels
for efficient training. In the Human Protein Atlas context, we demonstrate that
HCPL defines the state of the art in the single-cell classification of protein
localisation patterns. To better understand the inner workings of HCPL and
assess its biological relevance, we analyse the contributions of each system
component and dissect the emergent features from which the localisation
predictions are derived.