Iterative Object and Part Transfer for Fine-Grained Recognition
The aim of fine-grained recognition is to identify sub-ordinate categories in
images like different species of birds. Existing works have confirmed that, in
order to capture the subtle differences across the categories, automatic
localization of objects and parts is critical. Most approaches for object and
part localization relied on the bottom-up pipeline, where thousands of region
proposals are generated and then filtered by pre-trained object/part models.
This is computationally expensive and not scalable once the number of
objects/parts becomes large. In this paper, we propose a nonparametric
data-driven method for object and part localization. Given an unlabeled test
image, our approach transfers annotations from a few similar images retrieved
in the training set. In particular, we propose an iterative transfer strategy
that gradually refines the predicted bounding boxes. Based on the located
objects and parts, deep convolutional features are extracted for recognition.
We evaluate our approach on the widely-used CUB200-2011 dataset and a new and
large dataset called Birdsnap. On both datasets, we achieve better results than
many state-of-the-art approaches, including a few using oracle (manually
annotated) bounding boxes in the test images. Comment: To appear in ICME 2017 as an oral pape
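The retrieve-and-transfer idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean nearest-neighbour retrieval, the box averaging, the normalized `(x0, y0, x1, y1)` box format, and the `describe` callable (a stand-in for a feature extractor applied to the image cropped to the current box) are all assumptions made for illustration.

```python
import math

def nearest(query, train_feats, k):
    """Indices of the k training features closest to `query` (Euclidean distance)."""
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(query, train_feats[i]))
    return order[:k]

def transfer_box(query, train_feats, train_boxes, k=3):
    """Transfer a box to the query by averaging the boxes of its k neighbours.

    Averaging is an assumption; the abstract only says annotations are
    transferred from a few similar training images.
    """
    idx = nearest(query, train_feats, k)
    return tuple(sum(train_boxes[i][c] for i in idx) / len(idx) for c in range(4))

def iterative_transfer(describe, train_feats, train_boxes, k=3, steps=3):
    """Repeat retrieval and transfer, re-describing the image inside the current box.

    `describe(box)` is a hypothetical callable returning a feature vector
    for the test image cropped to `box`.
    """
    box = (0.0, 0.0, 1.0, 1.0)  # start from the whole image
    for _ in range(steps):
        box = transfer_box(describe(box), train_feats, train_boxes, k)
    return box
```

Each pass retrieves neighbours using features from the current estimate, so the predicted box can tighten gradually, which is the behaviour the abstract attributes to its iterative strategy.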
The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification
Fine-grained classification is challenging because categories can only be
discriminated by subtle and local differences. Variances in the pose, scale or
rotation usually make the problem more difficult. Most fine-grained
classification systems follow the pipeline of finding the foreground object or
object parts (where) to extract discriminative features (what).
In this paper, we propose to apply visual attention to the fine-grained
classification task using a deep neural network. Our pipeline integrates three
types of attention: the bottom-up attention that proposes candidate patches, the
object-level top-down attention that selects relevant patches to a certain
object, and the part-level top-down attention that localizes discriminative
parts. We combine these attentions to train domain-specific deep nets, then use
them to improve both the what and where aspects. Importantly, we avoid using
expensive annotations such as bounding boxes or part information from end to end.
The weak supervision constraint makes our work easier to generalize.
We have verified the effectiveness of the method on subsets of the ILSVRC2012
dataset and on the CUB200_2011 dataset. Our pipeline delivered significant
improvements and achieved the best accuracy under the weakest supervision
condition. The performance is competitive against other methods that rely on
additional annotations.
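The patch-selection flow described in the abstract above (bottom-up proposals, then object-level filtering, then part-level selection) can be sketched as follows. This is only an illustration under stated assumptions: patches are `(x, y, w, h)` tuples, `object_score` and `part_score` are hypothetical callables standing in for the trained networks, and the threshold and top-n rules are not taken from the paper.

```python
def select_patches(patches, object_score, threshold=0.5):
    """Object-level top-down step: keep candidate patches scored as relevant.

    `object_score(patch)` is a hypothetical relevance score in [0, 1].
    """
    return [p for p in patches if object_score(p) >= threshold]

def top_parts(patches, part_score, n=2):
    """Part-level top-down step: keep the n patches with the highest part score.

    `part_score(patch)` is a hypothetical discriminativeness score.
    """
    return sorted(patches, key=part_score, reverse=True)[:n]

def two_level_attention(candidates, object_score, part_score, threshold=0.5, n=2):
    """Chain both levels over bottom-up candidate patches."""
    relevant = select_patches(candidates, object_score, threshold)
    return relevant, top_parts(relevant, part_score, n)
```

No bounding-box or part annotations appear anywhere in the sketch, mirroring the weak-supervision setting the abstract emphasizes.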
Dual Skipping Networks
Inspired by the recent neuroscience studies on the left-right asymmetry of
the human brain in processing low and high spatial frequency information, this
paper introduces a dual skipping network which carries out coarse-to-fine
object categorization. Such a network has two branches to simultaneously deal
with both coarse and fine-grained classification tasks. Specifically, we
propose a layer-skipping mechanism that learns a gating network to predict
which layers to skip in the testing stage. This layer-skipping mechanism endows
the network with good flexibility and capability in practice. Evaluations are
conducted on several widely used coarse-to-fine object categorization
benchmarks, and promising results are achieved by our proposed network model. Comment: CVPR 2018 (poster); fix typ
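The layer-skipping mechanism mentioned in the abstract above can be illustrated with a small sketch. It assumes residual-style layers, one gate per layer, and a hard threshold at test time; none of these details are stated in the abstract, and scalars are used in place of feature maps to keep the example self-contained.

```python
def run_with_skipping(x, layers, gates, threshold=0.5):
    """Apply residual layers in order, skipping those whose gate is below threshold.

    `layers` and `gates` are hypothetical callables: `layer(x)` returns the
    residual update and `gate(x)` returns an execution score in [0, 1].
    Returns the final value and the indices of the layers that were executed.
    """
    executed = []
    for i, (layer, gate) in enumerate(zip(layers, gates)):
        if gate(x) >= threshold:
            x = x + layer(x)  # residual update when the layer runs
            executed.append(i)
        # otherwise the input passes through unchanged (identity shortcut)
    return x, executed
```

Because a skipped layer reduces to the identity shortcut, the amount of computation can vary per input, which is one way to read the flexibility the abstract claims.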
Fine-grained Image Classification by Exploring Bipartite-Graph Labels
Given a food image, can a fine-grained object recognition engine tell "which
restaurant which dish" the food belongs to? Such ultra-fine grained image
recognition is the key for many applications like search by images, but it is
very challenging because it needs to discern subtle differences between classes
while dealing with the scarcity of training data. Fortunately, the ultra-fine
granularity naturally brings rich relationships among object classes. This
paper proposes a novel approach to exploit the rich relationships through
bipartite-graph labels (BGL). We show how to model BGL in an overall
convolutional neural network, and that the resulting system can be optimized through
back-propagation. We also show that it is computationally efficient in
inference thanks to the bipartite structure. To facilitate the study, we
construct a new food benchmark dataset, which consists of 37,885 food images
collected from 6 restaurants and 975 menus in total. Experimental results on
this new food dataset and three other datasets demonstrate that BGL advances
previous work in fine-grained object recognition. An online demo is available at
http://www.f-zhou.com/fg_demo/
Fine-Grained Image Analysis with Deep Learning: A Survey
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem
in computer vision and pattern recognition, and underpins a diverse set of
real-world applications. The task of FGIA targets analyzing visual objects from
subordinate categories, e.g., species of birds or models of cars. The small
inter-class and large intra-class variation inherent to fine-grained image
analysis makes it a challenging problem. Capitalizing on advances in deep
learning, recent years have witnessed remarkable progress in
deep-learning-powered FGIA. In this paper we present a systematic survey of these
advances, where we attempt to re-define and broaden the field of FGIA by
consolidating two fundamental fine-grained research areas -- fine-grained image
recognition and fine-grained image retrieval. In addition, we also review other
key issues of FGIA, such as publicly available benchmark datasets and related
domain-specific applications. We conclude by highlighting several research
directions and open problems which need further exploration from the community. Comment: Accepted by IEEE TPAM