689 research outputs found

    Iterative Object and Part Transfer for Fine-Grained Recognition

    Full text link
    The aim of fine-grained recognition is to identify sub-ordinate categories in images like different species of birds. Existing works have confirmed that, in order to capture the subtle differences across the categories, automatic localization of objects and parts is critical. Most approaches for object and part localization relied on the bottom-up pipeline, where thousands of region proposals are generated and then filtered by pre-trained object/part models. This is computationally expensive and not scalable once the number of objects/parts becomes large. In this paper, we propose a nonparametric data-driven method for object and part localization. Given an unlabeled test image, our approach transfers annotations from a few similar images retrieved in the training set. In particular, we propose an iterative transfer strategy that gradually refine the predicted bounding boxes. Based on the located objects and parts, deep convolutional features are extracted for recognition. We evaluate our approach on the widely-used CUB200-2011 dataset and a new and large dataset called Birdsnap. On both datasets, we achieve better results than many state-of-the-art approaches, including a few using oracle (manually annotated) bounding boxes in the test images.Comment: To appear in ICME 2017 as an oral pape

    Multi-scale Deep Learning Architectures for Person Re-identification

    Full text link
    Person Re-identification (re-id) aims to match people across non-overlapping camera views in a public space. It is a challenging problem because many people captured in surveillance videos wear similar clothes. Consequently, the differences in their appearance are often subtle and only detectable at the right location and scales. Existing re-id models, particularly the recently proposed deep learning based ones match people at a single scale. In contrast, in this paper, a novel multi-scale deep learning model is proposed. Our model is able to learn deep discriminative feature representations at different scales and automatically determine the most suitable scales for matching. The importance of different spatial locations for extracting discriminative features is also learned explicitly. Experiments are carried out to demonstrate that the proposed model outperforms the state-of-the art on a number of benchmarksComment: 9 pages, 3 figures, accepted by ICCV 201
    • …
    corecore