
    Iterative Object and Part Transfer for Fine-Grained Recognition

    The aim of fine-grained recognition is to identify subordinate categories in images, such as different species of birds. Existing works have confirmed that, in order to capture the subtle differences across the categories, automatic localization of objects and parts is critical. Most approaches to object and part localization rely on a bottom-up pipeline, in which thousands of region proposals are generated and then filtered by pre-trained object/part models. This is computationally expensive and does not scale once the number of objects/parts becomes large. In this paper, we propose a nonparametric data-driven method for object and part localization. Given an unlabeled test image, our approach transfers annotations from a few similar images retrieved from the training set. In particular, we propose an iterative transfer strategy that gradually refines the predicted bounding boxes. Based on the located objects and parts, deep convolutional features are extracted for recognition. We evaluate our approach on the widely used CUB200-2011 dataset and a new, large dataset called Birdsnap. On both datasets, we achieve better results than many state-of-the-art approaches, including a few that use oracle (manually annotated) bounding boxes in the test images. Comment: To appear in ICME 2017 as an oral paper

    Learning Fashion Compatibility with Bidirectional LSTMs

    The ubiquity of online fashion shopping demands effective recommendation services for customers. In this paper, we study two types of fashion recommendation: (i) suggesting an item that matches existing components in a set to form a stylish outfit (a collection of fashion items), and (ii) generating an outfit with multimodal (images/text) specifications from a user. To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion. More specifically, we consider a fashion outfit to be a sequence (usually from top to bottom and then accessories) and each item in the outfit as a time step. Given the fashion items in an outfit, we train a bidirectional LSTM (Bi-LSTM) model to sequentially predict the next item conditioned on previous ones to learn their compatibility relationships. Further, we learn a visual-semantic space by regressing image features to their semantic representations, aiming to inject attribute and category information as a regularization for training the LSTM. The trained network can not only perform the aforementioned recommendations effectively but also predict the compatibility of a given outfit. We conduct extensive experiments on our newly collected Polyvore dataset, and the results provide strong qualitative and quantitative evidence that our framework outperforms alternative methods. Comment: ACM MM 1
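
    The sequence framing in the abstract above (an outfit as an ordered list, each item a time step, next-item prediction in both directions) can be illustrated with a minimal sketch. This is not the authors' code: the function name `next_item_pairs` and the string item labels are hypothetical, and the actual model operates on learned image features rather than labels. The sketch only shows how forward and backward (context, target) training pairs could be derived from one outfit.

```python
from typing import List, Tuple

Pair = Tuple[List[str], str]  # (items seen so far, item to predict next)

def next_item_pairs(outfit: List[str]) -> Tuple[List[Pair], List[Pair]]:
    """Build next-item prediction pairs in both reading directions.

    Hypothetical helper for illustration only: a bidirectional model
    sees the outfit top-to-bottom and bottom-to-top, so each direction
    yields its own (context, target) pairs.
    """
    # Forward direction: predict item t from items 0..t-1.
    forward = [(outfit[:t], outfit[t]) for t in range(1, len(outfit))]
    # Backward direction: same construction on the reversed sequence.
    reversed_outfit = outfit[::-1]
    backward = [(reversed_outfit[:t], reversed_outfit[t])
                for t in range(1, len(reversed_outfit))]
    return forward, backward

# Assumed example outfit, ordered top to bottom and then accessories.
outfit = ["top", "bottom", "shoes", "bag"]
forward, backward = next_item_pairs(outfit)
```

    An outfit of n items yields n - 1 pairs per direction; in this framing, a compatibility score for a whole outfit would come from how well each item is predicted from its context in both directions.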

    Multi-scale Deep Learning Architectures for Person Re-identification

    Person re-identification (re-id) aims to match people across non-overlapping camera views in a public space. It is a challenging problem because many people captured in surveillance videos wear similar clothes. Consequently, the differences in their appearance are often subtle and only detectable at the right locations and scales. Existing re-id models, particularly the recently proposed deep-learning-based ones, match people at a single scale. In contrast, in this paper, a novel multi-scale deep learning model is proposed. Our model is able to learn deep discriminative feature representations at different scales and automatically determine the most suitable scales for matching. The importance of different spatial locations for extracting discriminative features is also learned explicitly. Experiments are carried out to demonstrate that the proposed model outperforms the state of the art on a number of benchmarks. Comment: 9 pages, 3 figures, accepted by ICCV 201

    Evolutionary hypergame dynamics

    A common assumption employed in most previous works on evolutionary game dynamics is that every individual player has full knowledge about and full access to the complete set of available strategies. In realistic social, economic, and political systems, diversity in the knowledge, experience, and background among the individuals can be expected. Games in which the players do not have an identical strategy set are hypergames. Studies of hypergame dynamics have been scarce, especially those on networks. We investigate evolutionary hypergame dynamics on regular lattices using a prototypical model of three available strategies, in which the strategy set of each player contains two of the three strategies. Our computations reveal that more complex dynamical phases emerge from the system than from traditional evolutionary game dynamics with full knowledge of the complete set of available strategies. These phases include single-strategy absorption phases, a cyclic competition ("rock-paper-scissors") type of phase, and an uncertain phase in which the dominant strategy adopted by the population is unpredictable. Exploiting the pair interaction and mean field approximations, we obtain a qualitative understanding of the emergence of the single-strategy and uncertain phases. We find the striking phenomenon of strategy revival associated with the cyclic competition phase and provide a qualitative explanation. Our work demonstrates that the diversity in the individuals' strategy sets can play an important role in the evolution of strategy distribution in the system. From the point of view of control, the emergence of the complex phases offers the possibility of harnessing evolutionary game dynamics through small changes in individuals' probability of strategy adoption. Comment: 11 pages, 10 figures