
    Joint Learning of Body and Part Representation for Person Re-Identification

    © 2013 IEEE. Person re-identification (ReID), which aims to identify people across multiple camera views, has attracted increasing attention because of its potential applications in surveillance security. Large variations in subjects' postures, view angles, and illumination conditions, as well as non-ideal human detection, significantly increase the difficulty of person ReID. Learning a robust metric for measuring the similarity between different person images is another under-addressed problem. In this paper, following the recent success of part-based models, we first propose to learn global and weighted local body-part features from pedestrian images in order to generate a discriminative and robust feature representation. Then, in the training phase, angular loss and part-level classification loss are employed jointly as a similarity measure to train the network, which significantly improves the robustness of the resulting network against feature variance. Experimental results on several benchmark data sets demonstrate that our method outperforms the state-of-the-art methods.
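The joint objective described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: the angular term uses one commonly cited triplet form of angular loss, the part-level term is an average of softmax cross-entropies over hypothetical per-part classifier outputs, and the names `angular_loss`, `joint_loss`, `alpha_deg`, and `weight` are illustrative only.

```python
import math

def cross_entropy(logits, label):
    # Softmax cross-entropy for one sample, computed stably.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def sq_dist(u, v):
    # Squared Euclidean distance between two feature vectors.
    return sum((a - b) ** 2 for a, b in zip(u, v))

def angular_loss(anchor, positive, negative, alpha_deg=45.0):
    # Assumed triplet form: [ |a - p|^2 - 4 tan^2(alpha) |n - c|^2 ]_+
    # where c is the midpoint of anchor and positive.
    center = [(a + p) / 2.0 for a, p in zip(anchor, positive)]
    tan_sq = math.tan(math.radians(alpha_deg)) ** 2
    return max(0.0, sq_dist(anchor, positive) - 4.0 * tan_sq * sq_dist(negative, center))

def joint_loss(anchor, positive, negative, part_logits, label, weight=1.0):
    # Hypothetical combination: angular term plus weighted mean of
    # per-part classification losses (one logit vector per body part).
    part_term = sum(cross_entropy(l, label) for l in part_logits) / len(part_logits)
    return angular_loss(anchor, positive, negative) + weight * part_term

# Toy example: 2-D embeddings and two body parts with three identity classes.
loss = joint_loss([0.0, 0.0], [1.0, 0.0], [0.5, 2.0],
                  part_logits=[[2.0, 0.1, -1.0], [1.5, 0.3, 0.0]], label=0)
```

The relative weighting of the two terms is a free choice in this sketch; the abstract states only that both losses are used jointly.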

    Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search

    Text-based person search aims to retrieve the corresponding person images from an image database given a sentence describing the person, which has great potential for applications such as video surveillance. Extracting the visual contents that correspond to the human description is the key to this cross-modal matching problem. Moreover, correlated images and descriptions involve different granularities of semantic relevance, which previous methods usually ignore. To exploit the multilevel corresponding visual contents, we propose a pose-guided multi-granularity attention network (PMA). First, we propose a coarse alignment network (CA) that selects the image regions related to the global description through similarity-based attention. To further capture the phrase-related visual body part, a fine-grained alignment network (FA) is proposed, which employs pose information to learn the latent semantic alignment between visual body parts and textual noun phrases. To verify the effectiveness of our model, we perform extensive experiments on the CUHK Person Description Dataset (CUHK-PEDES), which is currently the only available dataset for text-based person search. Experimental results show that our approach outperforms the state-of-the-art methods by 15% in terms of the top-1 metric. Comment: published in AAAI 2020 (oral)
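The similarity-based attention mentioned for the coarse alignment step can be sketched as follows. This is only an illustration under stated assumptions, not the paper's architecture: region and sentence features are assumed to live in the same embedding space, similarity is assumed to be cosine, and the function names and the `temperature` parameter are hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity with a small epsilon to avoid division by zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(regions, sentence, temperature=1.0):
    # Weight each image region by its similarity to the sentence
    # embedding, then pool the regions with those weights.
    sims = [cosine(r, sentence) / temperature for r in regions]
    weights = softmax(sims)
    dim = len(regions[0])
    pooled = [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]
    return weights, pooled

# Toy example: three 2-D region features and one sentence embedding.
regions = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
sentence = [1.0, 0.1]
weights, pooled = attend(regions, sentence)
```

Regions that point in a direction similar to the sentence embedding receive larger weights, so the pooled vector emphasizes the image regions most related to the description.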

    Person Re-Identification by Deep Joint Learning of Multi-Loss Classification

    Existing person re-identification (re-id) methods rely mostly on either localised or global feature representation alone. This ignores their joint benefit and mutually complementary effects. In this work, we show the advantages of jointly learning local and global features in a Convolutional Neural Network (CNN) by aiming to discover correlated local and global features in different contexts. Specifically, we formulate a method for joint learning of local and global feature selection losses designed to optimise person re-id when using only generic matching metrics such as the L2 distance. We design a novel CNN architecture for Jointly Learning Multi-Loss (JLML) of local and global discriminative feature optimisation, subject concurrently to the same re-id labelled information. Extensive comparative evaluations demonstrate the advantages of this new JLML model for person re-id over a wide range of state-of-the-art re-id methods on five benchmarks (VIPeR, GRID, CUHK01, CUHK03, Market-1501). Comment: Accepted by IJCAI 201
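Matching with a generic metric such as the L2 distance, as this abstract describes, might look like the sketch below. The abstract does not say how local and global features are combined at test time, so concatenating L2-normalised vectors is an assumption here, and the helper names (`joint_feature`, `rank_gallery`) and the toy features are hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; leave all-zero vectors unchanged.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def joint_feature(global_feat, local_feats):
    # Assumed combination: concatenate the normalised global feature
    # with each normalised local (body-part) feature.
    joint = l2_normalize(global_feat)
    for part in local_feats:
        joint = joint + l2_normalize(part)
    return joint

def l2_distance(u, v):
    # Plain Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_gallery(query, gallery):
    # Return gallery identities sorted from nearest to farthest.
    return sorted(gallery, key=lambda pid: l2_distance(query, gallery[pid]))

# Toy example: one query and two gallery identities.
query = joint_feature([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]])
gallery = {
    "id_a": joint_feature([0.9, 0.1], [[0.1, 0.9], [1.0, 0.9]]),
    "id_b": joint_feature([0.0, 1.0], [[1.0, 0.0], [1.0, -1.0]]),
}
ranking = rank_gallery(query, gallery)
```

Because no learned metric is involved, the quality of the ranking depends entirely on how discriminative the jointly learned features are.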