
    Diverse Knowledge Distillation for End-to-End Person Search

    Person search aims to localize and identify a specific person in a gallery of images. Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches. The former treats person search as two independent tasks and achieves dominant results using separately trained person detection and re-identification (Re-ID) models. The latter performs person search in an end-to-end fashion. Although end-to-end approaches yield higher inference efficiency, they lag far behind their two-step counterparts in accuracy. In this paper, we argue that the gap between the two kinds of methods is mainly caused by the Re-ID sub-networks of end-to-end methods. To this end, we propose a simple yet strong end-to-end network with diverse knowledge distillation to break this bottleneck. We also design a spatial-invariant augmentation that helps the model remain invariant to inaccurate detection results. Experimental results on the CUHK-SYSU and PRW datasets demonstrate the superiority of our method over existing approaches: it achieves accuracy on par with state-of-the-art two-step methods while maintaining high efficiency thanks to its single joint model. Code is available at: https://git.io/DKD-PersonSearch.
    Comment: Accepted to AAAI, 2021.
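    The abstract does not spell out the distillation objective; a common logit-level formulation (softened-softmax KL divergence, after Hinton et al.) that knowledge distillation schemes like this typically build on can be sketched as follows. The temperature value and logit shapes are illustrative assumptions, not details from the paper:

    ```python
    import numpy as np

    def softmax(logits, T=1.0):
        """Temperature-softened softmax over a 1-D logit vector."""
        z = np.asarray(logits, dtype=float) / T
        z -= z.max()  # numerical stability
        e = np.exp(z)
        return e / e.sum()

    def distillation_loss(student_logits, teacher_logits, T=4.0):
        """KL(teacher || student) on temperature-softened distributions,
        scaled by T^2 so gradients stay comparable across temperatures."""
        p = softmax(teacher_logits, T)  # soft targets from the teacher
        q = softmax(student_logits, T)  # student predictions
        return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
    ```

    In a person search setting, the teacher would be a strong separately trained Re-ID model and the student the Re-ID sub-network of the end-to-end detector, with this loss added to the usual detection and identification losses.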

    Ground-to-Aerial Person Search: Benchmark Dataset and Approach

    In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras. To our knowledge, this is the first dataset for cross-platform intelligent surveillance applications, where UAVs can serve as a powerful complement to ground surveillance cameras. To simulate actual cross-platform Ground-to-Aerial surveillance scenarios more realistically, the surveillance cameras are fixed about 2 meters above the ground, while the UAVs capture videos of persons at different locations, with a variety of view angles, flight attitudes, and flight modes. The dataset therefore has the following unique characteristics: 1) drastic view-angle changes between query and gallery person images from cross-platform cameras; 2) diverse resolutions, poses, and views of the person images across 9 rich real-world scenarios. On the basis of the G2APS benchmark dataset, we provide a detailed analysis of current two-step and end-to-end person search methods, and further propose a simple yet effective knowledge distillation scheme on the head of the Re-ID network, which achieves state-of-the-art performance on G2APS as well as the two previous public person search datasets, i.e., PRW and CUHK-SYSU. The dataset and source code are available at \url{https://github.com/yqc123456/HKD_for_person_search}.
    Comment: Accepted by ACM MM 202

    Semantic Sampling and Color Classification for Person Search in Videos

    ABSTRACT: In this work, we look into the application of person search by keyword description, in images taken from security-camera videos. In this type of application, we aim to describe the persons present in videos according to salient characteristics (e.g., the colors of their clothes), without concern for their identities, which are unknown a priori. The approach is thus similar to generating "composite portraits", where persons are described by their semantic parts (such as t-shirt, hat, pants), which can then be attached to images as metadata, facilitating the search for specific profiles in videos (e.g., a person with green pants and a blue shirt). In a first step, we identify clothing as the salient characteristic, and in particular we aim to describe its colors very precisely. The goal of our work is thus to develop a method for classifying clothing colors in images that is as close as possible to human perception. To this end, we consider both the vocabulary used, which must be sufficiently general, and the color spaces involved, which must be able to emulate the properties of human color perception.
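    The abstract emphasizes color spaces that emulate human perception. A minimal sketch of such a classifier, assuming a nearest-prototype rule in CIELAB (a space designed to be roughly perceptually uniform); the palette below is an illustrative assumption, not the thesis's actual vocabulary:

    ```python
    import math

    # Small illustrative color vocabulary (8-bit sRGB prototypes); a real
    # system would use a richer, perceptually calibrated palette.
    PALETTE = {
        "black":  (0, 0, 0),
        "white":  (255, 255, 255),
        "gray":   (128, 128, 128),
        "red":    (255, 0, 0),
        "green":  (0, 128, 0),
        "blue":   (0, 0, 255),
        "yellow": (255, 255, 0),
    }

    def srgb_to_lab(rgb):
        """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
        def linearize(c):
            c /= 255.0
            return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        r, g, b = (linearize(c) for c in rgb)
        # Linear sRGB -> CIE XYZ (D65 primaries)
        x = 0.4124 * r + 0.3576 * g + 0.1805 * b
        y = 0.2126 * r + 0.7152 * g + 0.0722 * b
        z = 0.0193 * r + 0.1192 * g + 0.9505 * b
        # Normalize by the D65 reference white, then apply the Lab transfer curve
        xn, yn, zn = 0.95047, 1.0, 1.08883
        def f(t):
            return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
        fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
        return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

    def classify_color(rgb):
        """Return the palette name whose Lab prototype is nearest (Euclidean)."""
        lab = srgb_to_lab(rgb)
        return min(PALETTE, key=lambda name: math.dist(lab, srgb_to_lab(PALETTE[name])))
    ```

    Euclidean distance in CIELAB approximates perceived color difference far better than distance in raw RGB, which is why such pipelines convert before matching.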