308 research outputs found

    Person ReID in Different Environment Settings Using Deep Learning Methods

    University of Technology Sydney. Faculty of Engineering and Information Technology. Person re-identification (person ReID) is an essential research area in vision-based human image retrieval. It is a technology by which a system automatically identifies the same person appearing in different camera views. Most existing work in this area focuses on settings where the environment either stays the same or fluctuates only slightly. However, it is well known that even small environment changes can significantly affect the robustness of a ReID algorithm. Many real-world applications must detect the same person at a drastically different place and time, making large environment changes an unavoidable yet under-addressed problem. Hence, we address the problem where environment settings differ, such as illumination, resolution, modality and clothing. Specifically, this thesis proposes a series of methods for environment-change person ReID, summarized as follows. (1) We propose a Two-Stream Model for the illumination-adaptive person ReID problem. It separates ReID features from lighting features to enhance ReID performance. We construct two augmented datasets by synthetically applying a set of predefined lighting conditions to two of the most popular ReID benchmarks, Market1501 and DukeMTMC-ReID. Experiments demonstrate that our algorithm outperforms other state-of-the-art works and is particularly effective on images taken under extremely low light. (2) We propose a Teacher-Student GAN model for the cross-modality person ReID problem. It adopts different domains and guides the ReID backbone. Unlike other GAN-based models, the proposed model needs only the backbone module at the test stage, making it more efficient and resource-saving. To showcase our model's capability, we ran extensive experiments on the newly released SYSU-MM01 and RegDB ReID benchmarks and achieved performance superior to state-of-the-art methods. (3) We propose a novel two-stream network for the cross-resolution person ReID problem. It contains a lightweight resolution association ReID feature transformation (RAFT) module and a self-weighted attention (SWA) ReID module to evaluate features under different resolutions. Comprehensive experiments on five benchmarks show the validity of our method. (4) We design a novel unsupervised model, Syn-Person-Cluster ReID, for the unlabeled clothing-change person ReID problem. We develop a purely unsupervised pipeline equipped with synthetic augmentation of person images and feature restriction for the same person. Extensive experiments on clothing-change ReID datasets show that our methods outperform existing ones.
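
The abstract above mentions building augmented datasets by synthetically changing predefined lighting conditions. The sketch below shows one plausible way to do that, using gamma correction on a grayscale image stored as nested lists of intensities in [0, 1]. The abstract does not say which transformation the thesis uses, so the gamma curve, the function names `adjust_gamma` and `augment_lighting`, and the gamma values are all assumptions made for illustration.

```python
def adjust_gamma(image, gamma):
    """Return a copy of `image` with each intensity raised to `gamma`.

    `image` is a list of rows of floats in [0, 1]. For such values,
    gamma > 1 darkens the image and gamma < 1 brightens it.
    Gamma correction is an assumed stand-in for the thesis's
    (unspecified) lighting transformation.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [[pixel ** gamma for pixel in row] for row in image]


def augment_lighting(image, gammas=(0.5, 1.0, 2.0, 4.0)):
    """Produce one relit copy of `image` per predefined gamma value.

    The default gammas are hypothetical "lighting conditions".
    """
    return {gamma: adjust_gamma(image, gamma) for gamma in gammas}
```

Calling `augment_lighting(image)` returns a dictionary keyed by gamma, so each original training image yields several copies that share an identity label but differ in brightness.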

    Illumination Distillation Framework for Nighttime Person Re-Identification and A New Benchmark

    Nighttime person Re-ID (person re-identification at night) is an important and challenging task for visual surveillance, but it has not been thoroughly investigated. Under low illumination, the performance of person Re-ID methods usually deteriorates sharply. To address the low-illumination challenge in nighttime person Re-ID, this paper proposes an Illumination Distillation Framework (IDF), which uses illumination enhancement and illumination distillation schemes to promote the learning of Re-ID models. Specifically, IDF consists of a master branch, an illumination enhancement branch, and an illumination distillation module. The master branch extracts features from a nighttime image. The illumination enhancement branch first estimates an enhanced image from the nighttime image using a nonlinear curve mapping method and then extracts the enhanced features. However, nighttime and enhanced features usually contain data noise due to unstable lighting conditions and enhancement failures. To fully exploit the complementary benefits of nighttime and enhanced features while suppressing data noise, we propose an illumination distillation module. In particular, the illumination distillation module fuses the features from the two branches through a bottleneck fusion model and then uses the fused features to guide the learning of both branches in a distillation manner. In addition, we build a real-world nighttime person Re-ID dataset, named Night600, which contains 600 identities captured from different viewpoints and under different nighttime illumination conditions in complex outdoor environments. Experimental results demonstrate that our IDF achieves state-of-the-art performance on two nighttime person Re-ID datasets (i.e., Night600 and Knight). We will release our code and dataset at https://github.com/Alexadlu/IDF. Comment: Accepted by TM
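
The enhancement branch described above relies on a nonlinear curve mapping, but the abstract does not give the curve. As a minimal sketch, the code below assumes a quadratic brightening curve x + alpha * x * (1 - x) applied a few times per pixel, a form used in some low-light enhancement work. The function names, the curve itself, `alpha`, and the iteration count are assumptions and are not taken from the IDF paper.

```python
def enhance_pixel(x, alpha, iterations=4):
    """Brighten one intensity x in [0, 1] with an assumed quadratic curve.

    Each step applies x <- x + alpha * x * (1 - x). For alpha in [0, 1]
    the result stays in [0, 1], and the endpoints 0 and 1 are fixed.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    for _ in range(iterations):
        x = x + alpha * x * (1.0 - x)
    return x


def enhance_image(image, alpha, iterations=4):
    """Apply the curve to every pixel of a nested-list grayscale image."""
    return [[enhance_pixel(p, alpha, iterations) for p in row] for row in image]
```

Because the curve leaves 0 and 1 unchanged and lifts everything in between, dark regions gain the most contrast. In a learned system `alpha` would typically be predicted per pixel rather than fixed, which this sketch does not attempt.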

    Visible-Infrared Person Re-Identification Using Privileged Intermediate Information

    Visible-infrared person re-identification (ReID) aims to recognize the same person of interest across a network of RGB and IR cameras. Some deep learning (DL) models have directly incorporated both modalities to discriminate persons in a joint representation space. However, this cross-modal ReID problem remains challenging due to the large domain shift in data distributions between RGB and IR modalities. This paper introduces a novel approach for creating an intermediate virtual domain that acts as a bridge between the two main domains (i.e., RGB and IR modalities) during training. This intermediate domain is treated as privileged information (PI) that is unavailable at test time, which allows this cross-modal matching task to be formulated as a problem of learning under privileged information (LUPI). We devise a new method to generate images between the visible and infrared domains that provide additional information for training a deep ReID model through intermediate domain adaptation. In particular, by employing color-free and multi-step triplet loss objectives during training, our method provides common feature representation spaces that are robust to large visible-infrared domain shifts. Experimental results on challenging visible-infrared ReID datasets indicate that our proposed approach consistently improves matching accuracy, without any computational overhead at test time. The code is available at: https://github.com/alehdaghi/Cross-Modal-Re-ID-via-LUPI
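
The training objectives named above build on the triplet loss. The sketch below shows only the standard triplet loss on plain feature vectors, as background. It does not reproduce the paper's color-free or multi-step variants, whose details the abstract does not give. The function names and the default margin are illustrative choices.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative sample is farther from the
    anchor than the positive sample by at least `margin`.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)
```

In a cross-modal setting the anchor and positive could be features of the same identity taken from different domains (for example an RGB image and an IR image), which pulls the two modalities toward a shared representation space.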

    Methods for data-related problems in person re-ID

    In recent years, the ever-increasing need for public security has drawn wide attention to person re-ID. State-of-the-art techniques have achieved impressive results on academic datasets, which are nearly saturated. However, when it comes to deploying a re-ID system in a practical surveillance scenario, several challenges arise. 1) Full person views are often unavailable, and missing body parts make the comparison very challenging due to significant misalignment of the views. 2) Low diversity in training data introduces bias in re-ID systems. 3) The available data might come from different modalities, e.g., text and images. This thesis proposes the Partial Matching Net (PMN), which detects body joints, aligns partial views, and hallucinates the missing parts based on the information present in the frame and a learned model of a person. The aligned and reconstructed views are then combined into a joint representation and used for matching images. The thesis also investigates different types of bias that typically occur in re-ID scenarios when the similarity between two persons is due to the same pose, body part, or camera view rather than to ID-related cues. To mitigate these effects, it proposes a general approach named the Bias-Control (BC) framework, with two training streams that leverage adversarial and multitask learning to reduce bias-related features. Finally, the thesis investigates a novel mechanism for matching data across visual and text modalities. It proposes a framework, TAVD, with two complementary modules: Text attribute feature aggregation (TA), which aggregates multiple semantic attributes in a bimodal space for globally matching text descriptions with images, and Visual feature decomposition (VD), which performs feature embedding for locally matching image regions with text attributes. The results and comparison with the state of the art on different benchmarks show that the proposed solutions are effective strategies for person re-ID. Open Access

    Learning Discriminative Features for Person Re-Identification

    To meet the public-safety requirements of modern cities, more and more large-scale surveillance camera systems are being deployed, producing an enormous amount of visual data. The need to process and interpret these data automatically promotes the development and application of visual data analytics technologies. As one of the important research topics in surveillance systems, person re-identification (re-id) aims at retrieving a target person across non-overlapping camera views deployed at a number of distributed space-time locations. It is a fundamental problem for many practical surveillance applications, e.g., person search, cross-camera tracking, and multi-camera human behavior analysis and prediction, and it has received considerable attention from both academia and industry. Learning discriminative feature representations is an essential task in person re-id. Although many methodologies have been proposed, discriminative re-id feature extraction remains challenging due to: (1) Intra- and inter-personal variations. The intrinsic properties of camera deployment in surveillance systems lead to changes in person pose, viewpoint, illumination conditions, etc. This may result in large intra-personal variations and/or small inter-personal variations, which makes matching person images difficult. (2) Domain variations. Differences between datasets limit the generalization capability of re-id models: directly applying a re-id model trained on one dataset to another usually causes a large performance degradation. (3) Difficulties in data creation and annotation. Existing person re-id methods, especially deep re-id methods, rely mostly on a large set of training data with inter-camera identity labels, requiring a tedious data collection and annotation process. This leads to poor scalability in practical person re-id applications.
Corresponding to these challenges in learning discriminative re-id features, this thesis contributes to the re-id domain by proposing three related methodologies and one new re-id setting: (1) Gaussian mixture importance estimation. Handcrafted features are usually not discriminative enough for person re-id because of noisy information, such as background clutter. To evaluate the similarities between person images precisely, the main task of distance metric learning is to filter out this noisy information. Keep It Simple and Straightforward MEtric (KISSME) is an effective method in person re-id. However, it is sensitive to the feature dimensionality and cannot capture multiple modes in a dataset. To this end, a Gaussian Mixture Importance Estimation re-id approach is proposed, which exploits Gaussian Mixture Models to estimate the observed commonalities of similar and dissimilar person pairs in the feature space. (2) Unsupervised domain-adaptive person re-id based on pedestrian attributes. In person re-id, person identities usually do not overlap among different domains (or datasets), which makes it difficult to generalize re-id models. Unlike person identity, pedestrian attributes, e.g., hair length, clothes type and color, are consistent across different domains (or datasets). However, most re-id datasets lack attribute annotations. On the other hand, in the field of pedestrian attribute recognition, there are a number of datasets labeled with attributes. Exploiting such data for re-id purposes can alleviate the shortage of attribute annotations in the re-id domain and improve the generalization capability of re-id models. To this end, an unsupervised domain-adaptive re-id feature learning framework is proposed to make full use of attribute annotations. Specifically, an existing unsupervised domain adaptation method is extended to transfer attribute-based features from the attribute recognition domain to the re-id domain. With the proposed re-id feature learning framework, domain-invariant feature representations can be effectively extracted. (3) Intra-camera supervised person re-id. Annotating large-scale re-id datasets requires a tedious data collection and annotation process and therefore leads to poor scalability in practical person re-id applications. To overcome this fundamental limitation, a new person re-id setting is considered, without inter-camera identity association but with identity labels annotated independently within each camera view. This eliminates the most time-consuming and tedious step, inter-camera identity association, and thus significantly reduces the human effort required during annotation. It hence gives rise to a more scalable and more feasible learning scenario, named Intra-Camera Supervised (ICS) person re-id. Under this ICS setting, a new re-id method, the Multi-task Multi-label (MATE) learning method, is formulated. Given no inter-camera association, MATE is specially designed to self-discover the inter-camera identity correspondence. This is achieved by inter-camera multi-label learning under a joint multi-task inference framework. In addition, MATE can efficiently learn discriminative re-id feature representations using the available identity labels within each camera view.
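
For context on the KISSME baseline mentioned above, the following sketch computes the usual KISSME metric M = inv(Sigma_S) - inv(Sigma_D), where Sigma_S and Sigma_D are the covariance matrices of feature differences for similar and dissimilar pairs. It illustrates the baseline only, not the thesis's Gaussian-mixture extension. It is written in plain Python for small feature dimensions, omits the projection of M onto positive semi-definite matrices, and all helper names are illustrative.

```python
def outer_mean(diffs):
    """Mean outer product of difference vectors (zero mean is assumed)."""
    n = len(diffs[0])
    cov = [[0.0] * n for _ in range(n)]
    for d in diffs:
        for i in range(n):
            for j in range(n):
                cov[i][j] += d[i] * d[j]
    return [[value / len(diffs) for value in row] for row in cov]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kissme_metric(similar_diffs, dissimilar_diffs):
    """Return M = inv(Sigma_S) - inv(Sigma_D)."""
    inv_s = invert(outer_mean(similar_diffs))
    inv_d = invert(outer_mean(dissimilar_diffs))
    n = len(inv_s)
    return [[inv_s[i][j] - inv_d[i][j] for j in range(n)] for i in range(n)]


def kissme_distance(metric, x, y):
    """Evaluate (x - y)^T M (x - y); smaller means more likely a match."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * metric[i][j] * d[j] for i in range(n) for j in range(n))
```

Both covariance matrices must be inverted, so the estimate degrades when the feature dimension is large relative to the number of pairs. That is consistent with the sensitivity to dimensionality noted in the abstract.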