142 research outputs found

    Deep learning approaches to person re-identification

    University of Technology Sydney, Faculty of Engineering and Information Technology. Recent years have witnessed a dramatic increase in surveillance cameras in cities, creating an urgent demand for person re-identification (re-ID) algorithms. Person re-identification aims to find a target person across non-overlapping camera views, which is critical in practical applications. In this thesis, I present my research on person re-ID in three settings: supervised re-ID, one-example re-ID and unsupervised re-ID. For the supervised setting, a re-ranking algorithm is introduced that improves existing re-ID results with Bayesian query expansion. We also investigate pedestrian attributes for re-ID with a model that learns a re-ID embedding and at the same time predicts pedestrian attributes. Since supervised methods require a large amount of annotated training data, which is expensive to obtain and impractical for real-world applications, two re-ID methods in the one-example setting are studied. We also propose an unsupervised re-ID method that jointly optimizes a CNN model and the relationships among individual samples. The experimental results demonstrate that our algorithm is not only superior to state-of-the-art unsupervised re-ID approaches but also performs favourably against competing transfer learning and semi-supervised learning methods. Finally, I draw conclusions on my work and put forward some future directions for the re-ID task.

    Visible-Infrared Person Re-Identification via Patch-Mixed Cross-Modality Learning

    Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same pedestrian across different modalities, where the main challenge lies in the significant modality discrepancy. To alleviate the modality gap, recent methods generate intermediate images via GANs, grayscaling, or mixup strategies. However, these methods can introduce extra noise, and the semantic correspondence between the two modalities is not well learned. In this paper, we propose a Patch-Mixed Cross-Modality framework (PMCM), in which two images of the same person from two modalities are split into patches and stitched into a new image for model learning. In this way, the model learns to recognize a person through patches of different styles, and the modality semantic correspondence is directly embodied. With this flexible image generation strategy, the patch-mixed images can freely adjust the ratio of patches from each modality, which further alleviates the modality imbalance problem. In addition, the relationship between identity centers across modalities is explored to further reduce the modality variance, and a global-to-part constraint is introduced to regularize representation learning of part features. On two VI-ReID datasets, we report new state-of-the-art performance with the proposed method. Comment: IJCAI2
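
The patch-mixing idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `patch_mix`, the 4x4 grid, and the default mixing ratio are all assumptions made for the sketch.

```python
import numpy as np

def patch_mix(visible_img, infrared_img, grid=4, ir_ratio=0.5, rng=None):
    """Stitch a new training image from patches of a visible and an
    infrared image of the same person (both H x W x C, same shape).
    ir_ratio controls the fraction of grid cells drawn from the
    infrared modality, which lets the ratio be adjusted freely."""
    rng = rng or np.random.default_rng(0)
    h, w = visible_img.shape[:2]
    ph, pw = h // grid, w // grid
    mixed = visible_img.copy()
    # Randomly choose which grid cells come from the infrared modality.
    n_cells = grid * grid
    ir_cells = rng.choice(n_cells, size=int(ir_ratio * n_cells), replace=False)
    for cell in ir_cells:
        r, c = divmod(int(cell), grid)
        mixed[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = \
            infrared_img[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
    return mixed
```

The stitched image then trains the model to recognize the same identity through patches of both styles.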

    Computer Simulation of Bioprocess

    Bioprocess optimization is important to make bioproduction more efficient and economical. Conventional optimization methods are costly and less efficient. On the other hand, modeling and computer simulation can reveal, to some extent, the mechanisms behind the phenomena, assisting in-depth analysis and efficient optimization of bioprocesses. In this chapter, modeling and computer simulation of microbial growth and metabolism kinetics, bioreactor dynamics, and bioreactor feedback control are presented to demonstrate the application of these methods and their usefulness in optimizing bioprocess technology.
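
As one concrete instance of the growth-kinetics modeling mentioned above, the classical Monod model can be integrated with a simple forward-Euler loop. This is a generic textbook sketch, not taken from the chapter; all parameter values (mu_max, Ks, yield Yxs, initial conditions) are illustrative.

```python
def simulate_batch(mu_max=0.4, Ks=0.5, Yxs=0.5,
                   X0=0.1, S0=10.0, dt=0.01, t_end=24.0):
    """Forward-Euler simulation of batch growth with Monod kinetics.
    X: biomass (g/L), S: substrate (g/L), t in hours.
    Specific growth rate: mu = mu_max * S / (Ks + S)."""
    X, S, t = X0, S0, 0.0
    trajectory = [(t, X, S)]
    while t < t_end:
        mu = mu_max * S / (Ks + S)
        dX = mu * X * dt               # biomass growth
        dS = -(mu / Yxs) * X * dt      # substrate consumption
        X, S, t = X + dX, max(S + dS, 0.0), t + dt
        trajectory.append((t, X, S))
    return trajectory
```

By mass balance, X + Yxs * S stays approximately constant, so the final biomass approaches X0 + Yxs * S0 once the substrate is depleted.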

    Improving Person Re-identification by Attribute and Identity Learning

    Person re-identification (re-ID) and attribute recognition share a common target of learning pedestrian descriptions. Their difference lies in granularity. Most existing re-ID methods take only the identity labels of pedestrians into consideration. However, we find that attributes, which contain detailed local descriptions, help the re-ID model learn more discriminative feature representations. In this paper, based on the complementarity of attribute labels and ID labels, we propose an attribute-person recognition (APR) network, a multi-task network that learns a re-ID embedding and at the same time predicts pedestrian attributes. We manually annotate attribute labels for two large-scale re-ID datasets, and systematically investigate how person re-ID and attribute recognition benefit from each other. In addition, we re-weight the attribute predictions considering the dependencies and correlations among attributes. The experimental results on two large-scale re-ID benchmarks demonstrate that by learning a more discriminative representation, APR achieves competitive re-ID performance compared with the state-of-the-art methods. We use APR to speed up the retrieval process by ten times, with a minor accuracy drop of 2.92% on Market-1501. In addition, we apply APR to the attribute recognition task and demonstrate improvement over the baselines. Comment: Accepted to Pattern Recognition (PR)
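
The multi-task objective described above (identity classification plus attribute prediction) can be sketched as a weighted sum of two losses. This is a hypothetical numerical sketch, not the APR network itself: the function names and the balance weight `lam` are assumptions, and the attribute re-weighting step is omitted for brevity.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_task_loss(id_logits, id_label, attr_probs, attr_labels, lam=1.0):
    """Identity cross-entropy plus the mean binary cross-entropy over
    the predicted attributes; lam balances the two tasks."""
    p = softmax(id_logits)
    id_loss = -np.log(p[id_label] + 1e-12)
    ap = np.clip(np.asarray(attr_probs, dtype=float), 1e-12, 1 - 1e-12)
    al = np.asarray(attr_labels, dtype=float)
    attr_loss = -(al * np.log(ap) + (1 - al) * np.log(1 - ap)).mean()
    return id_loss + lam * attr_loss
```

Gradients of such a combined loss flow into a shared embedding, which is how the two tasks can reinforce each other.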

    Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

    We focus on the weakly-supervised audio-visual video parsing (AVVP) task, which aims to identify and locate all the events in the audio and visual modalities. Previous works concentrate only on video-level overall label denoising across modalities, but overlook segment-level label noise, where adjacent video segments (i.e., 1-second video clips) may contain different events. However, recognizing events in a segment is challenging because its label could be any combination of the events that occur in the video. To address this issue, we tackle AVVP from the language perspective, since language can freely describe how various events appear in each segment, beyond fixed labels. Specifically, we design language prompts to describe all cases of event appearance for each video. Then, the similarity between language prompts and segments is calculated, and the event of the most similar prompt is taken as the segment-level label. In addition, to deal with mislabeled segments, we propose dynamic re-weighting of unreliable segments to adjust their labels. Experiments show that our simple yet effective approach outperforms state-of-the-art methods by a large margin.
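
The prompt-matching step described above can be sketched as a nearest-prompt lookup over cosine similarities. A schematic sketch only: a real system would obtain the features from a pretrained vision-language encoder, and the function name `label_segments` is an assumption.

```python
import numpy as np

def label_segments(segment_feats, prompt_feats, prompt_events):
    """Assign each video segment the event set of its most similar
    language prompt, using cosine similarity in a shared embedding
    space. segment_feats: (S, D); prompt_feats: (P, D);
    prompt_events: list of P event sets, one per prompt."""
    seg = segment_feats / np.linalg.norm(segment_feats, axis=1, keepdims=True)
    prm = prompt_feats / np.linalg.norm(prompt_feats, axis=1, keepdims=True)
    sim = seg @ prm.T                  # (S, P) cosine similarities
    best = sim.argmax(axis=1)          # most similar prompt per segment
    return [prompt_events[i] for i in best]
```

The resulting per-segment labels can then serve as the pseudo supervision that the dynamic re-weighting step refines.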