
    Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification

    This paper aims to learn a domain-generalizable (DG) person re-identification (ReID) representation from large-scale videos without any annotation. Prior DG ReID methods employ limited labeled data for training due to the high cost of annotation, which restricts further advances. To overcome the barriers of data and annotation, we propose to utilize large-scale unsupervised data for training. The key issue lies in how to mine identity information. To this end, we propose an Identity-seeking Self-supervised Representation learning (ISR) method. ISR constructs positive pairs from inter-frame images by modeling the instance association as a maximum-weight bipartite matching problem. A reliability-guided contrastive loss is further presented to suppress the adverse impact of noisy positive pairs, ensuring that reliable positive pairs dominate the learning process. The training cost of ISR scales approximately linearly with the data size, making it feasible to utilize large-scale data for training. The learned representation exhibits superior generalization ability. Without human annotation and fine-tuning, ISR achieves 87.0% Rank-1 on Market-1501 and 56.4% Rank-1 on MSMT17, outperforming the best supervised domain-generalizable method by 5.0% and 19.5%, respectively. In the pre-training→fine-tuning scenario, ISR achieves state-of-the-art performance, with 88.4% Rank-1 on MSMT17. The code is at https://github.com/dcp15/ISR_ICCV2023_Oral. Comment: ICCV 2023 Oral
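    The positive-pair mining step described above can be sketched with an off-the-shelf assignment solver. The following is a hypothetical illustration, not the authors' code: function names, shapes, and the use of cosine similarity as the edge weight are assumptions; `scipy.optimize.linear_sum_assignment` with `maximize=True` solves exactly the maximum-weight bipartite matching the abstract refers to.

    ```python
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def mine_positive_pairs(feats_a, feats_b):
        """Match L2-normalized instance features of frame A to frame B by
        maximizing total cosine similarity (maximum-weight bipartite matching).
        Returns the matched index pairs and their similarities, which could
        serve as reliability weights for a contrastive loss."""
        sim = feats_a @ feats_b.T                       # cosine similarity matrix
        rows, cols = linear_sum_assignment(sim, maximize=True)
        return list(zip(rows, cols)), sim[rows, cols]

    # Toy example: 3 detections per frame; frame B is a permuted, slightly
    # noisy copy of frame A, so the matching should recover the permutation.
    rng = np.random.default_rng(0)
    a = rng.normal(size=(3, 8))
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b = a[[2, 0, 1]] + 0.01 * rng.normal(size=(3, 8))
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    pairs, weights = mine_positive_pairs(a, b)
    ```

    Here the recovered pairs map instance 0 to slot 1, 1 to 2, and 2 to 0, undoing the permutation; the per-pair similarity is what a reliability-guided loss could weight by.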

    Softmax Dissection: Towards Understanding Intra- and Inter-class Objective for Embedding Learning

    The softmax loss and its variants are widely used as objectives for embedding learning, especially in applications like face recognition. However, the intra- and inter-class objectives in the softmax loss are entangled, so a well-optimized inter-class objective leads to relaxation of the intra-class objective, and vice versa. In this paper, we propose to dissect the softmax loss into independent intra- and inter-class objectives (D-Softmax). With D-Softmax as the objective, we gain a clear understanding of both the intra- and inter-class objectives, making it straightforward to tune each part to its best state. Furthermore, we find the computation of the inter-class objective is redundant and propose two sampling-based variants of D-Softmax to reduce the computation cost. Training with regular-scale data, experiments in face verification show D-Softmax is favorably comparable to existing losses such as SphereFace and ArcFace. Training with massive-scale data, experiments show the fast variants of D-Softmax significantly accelerate the training process (e.g., 64x) with only a minor sacrifice in performance, outperforming existing acceleration methods for softmax in terms of both performance and efficiency. Comment: Accepted to AAAI-2020, Oral presentation
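    The entanglement the abstract describes is visible directly in the softmax cross-entropy: the target logit (intra-class fit) and the non-target logits (inter-class separation) share one log-sum-exp normalizer, so optimizing one relaxes the gradient pressure on the other. A minimal numeric illustration of this coupling (not the paper's D-Softmax formulation; the logit values are invented):

    ```python
    import numpy as np

    def softmax_ce(logits, y):
        """Standard softmax cross-entropy: -s_y + log(sum_j exp(s_j)).
        The target logit s_y and all non-target logits appear in one term."""
        z = logits - logits.max()  # numerically stable shift
        return -(z[y] - np.log(np.exp(z).sum()))

    logits = np.array([2.0, 1.0, 1.0])       # class 0 is the target
    loose = softmax_ce(logits, 0)
    # Push only the non-target (inter-class) logits down; the target logit
    # is untouched, yet the loss collapses toward zero, so the gradient
    # pressure on the intra-class term all but vanishes.
    tight = softmax_ce(np.array([2.0, -3.0, -3.0]), 0)
    ```

    With the non-target logits suppressed the loss drops from about 0.55 to about 0.01 even though the target logit never changed, which is the relaxation effect that motivates dissecting the two objectives.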

    How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset?

    This paper does not contain technical novelty but introduces our key discoveries in a data generation protocol, a database, and insights. We aim to address the lack of large-scale datasets in micro-expression (MiE) recognition due to the prohibitive cost of data collection, which renders large-scale training less feasible. To this end, we develop a protocol to automatically synthesize large-scale MiE training data that allow us to train improved recognition models for real-world test data. Specifically, we discover three types of Action Units (AUs) that can constitute trainable MiEs. These AUs come from real-world MiEs, early frames of macro-expression videos, and the relationship between AUs and expression categories defined by human expert knowledge. With these AUs, our protocol then employs large numbers of face images of various identities and an off-the-shelf face generator for MiE synthesis, yielding the MiE-X dataset. MiE recognition models are trained or pre-trained on MiE-X and evaluated on real-world test sets, where very competitive accuracy is obtained. Experimental results not only validate the effectiveness of the discovered AUs and the MiE-X dataset but also reveal some interesting properties of MiEs: they generalize across faces, are close to early-stage macro-expressions, and can be manually defined. Comment: European Conference on Computer Vision 202

    Generalizable Re-Identification from Videos with Cycle Association

    In this paper, we are interested in learning a generalizable person re-identification (re-ID) representation from unlabeled videos. Compared with 1) the popular unsupervised re-ID setting, where the training and test sets are typically under the same domain, and 2) the popular domain generalization (DG) re-ID setting, where the training samples are labeled, our novel scenario combines their key challenges: the training samples are unlabeled and collected from various domains which do not align with the test domain. In other words, we aim to learn a representation in an unsupervised manner and directly use the learned representation for re-ID in novel domains. To fulfill this goal, we make two main contributions: First, we propose Cycle Association (CycAs), a scalable self-supervised learning method for re-ID with low training complexity; and second, we construct a large-scale unlabeled re-ID dataset named LMP-video, tailored for the proposed method. Specifically, CycAs learns re-ID features by enforcing cycle consistency of instance association between temporally successive video frame pairs, and the training cost is merely linear in the data size, making large-scale training possible. On the other hand, the LMP-video dataset is extremely large, containing 50 million unlabeled person images cropped from over 10K YouTube videos, and is therefore sufficient to serve as fertile soil for self-supervised learning. Trained on LMP-video, CycAs is shown to generalize well to novel domains. The achieved results sometimes even outperform supervised domain-generalizable models. Remarkably, CycAs achieves 82.2% Rank-1 on Market-1501 and 49.0% Rank-1 on MSMT17 with zero human annotation, surpassing state-of-the-art supervised DG re-ID methods. Moreover, we also demonstrate the superiority of CycAs under the canonical unsupervised re-ID and the pretrain-and-finetune scenarios
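    The cycle-consistency idea above can be sketched in a few lines: associate each instance forward from frame t to frame t+1 with a soft assignment, associate backward, and penalize round trips that do not land on the starting instance. This is a minimal sketch under assumed names and shapes, not the CycAs implementation; the temperature `tau` and the negative-log-diagonal penalty are illustrative choices.

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def cycle_association_loss(feats_t, feats_t1, tau=0.1):
        """Soft-associate instances frame t -> t+1 -> t; the composed
        assignment should approximate the identity, so we penalize small
        diagonal (return) probabilities."""
        sim = feats_t @ feats_t1.T
        forward = softmax(sim / tau, axis=1)     # t -> t+1 soft assignment
        backward = softmax(sim.T / tau, axis=1)  # t+1 -> t soft assignment
        cycle = forward @ backward               # round-trip probabilities
        return -np.mean(np.log(np.diag(cycle) + 1e-12))

    # Toy example: frame t+1 features are noisy copies of frame t features,
    # so the cycle should close and the loss should be near zero.
    rng = np.random.default_rng(0)
    f_t = rng.normal(size=(4, 16))
    f_t /= np.linalg.norm(f_t, axis=1, keepdims=True)
    f_t1 = f_t + 0.01 * rng.normal(size=(4, 16))
    f_t1 /= np.linalg.norm(f_t1, axis=1, keepdims=True)
    loss = cycle_association_loss(f_t, f_t1)
    ```

    Because each frame pair costs one similarity matrix and two softmaxes, training cost grows linearly with the number of frame pairs, which is the scalability property the abstract emphasizes.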

    MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

    Perception systems in modern autonomous driving vehicles typically take inputs from complementary multi-modal sensors, e.g., LiDAR and cameras. However, in real-world applications, sensor corruptions and failures lead to inferior performance, thus compromising autonomous safety. In this paper, we propose a robust framework, called MetaBEV, to address extreme real-world environments involving six sensor corruptions and two extreme sensor-missing situations. In MetaBEV, signals from multiple sensors are first processed by modal-specific encoders. Subsequently, a set of dense BEV queries are initialized, termed meta-BEV. These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities. The updated BEV representations are further leveraged for multiple 3D prediction tasks. Additionally, we introduce a new M2oE structure to alleviate the performance drop on distinct tasks in multi-task joint learning. Finally, MetaBEV is evaluated on the nuScenes dataset with 3D object detection and BEV map segmentation tasks. Experiments show MetaBEV outperforms prior art by a large margin on both full and corrupted modalities. For instance, when the LiDAR signal is missing, MetaBEV improves detection NDS by 35.5% and segmentation mIoU by 17.7% over the vanilla BEVFusion model; and when the camera signal is absent, MetaBEV still achieves 69.2% NDS and 53.7% mIoU, which is even higher than previous works that operate on full modalities. Moreover, MetaBEV compares favorably with previous methods in both canonical perception and multi-task learning settings, refreshing the state of the art in nuScenes BEV map segmentation with 70.4% mIoU. Comment: Project page: https://chongjiange.github.io/metabev.htm
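    The selective aggregation behind the robustness claim can be illustrated with a toy cross-attention step: BEV queries attend only to the features of sensors that are actually present, so a failed sensor is masked out rather than zero-filled. This is a hypothetical sketch of the idea, not the MetaBEV architecture; all names, shapes, and the single-head attention form are invented for illustration.

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def bev_evolve(queries, modal_feats, available):
        """One illustrative decoding step: dense BEV queries cross-attend to
        whichever modal feature sets are present. A missing modality simply
        contributes no keys, so the output stays well-defined."""
        keys = np.concatenate(
            [f for f, ok in zip(modal_feats, available) if ok], axis=0)
        attn = softmax(queries @ keys.T / np.sqrt(queries.shape[1]), axis=1)
        return attn @ keys  # updated BEV representation, same shape as queries

    rng = np.random.default_rng(1)
    queries = rng.normal(size=(6, 32))   # toy meta-BEV queries
    lidar = rng.normal(size=(10, 32))    # toy LiDAR features
    camera = rng.normal(size=(8, 32))    # toy camera features

    full = bev_evolve(queries, [lidar, camera], [True, True])
    no_lidar = bev_evolve(queries, [lidar, camera], [False, True])
    ```

    Dropping a modality changes only the set of keys, not the query pipeline, which is why the same model can keep producing BEV features when a sensor fails.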

    Extracellular Vesicle-Mediated Communication Within Host-Parasite Interactions

    Extracellular vesicles (EVs) are small membrane-surrounded structures released by different kinds of cells (normal, diseased, and transformed cells) in vivo and in vitro that contain, in an evolutionarily conserved manner, large amounts of important substances such as lipids, proteins, metabolites, DNA, RNA, and non-coding RNA (ncRNA), including miRNA, lncRNA, tRNA, rRNA, snoRNA, and scaRNA. EVs, including exosomes, play an increasingly recognized role in transmitting information and substances between cells. In some infectious diseases, such as parasitic diseases, EVs have emerged as a ubiquitous mechanism for mediating communication during host-parasite interactions. EVs can transfer virulence factors and effector molecules from parasites to hosts through multiple modes, thereby regulating host gene expression and immune responses and, consequently, mediating the pathogenic process, which has made us rethink our understanding of the host-parasite interface. Here, we review the present findings regarding EVs (especially exosomes) and their role in host-parasite interactions. We hope that a better understanding of the mechanisms of parasite-derived EVs may provide new insights for further diagnostic biomarker, vaccine, and therapeutic development

    A Sir2-Like Protein Participates in Mycobacterial NHEJ

    In eukaryotic cells, repair of DNA double-strand breaks (DSBs) by the nonhomologous end-joining (NHEJ) pathway is critical for genome stability. In contrast to the complex eukaryotic repair system, the bacterial NHEJ apparatus consists of only two proteins, Ku and a multifunctional DNA ligase (LigD), whose functional mechanism has not been fully clarified. Here we show for the first time that Sir2 is involved in the mycobacterial NHEJ repair pathway. Using tandem affinity purification (TAP) screening, we identified an NAD-dependent deacetylase in mycobacteria which is a homologue of the eukaryotic Sir2 protein and interacts directly with Ku. Results from an in vitro glutathione S-transferase (GST) pull-down assay suggest that Sir2 also interacts directly with LigD. Plasmid-based end-joining assays revealed that the efficiency of DSB repair in a sir2 deletion mutant was reduced 2-fold. Moreover, the Δsir2 strain was about 10-fold more sensitive to ionizing radiation (IR) in the stationary phase than the wild type. Our results suggest that Sir2 may function closely together with Ku and LigD in the nonhomologous end-joining pathway in mycobacteria