
    Deep Metric Learning Meets Deep Clustering: A Novel Unsupervised Approach for Feature Embedding

    Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample similarities in the embedding space from an unlabeled dataset. Traditional UDML methods usually rely on a triplet or pairwise loss, which requires mining positive and negative samples w.r.t. anchor data points. This is challenging in an unsupervised setting, however, because label information is not available. In this paper, we propose a new UDML method that overcomes this challenge. In particular, we propose to use a deep clustering loss to learn centroids, i.e., pseudo labels, that represent semantic classes. During learning, these centroids are also used to reconstruct the input samples, which ensures the representativeness of the centroids: each centroid represents visually similar samples. The centroids therefore provide information about positive (visually similar) and negative (visually dissimilar) samples. Based on these pseudo labels, we propose a novel unsupervised metric loss which enforces positive concentration and negative separation of samples in the embedding space. Experimental results on benchmark datasets show that the proposed approach outperforms other UDML methods. Comment: Accepted in BMVC 202
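    The centroid-based pseudo-labeling idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: `kmeans` stands in for the deep clustering step, and `metric_loss` is a generic concentration/separation loss with an assumed hinge margin.

    ```python
    import numpy as np

    def kmeans(X, k, iters=20, seed=0):
        """Toy k-means standing in for the deep clustering step."""
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), k, replace=False)].astype(float)
        for _ in range(iters):
            # assign every sample to its nearest centroid (its pseudo label)
            dists = np.linalg.norm(X[:, None] - centroids[None], axis=-1)
            labels = dists.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centroids[j] = X[labels == j].mean(axis=0)
        return centroids, labels

    def metric_loss(X, centroids, labels, margin=1.0):
        """Pull each sample toward its own centroid (positive concentration)
        and push it at least `margin` away from the nearest other centroid
        (negative separation)."""
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=-1)
        pos = dists[np.arange(len(X)), labels]
        own = np.eye(len(centroids), dtype=bool)[labels]
        neg = np.where(own, np.inf, dists).min(axis=1)
        return float((pos + np.maximum(0.0, margin - neg)).mean())
    ```

    On well-separated data the positive term shrinks toward zero and the hinge term vanishes once clusters sit farther apart than the margin.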

    Application of Self-Supervised Learning to MICA Model for Reconstructing Imperfect 3D Facial Structures

    In this study, we emphasize the integration of a pre-trained MICA model with an imperfect-face dataset, employing a self-supervised learning approach. We present an innovative method for regenerating flawed facial structures, yielding 3D-printable outputs that effectively support physicians in treating their patients. Our results highlight the model's capacity for concealing scars and achieving comprehensive facial reconstructions without discernible scarring. By capitalizing on pre-trained models and requiring only a few hours of supplementary training, our methodology devises an optimal model for reconstructing damaged and imperfect facial features. Harnessing contemporary 3D printing technology, we establish a standardized protocol for fabricating realistic, camouflaging mask models for patients in a laboratory environment.

    Glenn Research Center Quantum Communicator Receiver Design and Development

    We investigate, design, and develop a prototype real-time synchronous receiver for the second-generation quantum communicator recently developed at the National Aeronautics and Space Administration (NASA) Glenn Research Center. This communication system exploits the temporal coincidences between simultaneously fired low-power laser sources to communicate at power levels several orders of magnitude less than what is currently achievable through classical means, with the ultimate goal of creating ultra-low-power microsize optical communications and sensing devices. The proposed receiver uses a unique adaptation of the early-late gate method for symbol synchronization and a newly identified 31-bit synchronization word for frame synchronization. This receiver, implemented in a field-programmable gate array (FPGA), also provides a number of significant additional features over the existing non-real-time experimental receiver, such as real-time bit error rate (BER) statistics collection and display, and recovery and display of embedded textual information. It also supports indefinite run time and statistics collection. (c) 2009 Society of Photo-Optical Instrumentation Engineers.
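    The frame-synchronization step can be illustrated with a sliding bitwise correlator. The 7-bit pattern used in the example below is a hypothetical stand-in (the actual 31-bit synchronization word is not given in the abstract), and a real receiver would run early-late gate symbol timing before this stage.

    ```python
    def correlate_sync(bits, sync_word, max_errors=0):
        """Slide the sync word over the bitstream and report every offset
        whose Hamming distance to the word is within `max_errors`.
        Thresholding on errors lets frame lock survive channel bit flips."""
        m = len(sync_word)
        hits = []
        for i in range(len(bits) - m + 1):
            errors = sum(b != s for b, s in zip(bits[i:i + m], sync_word))
            if errors <= max_errors:
                hits.append(i)
        return hits
    ```

    With `max_errors=0` this is exact frame alignment; raising the threshold trades false-alarm probability for robustness, which is one reason longer words such as a 31-bit pattern are preferred over short ones.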

    Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments

    We present a novel method for few-shot video classification, which performs appearance and temporal alignments. In particular, given a pair of query and support videos, we conduct appearance alignment via frame-level feature matching to obtain the appearance similarity score between the videos, while utilizing temporal order-preserving priors to obtain the temporal similarity score between the videos. Moreover, we introduce a few-shot video classification framework that leverages the above appearance and temporal similarity scores across multiple steps, namely prototype-based training and testing, as well as inductive and transductive prototype refinement. To the best of our knowledge, our work is the first to explore transductive few-shot video classification. Extensive experiments on both Kinetics and Something-Something V2 datasets show that both appearance and temporal alignments are crucial for datasets with temporal order sensitivity such as Something-Something V2. Our approach achieves similar or better results than previous methods on both datasets. Our code is available at https://github.com/VinAIResearch/fsvc-ata. Comment: Accepted to ECCV 202
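    The two similarity scores can be sketched with NumPy on per-frame feature matrices. This is an illustrative reading of the abstract, not the paper's exact formulation: the temporal term below uses a Gaussian band around the diagonal as an assumed order-preserving prior.

    ```python
    import numpy as np

    def frame_similarity(query, support):
        """Cosine similarity between all frame pairs of two videos,
        given per-frame features of shape (num_frames, feat_dim)."""
        q = query / np.linalg.norm(query, axis=1, keepdims=True)
        s = support / np.linalg.norm(support, axis=1, keepdims=True)
        return q @ s.T

    def appearance_score(sim):
        """Appearance alignment: best support match per query frame, averaged."""
        return float(sim.max(axis=1).mean())

    def temporal_score(sim, sigma=1.0):
        """Temporal alignment: weight the similarity matrix by a Gaussian
        band around the diagonal, an order-preserving prior that rewards
        matches between frames at similar relative positions."""
        tq, ts = sim.shape
        i = np.arange(tq)[:, None] / max(tq - 1, 1)
        j = np.arange(ts)[None, :] / max(ts - 1, 1)
        prior = np.exp(-((i - j) ** 2) / (2 * sigma ** 2))
        return float((sim * prior).sum() / prior.sum())
    ```

    On toy features the appearance score cannot distinguish a video from its temporal reversal, while the temporal score can, mirroring the abstract's point about order-sensitive datasets.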

    Addressing Non-IID Problem in Federated Autonomous Driving with Contrastive Divergence Loss

    Federated learning has been widely applied in autonomous driving since it enables training a learning model among vehicles without sharing users' data. However, data from autonomous vehicles usually suffer from the non-independent-and-identically-distributed (non-IID) problem, which can hinder the convergence of the learning process. In this paper, we propose a new contrastive divergence loss that addresses the non-IID problem in autonomous driving by reducing the impact of divergence factors from transmitted models during each silo's local learning process. We also analyze the effects of contrastive divergence in various autonomous driving scenarios, under multiple network infrastructures, and with different centralized/distributed learning schemes. Extensive experiments on three datasets demonstrate that our proposed contrastive divergence loss improves performance over current state-of-the-art approaches.
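    At a high level, the divergence-reduction idea can be illustrated with a contrastive-style weight regularizer added to a silo's local objective. This sketch is generic and is not the paper's actual contrastive divergence loss: it simply pulls the local weights toward the consensus of received models while keeping a distance from the single most divergent one; `margin` is an assumed hyperparameter.

    ```python
    import numpy as np

    def contrastive_reg(w_local, received, margin=1.0):
        """Illustrative contrastive-style regularizer (not the paper's exact
        loss): pull the silo toward the average of the models it received,
        and keep at least `margin` distance from the single most divergent
        one, damping the effect of non-IID outlier updates."""
        received = np.asarray(received, dtype=float)
        mean_w = received.mean(axis=0)
        # treat the received model farthest from the consensus as divergent
        div = received[np.linalg.norm(received - mean_w, axis=1).argmax()]
        pull = np.linalg.norm(w_local - mean_w) ** 2
        push = np.linalg.norm(w_local - div) ** 2
        return float(pull + max(0.0, margin - push))
    ```

    A silo sitting at the consensus and away from the outlier incurs no penalty; one that drifts onto the outlier pays both the pull and the margin terms.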

    Overcoming Data Limitation in Medical Visual Question Answering

    Traditional approaches for Visual Question Answering (VQA) require a large amount of labeled data for training. Unfortunately, such large-scale data is usually not available in the medical domain. In this paper, we propose a novel medical VQA framework that overcomes this labeled-data limitation. The proposed framework explores the use of an unsupervised Denoising Auto-Encoder (DAE) and supervised Meta-Learning. The advantage of the DAE is that it leverages the large amount of unlabeled images, while the advantage of Meta-Learning is that it learns meta-weights that quickly adapt to the VQA problem with limited labeled data. Leveraging these techniques allows the proposed framework to be trained efficiently using a small labeled training set. The experimental results show that our proposed method significantly outperforms state-of-the-art medical VQA methods. The source code is available at https://github.com/aioz-ai/MICCAI19-MedVQA.
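    The unsupervised DAE component can be sketched with a tiny linear denoising auto-encoder in NumPy. This is a didactic stand-in for the paper's deep image encoder; the architecture, noise level, and learning rate are all assumed for illustration.

    ```python
    import numpy as np

    def train_dae(X, hidden=4, noise=0.3, lr=0.1, epochs=300, seed=0):
        """Tiny linear denoising auto-encoder: corrupt the input with
        Gaussian noise and learn to reconstruct the clean version, so the
        encoder can be trained on unlabeled images alone."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = rng.normal(0.0, 0.1, (d, hidden))     # encoder weights
        V = rng.normal(0.0, 0.1, (hidden, d))     # decoder weights
        for _ in range(epochs):
            Xn = X + rng.normal(0.0, noise, X.shape)  # corrupted input
            H = Xn @ W                                # encode
            R = H @ V                                 # reconstruct
            err = R - X                               # error vs. CLEAN target
            V -= lr * H.T @ err / n                   # gradient steps on the
            W -= lr * Xn.T @ (err @ V.T) / n          # mean squared error
        return W, V
    ```

    The learned `W` is what a VQA model would reuse as a pre-trained image encoder; the Meta-Learning stage (not sketched) then adapts the remaining weights from a few labeled examples.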