283 research outputs found
Deep Metric Learning Meets Deep Clustering: An Novel Unsupervised Approach for Feature Embedding
Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample
similarities in the embedding space from an unlabeled dataset. Traditional UDML
methods usually use a triplet loss or pairwise loss, which requires the mining
of positive and negative samples w.r.t. anchor data points. This is, however,
challenging in an unsupervised setting as the label information is not
available. In this paper, we propose a new UDML method that overcomes that
challenge. In particular, we propose to use a deep clustering loss to learn
centroids, i.e., pseudo labels, that represent semantic classes. During
learning, these centroids are also used to reconstruct the input samples. This
ensures that the centroids are representative: each centroid represents
visually similar samples. Therefore, the centroids give information about
positive (visually similar) and negative (visually dissimilar) samples. Based
on pseudo labels, we propose a novel unsupervised metric loss which enforces
the positive concentration and negative separation of samples in the embedding
space. Experimental results on benchmarking datasets show that the proposed
approach outperforms other UDML methods.
Comment: Accepted in BMVC 202
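As a rough illustration of the idea in the abstract above, the following Python sketch assigns each embedding a pseudo label from its nearest centroid, then computes a simple loss that pulls samples toward their own centroid and pushes them away from the others. This is not the loss defined in the paper: the function names, the squared-distance and hinge terms, and the `margin` parameter are assumptions chosen for illustration, and the reconstruction term is omitted.

```python
import math


def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x (used as a pseudo label)."""
    dists = [math.dist(x, c) for c in centroids]
    return dists.index(min(dists))


def pseudo_label_metric_loss(embeddings, centroids, margin=1.0):
    """Illustrative loss (an assumption, not the paper's formulation).

    Positive concentration: squared distance to the assigned centroid.
    Negative separation: squared hinge penalty when a sample lies closer
    than `margin` to any other centroid.
    """
    total = 0.0
    for x in embeddings:
        k = nearest_centroid(x, centroids)
        total += math.dist(x, centroids[k]) ** 2
        for j, c in enumerate(centroids):
            if j != k:
                total += max(0.0, margin - math.dist(x, c)) ** 2
    return total / len(embeddings)


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (10.0, 0.0)]
    samples = [(1.0, 0.0), (9.5, 0.0)]
    print([nearest_centroid(s, centroids) for s in samples])
    print(pseudo_label_metric_loss(samples, centroids))
```

In a real training loop the centroids and embeddings would be learned jointly by a network; here they are fixed tuples so the arithmetic is easy to follow.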
Application of Self-Supervised Learning to MICA Model for Reconstructing Imperfect 3D Facial Structures
In this study, we emphasize the integration of a pre-trained MICA model with
an imperfect face dataset, employing a self-supervised learning approach. We
present an innovative method for regenerating flawed facial structures,
yielding 3D printable outputs that effectively support physicians in their
patient treatment process. Our results highlight the model's capacity for
concealing scars and achieving comprehensive facial reconstructions without
discernible scarring. By capitalizing on pre-trained models and necessitating
only a few hours of supplementary training, our methodology adeptly devises an
optimal model for reconstructing damaged and imperfect facial features.
Harnessing contemporary 3D printing technology, we institute a standardized
protocol for fabricating realistic, camouflaging mask models for patients in a
laboratory environment.
Glenn Research Center Quantum Communicator Receiver Design and Development
We investigate, design, and develop a prototype real-time synchronous receiver for the second-generation quantum communicator recently developed at the National Aeronautics and Space Administration (NASA) Glenn Research Center. This communication system exploits the temporal coincidences between simultaneously fired low-power laser sources to communicate at power levels several orders of magnitude less than what is currently achievable through classical means, with the ultimate goal of creating ultra-low-power microsize optical communications and sensing devices. The proposed receiver uses a unique adaptation of the early-late gate method for symbol synchronization and a newly identified 31-bit synchronization word for frame synchronization. This receiver, implemented in a field-programmable gate array (FPGA), also provides a number of significant additional features over the existing non-real-time experimental receiver, such as real-time bit error rate (BER) statistics collection and display, and recovery and display of embedded textual information. It also exhibits an indefinite run time and statistics collection. (c) 2009 Society of Photo-Optical Instrumentation Engineers
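To make the frame synchronization and BER statistics mentioned above more concrete, here is a minimal Python sketch that searches a bit stream for a 31-bit synchronization word and computes a bit error rate. It is a software illustration under stated assumptions, not the FPGA design: the `SYNC_WORD` pattern is an arbitrary placeholder (the abstract does not give the actual word), and `find_sync`, `bit_error_rate`, and `max_errors` are hypothetical names. The early-late gate symbol synchronizer is not modeled.

```python
# Placeholder 31-bit pattern; NOT the synchronization word identified by the authors.
SYNC_WORD = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1,
             0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]


def find_sync(bits, sync_word, max_errors=0):
    """Return the first offset where sync_word matches bits with at most
    max_errors mismatched positions, or -1 if no such offset exists."""
    n = len(sync_word)
    for i in range(len(bits) - n + 1):
        errors = sum(b != s for b, s in zip(bits[i:i + n], sync_word))
        if errors <= max_errors:
            return i
    return -1


def bit_error_rate(received, expected):
    """Fraction of positions where received differs from expected."""
    if len(received) != len(expected):
        raise ValueError("sequences must have equal length")
    if not expected:
        return 0.0
    return sum(r != e for r, e in zip(received, expected)) / len(expected)


if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    stream = [0] * 7 + SYNC_WORD + payload
    start = find_sync(stream, SYNC_WORD)
    frame = stream[start + len(SYNC_WORD):]
    print(start, bit_error_rate(frame, payload))
```

Allowing a small `max_errors` trades a higher false-lock probability for tolerance to channel errors inside the sync word itself.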
Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments
We present a novel method for few-shot video classification, which performs
appearance and temporal alignments. In particular, given a pair of query and
support videos, we conduct appearance alignment via frame-level feature
matching to obtain the appearance similarity score between the videos, while
utilizing temporal order-preserving priors for obtaining the temporal
similarity score between the videos. Moreover, we introduce a few-shot video
classification framework that leverages the above appearance and temporal
similarity scores across multiple steps, namely prototype-based training and
testing as well as inductive and transductive prototype refinement. To the best
of our knowledge, our work is the first to explore transductive few-shot video
classification. Extensive experiments on both Kinetics and Something-Something
V2 datasets show that both appearance and temporal alignments are crucial for
datasets with temporal order sensitivity such as Something-Something V2. Our
approach achieves similar or better results than previous methods on both
datasets. Our code is available at https://github.com/VinAIResearch/fsvc-ata.
Comment: Accepted to ECCV 202
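One plausible reading of "appearance alignment via frame-level feature matching" can be sketched in Python as follows: each query frame is matched to its most similar support frame by cosine similarity, and the matches are averaged into a single score. The abstract does not specify the matching rule, so the max-then-mean scheme and the names `cosine` and `appearance_similarity` are assumptions; the temporal order-preserving score is not modeled, and the authors' actual implementation is in the repository linked above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def appearance_similarity(query_frames, support_frames):
    """Average, over query frames, of the best cosine match among support frames.

    Assumed scoring rule for illustration only; frame order is ignored.
    """
    best_matches = [
        max(cosine(q, s) for s in support_frames) for q in query_frames
    ]
    return sum(best_matches) / len(best_matches)


if __name__ == "__main__":
    query = [(1.0, 0.0), (0.0, 1.0)]
    support = [(0.0, 2.0), (3.0, 0.0)]
    print(appearance_similarity(query, support))
```

Because the score ignores frame order, two videos showing the same frames in reverse would score identically, which is the gap a separate temporal score is meant to cover.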
Addressing Non-IID Problem in Federated Autonomous Driving with Contrastive Divergence Loss
Federated learning has been widely applied in autonomous driving since it
enables training a learning model among vehicles without sharing users' data.
However, data from autonomous vehicles usually suffer from the
non-independent-and-identically-distributed (non-IID) problem, which may cause
negative effects on the convergence of the learning process. In this paper, we
propose a new contrastive divergence loss to address the non-IID problem in
autonomous driving by reducing the impact of divergence factors from
transmitted models during the local learning process of each silo. We also
analyze the effects of contrastive divergence in various autonomous driving
scenarios, under multiple network infrastructures, and with different
centralized/distributed learning schemes. Our intensive experiments on three
datasets demonstrate that our proposed contrastive divergence loss further
improves the performance over current state-of-the-art approaches.
Overcoming Data Limitation in Medical Visual Question Answering
Traditional approaches for Visual Question Answering (VQA) require a large amount of labeled data for training. Unfortunately, such large-scale data is usually not available for the medical domain. In this paper, we propose a novel medical VQA framework that overcomes the labeled data limitation. The proposed framework explores the use of the unsupervised Denoising Auto-Encoder (DAE) and supervised Meta-Learning. The advantage of DAE is that it leverages the large amount of unlabeled images, while the advantage of Meta-Learning is that it learns meta-weights that quickly adapt to the VQA problem with limited labeled data. Combining the advantages of these techniques allows the proposed framework to be trained efficiently using a small labeled training set. The experimental results show that our proposed method significantly outperforms state-of-the-art medical VQA methods. The source code is available at https://github.com/aioz-ai/MICCAI19-MedVQA