8 research outputs found
Semi-supervised on-device neural network adaptation for remote and portable laser-induced breakdown spectroscopy
Laser-induced breakdown spectroscopy (LIBS) is a popular, fast elemental
analysis technique used to determine the chemical composition of target
samples, such as in industrial analysis of metals or in space exploration.
Recently, there has been a rise in the use of machine learning (ML) techniques
for LIBS data processing. However, ML for LIBS is challenging as: (i) the
predictive models must be lightweight since they need to be deployed in highly
resource-constrained and battery-operated portable LIBS systems; and (ii) since
these systems can be remote, the models must be able to self-adapt to any
domain shift in input distributions which could be due to the lack of different
types of inputs in training data or dynamic environmental/sensor noise. This
on-device retraining of the model must be not only fast but also unsupervised due
to the absence of new labeled data in remote LIBS systems. We introduce a
lightweight multi-layer perceptron (MLP) model for LIBS that can be adapted
on-device without requiring labels for new input data. It achieves 89.3% average
accuracy during data streaming, and up to 2.1% higher accuracy than an MLP model
that does not support adaptation. Finally, we also characterize the inference
and retraining performance of our model on a Google Pixel 2 phone.
Comment: Accepted in the On-Device Intelligence Workshop (held in conjunction with the MLSys Conference), 202
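The abstract leaves the adaptation rule unspecified; one common label-free scheme for this setting is test-time entropy minimization, sketched below on a toy NumPy MLP. The bias-only update, class/feature sizes, and all names are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class TinyMLP:
    """Toy one-hidden-layer MLP; only the output bias adapts at test time."""
    def __init__(self, d_in, d_hid, n_cls, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.5, (d_in, d_hid))
        self.b1 = np.zeros(d_hid)
        self.W2 = rng.normal(0, 0.5, (d_hid, n_cls))
        self.b2 = np.zeros(n_cls)

    def forward(self, X):
        h = np.maximum(0.0, X @ self.W1 + self.b1)  # ReLU hidden layer
        return softmax(h @ self.W2 + self.b2)

    def adapt(self, X, lr=0.05, steps=10):
        """Unsupervised adaptation: descend on mean prediction entropy."""
        for _ in range(steps):
            p = self.forward(X)
            logp = np.log(p + 1e-12)
            H = -(p * logp).sum(axis=1, keepdims=True)
            # dH/dz_j = -p_j (log p_j + H) for softmax logits z
            grad_z = -p * (logp + H)
            self.b2 -= lr * grad_z.mean(axis=0)
        return self

def mean_entropy(model, X):
    p = model.forward(X)
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())
```

Adapting only a small parameter subset keeps the retraining cost compatible with a battery-operated device, which is the constraint the paper emphasizes.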
Curriculum Manager for Source Selection in Multi-Source Domain Adaptation
The performance of Multi-Source Unsupervised Domain Adaptation depends
significantly on the effectiveness of transfer from labeled source domain
samples. In this paper, we propose an adversarial agent that learns a dynamic
curriculum for source samples, called Curriculum Manager for Source Selection
(CMSS). The Curriculum Manager, an independent network module, constantly
updates the curriculum during training, and iteratively learns which domains or
samples are best suited for aligning to the target. The intuition behind this
is to force the Curriculum Manager to constantly re-measure the transferability
of latent domains over time to adversarially raise the error rate of the domain
discriminator. CMSS does not require any knowledge of the domain labels, yet it
outperforms other methods on four well-known benchmarks by significant margins.
We also provide interpretable results that shed light on the proposed method.
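The abstract describes per-sample source weighting driven by a domain discriminator; below is a minimal illustrative weighting rule in that spirit. The softmax-over-negative-logits form is an assumption for the sketch, not the paper's CMSS module:

```python
import numpy as np

def curriculum_weights(disc_logits, temperature=1.0):
    """
    Hypothetical curriculum: source samples the domain discriminator
    scores as target-like (low logit = low P(source)) get higher weight.
    disc_logits: (n,) logits, higher = more source-like.
    Returns weights that sum to n, so a weighted loss keeps its scale.
    """
    z = -np.asarray(disc_logits, dtype=float) / temperature
    z -= z.max()  # numerical stability before exponentiation
    w = np.exp(z)
    return w / w.sum() * len(w)
```

Up-weighting the samples that already confuse the discriminator is one way to "adversarially raise the error rate of the domain discriminator," as the abstract puts it.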
Unsupervised Calibration under Covariate Shift
A probabilistic model is said to be calibrated if its predicted probabilities
match the corresponding empirical frequencies. Calibration is important for
uncertainty quantification and decision making in safety-critical applications.
While calibration of classifiers has been widely studied, we find that
calibration is brittle and can be easily lost under minimal covariate shifts.
Existing techniques, including domain adaptation ones, primarily focus on
prediction accuracy and guarantee calibration neither in theory nor in
practice. In this work, we formally introduce the problem of calibration under
domain shift, and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets
and synthetic datasets.
Comment: Submitted to the Conference on Uncertainty in Artificial Intelligence (UAI 2020)
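The importance-sampling idea can be made concrete: reweight source-domain calibration statistics by the density ratio w(x) = p_target(x)/p_source(x). Below is a sketch of a weighted expected calibration error; the equal-width binning and the function name are illustrative choices, not the paper's exact estimator:

```python
import numpy as np

def weighted_ece(conf, correct, weights, n_bins=10):
    """Expected calibration error with importance weights w(x)=p_t(x)/p_s(x)."""
    conf = np.asarray(conf, float)
    correct = np.asarray(correct, float)
    w = np.asarray(weights, float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece, total = 0.0, w.sum()
    for i in range(n_bins):
        lo, hi = edges[i], edges[i + 1]
        # include the left edge only in the first bin
        m = (conf > lo) & (conf <= hi) if i > 0 else (conf >= lo) & (conf <= hi)
        if not m.any():
            continue
        wm = w[m]
        acc = (wm * correct[m]).sum() / wm.sum()
        avg_conf = (wm * conf[m]).sum() / wm.sum()
        ece += wm.sum() / total * abs(acc - avg_conf)
    return float(ece)
```

With uniform weights this reduces to the standard ECE; non-uniform weights re-estimate calibration as it would appear under the shifted target covariate distribution.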
Multi-Source Unsupervised Hyperparameter Optimization
How can we conduct efficient hyperparameter optimization for a completely new
task? In this work, we consider a novel setting, where we search for the
optimal hyperparameters for a target task of interest using only unlabeled
target task and somewhat relevant source task datasets. In this setting, it is
essential to estimate the ground-truth target task objective using only the
available information. We propose estimators to unbiasedly approximate the
ground-truth with a desirable variance property. Building on these estimators,
we provide a general and tractable hyperparameter optimization procedure for
our setting. The experimental evaluations demonstrate that the proposed
framework broadens the applications of automated hyperparameter optimization.
Comment: equal contribution
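The core identity behind such estimators is E_target[loss] = E_source[w(x) · loss(x)] with w(x) = p_t(x)/p_s(x); the paper's variance-controlled combination of sources is not reproduced here. A minimal sketch under these assumptions, with a plain average over sources:

```python
import numpy as np

def target_objective_estimate(losses, density_ratios):
    """
    Importance-weighted estimate of the target-task objective from one
    labeled source dataset: mean of w(x) * loss(x). Unbiased when the
    density ratios w(x) = p_target(x)/p_source(x) are exact.
    """
    l = np.asarray(losses, float)
    w = np.asarray(density_ratios, float)
    return float((w * l).mean())

def multi_source_estimate(per_source):
    """
    per_source: list of (losses, ratios) pairs, one per source dataset.
    A simple average; a variance-minimizing weighting (as in the paper)
    would weight sources by the inverse variance of their estimates.
    """
    return float(np.mean([target_objective_estimate(l, w) for l, w in per_source]))
```

Hyperparameter search then minimizes this estimate in place of the unobservable target-task loss.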
Cross-regional oil palm tree counting and detection via multi-level attention domain adaptation network
Providing an accurate evaluation of palm tree plantation in a large region
can bring meaningful impacts in both economic and ecological aspects. However,
the enormous spatial scale and the variety of geological features across
regions have made it a grand challenge with limited solutions based on manual
human monitoring efforts. Although deep learning based algorithms have
demonstrated potential in forming an automated approach in recent years, the
labelling efforts needed for covering different features in different regions
largely constrain its effectiveness in large-scale problems. In this paper, we
propose a novel domain-adaptive oil palm tree detection method, the
Multi-level Attention Domain Adaptation Network (MADAN), for cross-regional
oil palm tree counting and detection. MADAN consists of four procedures: First, we
adopted a batch-instance normalization network (BIN) based feature extractor
for improving the generalization ability of the model, integrating batch
normalization and instance normalization. Second, we embedded a multi-level
attention mechanism (MLA) into our architecture for enhancing the
transferability, including a feature level attention and an entropy level
attention. Then we designed a minimum entropy regularization (MER) to increase
the confidence of the classifier predictions through assigning the entropy
level attention value to the entropy penalty. Finally, we employed a sliding
window-based prediction and an IoU-based post-processing approach to attain the
final detection results. We conducted comprehensive ablation experiments using
three different satellite images of large-scale oil palm plantation area with
six transfer tasks. MADAN improves the detection accuracy by 14.98% in terms of
average F1-score compared with the Baseline method (without DA), and performs
3.55%-14.49% better than existing domain adaptation methods.
Comment: 39 pages, 13 figures, accepted by ISPRS PG&R
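The final step, IoU-based post-processing of overlapping sliding-window detections, typically reduces to greedy non-maximum suppression. A standard sketch follows; the threshold and exact merging rule are assumptions, not necessarily the paper's procedure:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while len(order):
        i = order[0]
        keep.append(int(i))
        # drop every remaining box that overlaps the kept one too much
        order = np.array([j for j in order[1:]
                          if iou(boxes[i], boxes[j]) < iou_thresh])
    return keep
```

For tree counting, the length of the kept-index list is the per-image count after duplicates from overlapping windows are suppressed.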
Discriminative Feature Alignment: Improving Transferability of Unsupervised Domain Adaptation by Gaussian-guided Latent Alignment
In this study, we focus on the unsupervised domain adaptation problem where
an approximate inference model is to be learned from a labeled data domain and
expected to generalize well to an unlabeled data domain. The success of
unsupervised domain adaptation largely relies on the cross-domain feature
alignment. Previous work has attempted to directly align latent features by the
classifier-induced discrepancies. Nevertheless, a common feature space cannot
always be learned via this direct feature alignment especially when a large
domain gap exists. To solve this problem, we introduce a Gaussian-guided latent
alignment approach to align the latent feature distributions of the two domains
under the guidance of the prior distribution. In such an indirect way, the
distributions over the samples from the two domains will be constructed on a
common feature space, i.e., the space of the prior, which promotes better
feature alignment. To effectively align the target latent distribution with
this prior distribution, we also propose a novel unpaired L1-distance by taking
advantage of the formulation of the encoder-decoder. Extensive evaluations on
nine benchmark datasets validate the superior knowledge transferability of the
proposed method, which outperforms state-of-the-art approaches, and its
versatility in significantly improving on existing work.
Comment: 14 pages, 11 figures
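One concrete way to pull latents toward a Gaussian prior is a moment-matched KL term, paired with a sorted-coordinate surrogate for an unpaired L1 distance. Both forms below are illustrative stand-ins, not the paper's exact losses:

```python
import numpy as np

def gaussian_prior_kl(Z):
    """
    KL( N(mu, diag(var)) || N(0, I) ) for the empirical per-dimension
    moments of a batch of latent codes Z with shape (n, d). Driving this
    to zero for both domains places them on a common prior-shaped space.
    """
    mu = Z.mean(axis=0)
    var = Z.var(axis=0) + 1e-8
    return float(0.5 * np.sum(var + mu**2 - 1.0 - np.log(var)))

def unpaired_l1(Za, Zb):
    """
    Crude unpaired L1 surrogate: since samples have no correspondence,
    compare per-dimension sorted coordinates (a 1-D Wasserstein flavor).
    Assumes equal batch sizes.
    """
    return float(np.abs(np.sort(Za, axis=0) - np.sort(Zb, axis=0)).mean())
```

The indirection the abstract describes is visible here: neither term compares source features to target features directly; both are anchored to the shared prior or to order statistics.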
Learning to Match Distributions for Domain Adaptation
When the training and test data are from different distributions, domain
adaptation is needed to reduce dataset bias to improve the model's
generalization ability. Since it is difficult to directly match the
cross-domain joint distributions, existing methods tend to reduce the marginal
or conditional distribution divergence using predefined distances such as MMD
and adversarial-based discrepancies. However, it remains challenging to
determine which method is suitable for a given application, since each is built
with certain priors or biases. Thus, they may fail to uncover the underlying
relationship between transferable features and joint distributions. This paper
proposes Learning to Match (L2M) to automatically learn the cross-domain
distribution matching without relying on hand-crafted priors on the matching
loss. Instead, L2M reduces the inductive bias by using a meta-network to learn
the distribution matching loss in a data-driven way. L2M is a general framework
that unifies task-independent and human-designed matching features. We design a
novel optimization algorithm for this challenging objective with
self-supervised label propagation. Experiments on public datasets substantiate
the superiority of L2M over SOTA methods. Moreover, we apply L2M to transfer
from pneumonia to COVID-19 chest X-ray images with remarkable performance. L2M
can also be extended in other distribution matching applications where we show
in a trial experiment that L2M generates more realistic and sharper MNIST
samples.
Comment: Preprint. 20 pages. Code available at
https://github.com/jindongwang/transferlearning/tree/master/code/deep/Learning-to-Matc
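L2M's learned, meta-network matching loss is beyond a short sketch, but the predefined MMD distance the abstract contrasts it against is compact enough to show. A biased RBF-kernel estimator of squared MMD (kernel bandwidth is an arbitrary choice here):

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """
    Biased estimate of squared maximum mean discrepancy between samples
    X (n, d) and Y (m, d) with RBF kernel k(x, y) = exp(-gamma ||x-y||^2).
    """
    def k(A, B):
        aa = (A**2).sum(1)[:, None]   # squared norms of rows of A
        bb = (B**2).sum(1)[None, :]   # squared norms of rows of B
        return np.exp(-gamma * (aa + bb - 2 * A @ B.T))
    return float(k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean())
```

The fixed kernel is exactly the kind of hand-crafted prior the paper argues against: a meta-network replaces this closed form with a loss learned from the data itself.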
Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge
Selecting causal inference models for estimating individualized treatment
effects (ITE) from observational data presents a unique challenge since the
counterfactual outcomes are never observed. The problem is further complicated
in the unsupervised domain adaptation (UDA) setting where we only have access
to labeled samples in the source domain, but wish to select a model that
achieves good performance on a target domain for which only unlabeled samples
are available. Existing techniques for UDA model selection are designed for the
predictive setting. These methods examine discriminative density ratios between
the input covariates in the source and target domain and do not factor in the
model's predictions in the target domain. Because of this, two models with
identical performance on the source domain would receive the same risk score
from existing methods, yet in reality have significantly different performance
on the target domain. We leverage the invariance of causal structures across domains
to propose a novel model selection metric specifically designed for ITE methods
under the UDA setting. In particular, we propose selecting models whose
predictions of interventions' effects satisfy known causal structures in the
target domain. Experimentally, our method selects ITE models that are more
robust to covariate shifts on several healthcare datasets, including estimating
the effect of ventilation in COVID-19 patients from different geographic
locations.
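The selection idea can be illustrated with a toy metric: score each candidate ITE model on unlabeled target data by how badly its predicted effects violate a known causal constraint. The single sign constraint below is a deliberately simple stand-in; the paper's method uses richer causal-structure tests:

```python
import numpy as np

def causal_violation_score(pred_ite, known_sign=+1):
    """
    Illustrative metric: mean magnitude by which predicted treatment
    effects contradict a known effect sign on the target domain
    (here assumed non-negative). Lower is better; needs no target labels.
    """
    ite = np.asarray(pred_ite, float)
    return float(np.mean(np.maximum(0.0, -known_sign * ite)))

def select_model(candidates):
    """candidates: dict of model name -> predicted ITEs on unlabeled target data."""
    scores = {name: causal_violation_score(ite) for name, ite in candidates.items()}
    return min(scores, key=scores.get)
```

Unlike density-ratio-based UDA selection, this score depends on each model's target-domain predictions, so two models with identical source performance can receive different scores.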