Analysing acoustic model changes for active learning in automatic speech recognition
In active learning for Automatic Speech Recognition
(ASR), a portion of data is automatically selected for manual
transcription. The objective is to improve ASR performance with
retrained acoustic models. The standard approaches are based
on confidence of individual sentences. In this study, we look
into an alternative view on transcript label quality, in which
Gaussian Supervector Distance (GSD) is used as a criterion
for data selection. GSD is a metric which quantifies how the
model was changed during its adaptation. By using an automatic
speech recognition transcript derived from an out-of-domain
acoustic model, unsupervised adaptation was conducted and GSD
was computed. The adapted model is then applied to an audio
book transcription task. It is found that GSD provides hints for
predicting data transcription quality. A preliminary attempt in
active learning proves the effectiveness of the GSD selection criterion
over random selection, shedding light on its prospective use
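A minimal sketch of the idea of measuring how far a model moved during adaptation. The abstract does not give the GSD formula, so the unweighted Euclidean distance between stacked Gaussian mean vectors used here is an assumption, and the function names and toy values are hypothetical; a real implementation may also weight by covariances or mixture weights.

```python
import math


def supervector(means):
    """Concatenate per-Gaussian mean vectors into one flat supervector."""
    return [x for mean in means for x in mean]


def gaussian_supervector_distance(means_before, means_after):
    """Euclidean distance between supervectors (assumed form, not the paper's)."""
    a = supervector(means_before)
    b = supervector(means_after)
    if len(a) != len(b):
        raise ValueError("models must have the same number of parameters")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy models with two 2-dimensional Gaussians each (made-up values).
baseline = [[0.0, 0.0], [1.0, 1.0]]
adapted_small = [[0.0, 0.1], [1.0, 1.0]]  # adaptation barely moved the model
adapted_large = [[3.0, 4.0], [1.0, 1.0]]  # adaptation moved the model a lot

small_change = gaussian_supervector_distance(baseline, adapted_small)
large_change = gaussian_supervector_distance(baseline, adapted_large)
```

Under this assumption, a data subset whose unsupervised adaptation yields a larger distance has changed the model more, which is the quantity the abstract proposes as a selection signal.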
Active Multi-Kernel Domain Adaptation for Hyperspectral Image Classification
Recent years have witnessed rapid progress in hyperspectral image
(HSI) classification. Most existing studies either rely heavily on
expensive label information through supervised learning or can hardly exploit
the discriminative information borrowed from related domains. To address these
issues, in this paper we present a novel framework for HSI classification
based on domain adaptation (DA) with active learning (AL). The main idea of
our method is to retrain the multi-kernel classifier by utilizing the available
labeled samples from the source domain, and adding a minimum number of the most
informative samples with active queries in the target domain. The proposed
method adaptively combines multiple kernels, forming a DA classifier that
minimizes the bias between the source and target domains. Further equipped with
a nested active updating process, it sequentially expands the training set
and gradually converges to a satisfactory level of classification performance. We
study this active adaptation framework with the Margin Sampling (MS) strategy
in the HSI classification task. Our experimental results on two popular HSI
datasets demonstrate its effectiveness
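As a rough illustration of the active query step, the sketch below ranks unlabeled target samples by margin and queries the most ambiguous ones. The abstract does not say which Margin Sampling variant is used; this assumes one common multiclass form, the gap between the two highest class scores. The function names and scores are hypothetical.

```python
def margin(scores):
    """Gap between the two largest class scores for one sample."""
    top_two = sorted(scores, reverse=True)[:2]
    return top_two[0] - top_two[1]


def margin_sampling(score_matrix, n_queries):
    """Return indices of the n_queries samples with the smallest margins."""
    ranked = sorted(range(len(score_matrix)), key=lambda i: margin(score_matrix[i]))
    return ranked[:n_queries]


# Made-up classifier scores for four unlabeled target-domain samples.
target_scores = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # ambiguous
    [0.50, 0.30, 0.20],  # moderately confident
    [0.34, 0.33, 0.33],  # most ambiguous, smallest margin
]

queries = margin_sampling(target_scores, 2)
```

The selected indices would then be labeled and added to the training set before the classifier is retrained, repeating until the labeling budget is spent.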
Self-adjustable domain adaptation in personalized ECG monitoring integrated with IR-UWB radar
To enhance personalized detection in electrocardiogram (ECG) monitoring systems, deep neural networks (DNNs) are applied to overcome individual differences by periodic retraining. As introduced previously [4], DNNs relieve individual differences by fusing ECG with impulse radio ultra-wide band (IR-UWB) radar. However, such a DNN-based ECG monitoring system tends to overfit to small personal datasets and is difficult to generalize to newly collected unlabeled data. This paper proposes a self-adjustable domain adaptation (SADA) strategy to prevent overfitting and exploit unlabeled data. Firstly, this paper enlarges the database of ECG and radar data with actual records acquired from 28 testers and expanded by data augmentation. Secondly, to utilize unlabeled data, SADA combines self-organizing maps with transfer learning in predicting labels. Thirdly, SADA integrates one-class classification with domain adaptation algorithms to reduce overfitting. Based on our enlarged database and standard databases, a large dataset of 73200 records and a small one of 1849 records are built up to verify our proposal. Results show SADA's effectiveness in predicting labels and an increase in the sensitivity of DNNs of 14.4% compared with existing domain adaptation algorithms
Learning Sampling Policies for Domain Adaptation
We address the problem of semi-supervised domain adaptation of classification
algorithms through deep Q-learning. The core idea is to consider the
predictions of a source domain network on target domain data as noisy labels,
and learn a policy to sample from this data so as to maximize classification
accuracy on a small annotated reward partition of the target domain. Our
experiments show that learned sampling policies construct labeled sets that
improve accuracies of visual classifiers over baselines
Transfer Learning in Astronomy: A New Machine-Learning Paradigm
The widespread dissemination of machine learning tools in science,
particularly in astronomy, has revealed the limitation of working with simple
single-task scenarios in which any task in need of a predictive model is looked
at in isolation, ignoring the existence of other similar tasks. In contrast, a
new generation of techniques is emerging where predictive models can take
advantage of previous experience to leverage information from similar tasks.
The new emerging area is referred to as transfer learning. In this paper, I
briefly describe the motivation behind the use of transfer learning techniques,
and explain how such techniques can be used to solve popular problems in
astronomy. As an example, a prevalent problem in astronomy is to estimate the
class of an object (e.g., Supernova Ia) using a generation of photometric
light-curve datasets where data abounds, but class labels are scarce; such
analysis can benefit from spectroscopic data where class labels are known with
high confidence, but the data sample is small. Transfer learning provides a
robust and practical solution to leverage information from one domain to
improve the accuracy of a model built on a different domain. In the example
above, transfer learning would look to overcome the difficulty in the
compatibility of models between spectroscopic data and photometric data, since
data properties such as size, class priors, and underlying distributions, are
all expected to be significantly different
Sim2Real View Invariant Visual Servoing by Recurrent Control
Humans are remarkably proficient at controlling their limbs and tools from a
wide range of viewpoints and angles, even in the presence of optical
distortions. In robotics, this ability is referred to as visual servoing:
moving a tool or end-point to a desired location using primarily visual
feedback. In this paper, we study how viewpoint-invariant visual servoing
skills can be learned automatically in a robotic manipulation scenario. To this
end, we train a deep recurrent controller that can automatically determine
which actions move the end-point of a robotic arm to a desired object. The
problem that must be solved by this controller is fundamentally ambiguous:
under severe variation in viewpoint, it may be impossible to determine the
actions in a single feedforward operation. Instead, our visual servoing system
must use its memory of past movements to understand how the actions affect the
robot motion from the current viewpoint, correcting mistakes and gradually
moving closer to the target. This ability is in stark contrast to most visual
servoing methods, which either assume known dynamics or require a calibration
phase. We show how we can learn this recurrent controller using simulated data
and a reinforcement learning objective. We then describe how the resulting
model can be transferred to a real-world robot by disentangling perception from
control and only adapting the visual layers. The adapted model can servo to
previously unseen objects from novel viewpoints on a real-world Kuka IIWA
robotic arm. For supplementary videos, see:
https://fsadeghi.github.io/Sim2RealViewInvariantServo
Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies
How much does having visual priors about the world (e.g. the fact that the
world is 3D) assist in learning to perform downstream motor tasks (e.g.
delivering a package)? We study this question by integrating a generic
perceptual skill set (e.g. a distance estimator, an edge detector, etc.) within
a reinforcement learning framework--see Figure 1. This skill set (hereafter
mid-level perception) provides the policy with a more processed state of the
world compared to raw images.
We find that using a mid-level perception confers significant advantages over
training end-to-end from scratch (i.e. not leveraging priors) in
navigation-oriented tasks. Agents are able to generalize to situations where
the from-scratch approach fails and training becomes significantly more sample
efficient. However, we show that realizing these gains requires careful
selection of the mid-level perceptual skills. Therefore, we refine our findings
into an efficient max-coverage feature set that can be adopted in lieu of raw
images. We perform our study in completely separate buildings for training and
testing and compare against visually blind baseline policies and
state-of-the-art feature learning methods.
Comment: See project website, demos, and code at http://perceptual.acto
Domain Adaptations for Computer Vision Applications
A basic assumption of statistical learning theory is that train and test data
are drawn from the same underlying distribution. Unfortunately, this assumption
doesn't hold in many applications. Instead, ample labeled data might exist in a
particular `source' domain while inference is needed in another, `target'
domain. Domain adaptation methods leverage labeled data from both domains to
improve classification on unseen data in the target domain. In this work we
survey domain transfer learning methods for various application domains with
focus on recent work in Computer Vision
When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications
Many studies in biomedical and health sciences involve small sample sizes due
to logistic or financial constraints. Often, identifying weak (but
scientifically interesting) associations between a set of predictors and a
response necessitates pooling datasets from multiple diverse labs or groups.
While there is a rich literature in statistical machine learning to address
distributional shifts and inference in multi-site datasets, it is less clear
when such pooling is guaranteed to help (and when it does not),
independent of the inference algorithms we use. In this paper, we present a
hypothesis test to answer this question, both for classical and high
dimensional linear regression. We precisely identify regimes where pooling
datasets across multiple sites is sensible, and how such policy decisions can
be made via simple checks executable on each site before any data transfer ever
happens. With a focus on Alzheimer's disease studies, we present empirical
results showing that in regimes suggested by our analysis, pooling a local
dataset with data from an international study improves power.
Comment: 34th International Conference on Machine Learnin
Image Generation From Small Datasets via Batch Statistics Adaptation
Thanks to the recent development of deep generative models, it is becoming
possible to generate high-quality images with both fidelity and diversity.
However, the training of such generative models requires a large dataset. To
reduce the amount of data required, we propose a new method for transferring
prior knowledge of the pre-trained generator, which is trained with a large
dataset, to a small dataset in a different domain. Using such prior knowledge,
the model can generate images leveraging some common sense that cannot be
acquired from a small dataset. In this work, we propose a novel method focusing
on the parameters for batch statistics, scale and shift, of the hidden layers
in the generator. By training only these parameters in a supervised manner, we
achieved stable training of the generator, and our method can generate higher
quality images compared to previous methods without collapsing, even when the
dataset is small (~100). Our results show that the diversity of the filters
acquired in the pre-trained generator is important for the performance on the
target domain. Our method makes it possible to add a new class or domain to a
pre-trained generator without disturbing the performance on the original
domain.
Comment: ICCV 201
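A toy sketch of the core idea of updating only scale and shift parameters while pre-trained features stay frozen. This is not the paper's generator or loss: the squared-error objective, the plain gradient descent loop, the function name `adapt_scale_shift`, and the data are all assumptions made for illustration.

```python
def adapt_scale_shift(features, targets, steps=500, lr=0.1):
    """Fit per-channel scale (gamma) and shift (beta) on frozen features.

    Only gamma and beta are updated; the features themselves, standing in
    for the output of frozen pre-trained layers, are never modified.
    """
    n = len(features)
    n_channels = len(features[0])
    gamma = [1.0] * n_channels
    beta = [0.0] * n_channels
    for _ in range(steps):
        grad_g = [0.0] * n_channels
        grad_b = [0.0] * n_channels
        for h, t in zip(features, targets):
            for c in range(n_channels):
                err = gamma[c] * h[c] + beta[c] - t[c]
                grad_g[c] += 2.0 * err * h[c] / n  # d(mean sq. error)/d gamma
                grad_b[c] += 2.0 * err / n         # d(mean sq. error)/d beta
        gamma = [g - lr * d for g, d in zip(gamma, grad_g)]
        beta = [b - lr * d for b, d in zip(beta, grad_b)]
    return gamma, beta


# Frozen two-channel features for three samples (made-up values).
frozen = [[-1.0, 0.5], [0.0, -0.5], [1.0, 0.0]]
# Targets built with scale/shift (2, 1) on channel 0 and (0.5, -1) on channel 1.
wanted = [[2.0 * h0 + 1.0, 0.5 * h1 - 1.0] for h0, h1 in frozen]

gamma, beta = adapt_scale_shift(frozen, wanted)
```

Because only two numbers per channel are learned, the number of trainable parameters stays very small, which is consistent with the abstract's motivation of adapting from a small dataset.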