582 research outputs found
Deep Variational Transfer: Transfer Learning through Semi-supervised Deep Generative Models
In real-world applications, it is often expensive and time-consuming to obtain labeled examples. In such cases, knowledge transfer from related domains, where labels are abundant, can greatly reduce the need for extensive labeling efforts; this is where transfer learning comes in handy. In this paper, we propose Deep Variational Transfer (DVT), a variational autoencoder that transfers knowledge across domains using a shared latent Gaussian mixture model. Thanks to the combination of a semi-supervised ELBO and parameter sharing across domains, we are able to simultaneously: (i) align all supervised examples of the same class into the same latent Gaussian mixture component, independently of their domain; (ii) predict the class of unsupervised examples from different domains and use them to better model the occurring shifts. We perform tests on the MNIST and USPS digit datasets, showing DVT's ability to perform transfer learning across heterogeneous datasets. Additionally, we present DVT's top classification performance on the MNIST semi-supervised learning challenge. We further validate DVT on astronomical datasets, where it achieves state-of-the-art classification performance while transferring knowledge across real star survey datasets: EROS, MACHO, and HiTS. Even in the worst case, we double the achieved F1-score for rare classes. These experiments show DVT's ability to tackle all major challenges posed by transfer learning: different covariate distributions, different and highly imbalanced class distributions, and different feature spaces.
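The key quantity here is the semi-supervised ELBO with a class-structured Gaussian mixture prior. The abstract does not spell out the objective, so the following is a minimal NumPy sketch of one plausible form for a labeled example, assuming diagonal-Gaussian encoders and one mixture component per class; `encode`, `decode_log_lik`, and all variable names are illustrative placeholders, not the paper's API.

```python
import numpy as np

def gaussian_log_pdf(z, mu, log_var):
    """Log density of a diagonal Gaussian N(mu, diag(exp(log_var))) at z."""
    return -0.5 * np.sum(log_var + np.log(2 * np.pi)
                         + (z - mu) ** 2 / np.exp(log_var))

def labeled_elbo(x, y, encode, decode_log_lik, mix_mu, mix_log_var, n_classes):
    """Single-sample ELBO for a labeled example under a class-conditional
    Gaussian mixture prior:
        E_q[log p(x|z)] + log p(z|y) + log p(y) - log q(z|x).
    `encode` maps x to the mean/log-variance of q(z|x); `decode_log_lik`
    returns log p(x|z) under the shared decoder (both placeholders)."""
    mu, log_var = encode(x)
    # Reparameterized sample from q(z|x).
    z = mu + np.exp(0.5 * log_var) * np.random.randn(*mu.shape)
    log_px_z = decode_log_lik(x, z)
    log_pz_y = gaussian_log_pdf(z, mix_mu[y], mix_log_var[y])  # component for class y
    log_py = -np.log(n_classes)                                # uniform class prior
    log_qz_x = gaussian_log_pdf(z, mu, log_var)
    return log_px_z + log_pz_y + log_py - log_qz_x
```

For an unlabeled example, the same bound would be averaged over classes under a classifier q(y|x), which is how unsupervised examples from each domain can inform the shared mixture.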
An Empirical Comparison of Sampling Quality Metrics: A Case Study for Bayesian Nonnegative Matrix Factorization
In this work, we empirically explore the question: how can we assess the
quality of samples from some target distribution? We assume that the samples
are provided by some valid Monte Carlo procedure, so we are guaranteed that the
collection of samples will asymptotically approximate the true distribution.
Most current evaluation approaches focus on two questions: (1) Has the chain
mixed, that is, is it sampling from the distribution? and (2) How independent
are the samples (as MCMC procedures produce correlated samples)? Focusing on the case of Bayesian nonnegative matrix factorization, we empirically evaluate standard metrics of sampler quality and propose new metrics to capture
aspects that these measures fail to expose. The aspect of sampling that is of
particular interest to us is the ability (or inability) of sampling methods to
move between multiple optima in NMF problems. As a proxy, we propose and study a number of metrics that might quantify the diversity of a set of NMF factorizations obtained by a sampler, and thereby the coverage of the posterior distribution. We compare the performance of a number of standard sampling methods for NMF in terms of these new metrics.
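The abstract does not name the proposed diversity metrics, but one crude proxy in this spirit is the mean pairwise distance between posterior samples of the factor matrix W, after resolving NMF's permutation and scale ambiguities. A minimal sketch, with illustrative function names:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def _normalize_cols(W):
    """Remove NMF's scale ambiguity by normalizing each column of W."""
    return W / (np.linalg.norm(W, axis=0, keepdims=True) + 1e-12)

def aligned_distance(W_a, W_b):
    """Distance between two factor matrices, invariant to column permutation:
    match columns with the Hungarian algorithm, then take the residual norm."""
    A, B = _normalize_cols(W_a), _normalize_cols(W_b)
    cost = np.linalg.norm(A[:, :, None] - B[:, None, :], axis=0)  # (k, k) column distances
    rows, cols = linear_sum_assignment(cost)
    return np.linalg.norm(A[:, rows] - B[:, cols])

def sample_diversity(W_samples):
    """Mean pairwise aligned distance across posterior samples of W --
    a rough measure of how widely the sampler moves between NMF optima."""
    n = len(W_samples)
    dists = [aligned_distance(W_samples[i], W_samples[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))
```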
Weighted Tensor Decomposition for Learning Latent Variables with Partial Data
Tensor decomposition methods are popular tools for learning latent variables
given only lower-order moments of the data. However, the standard assumption is
that we have sufficient data to estimate these moments to high accuracy. In
this work, we consider the case in which certain dimensions of the data are not
always observed---common in applied settings, where not all measurements may be
taken for all observations---resulting in moment estimates of varying quality.
We derive a weighted tensor decomposition approach that is computationally as
efficient as the non-weighted approach, and demonstrate that it outperforms
methods that do not appropriately leverage these less-observed dimensions.
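The abstract does not give the weighting scheme, but the basic ingredient is clear: moment entries estimated from fewer observations deserve less weight. As a rough illustration for the second-order moment, assuming missingness is recorded in a binary mask (all names illustrative):

```python
import numpy as np

def weighted_second_moment(X, mask):
    """Estimate E[x x^T] from partially observed data.
    X:    (n, d) array, with arbitrary values where mask == 0.
    mask: (n, d) binary observation indicators.
    Returns the moment estimate M and per-entry observation counts, which can
    serve as confidence weights in a weighted decomposition loss such as
    || C * (M - sum_k lambda_k a_k a_k^T) ||_F^2 (elementwise product)."""
    Xm = X * mask
    counts = mask.T @ mask      # how many samples observed each pair (i, j)
    sums = Xm.T @ Xm            # sums over jointly observed pairs
    M = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return M, counts
```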
A general method for regularizing tensor decomposition methods via pseudo-data
Tensor decomposition methods allow us to learn the parameters of latent
variable models through decomposition of low-order moments of data. A
significant limitation of these algorithms is that there exists no general
method to regularize them, and in the past regularization has mostly been
performed using bespoke modifications to the algorithms, tailored for the
particular form of the desired regularizer. We present a general method of
regularizing tensor decomposition methods which can be used for any likelihood
model that is learnable using tensor decomposition methods and any
differentiable regularization function by supplementing the training data with
pseudo-data. The pseudo-data is optimized to balance two terms: being as close
as possible to the true data and enforcing the desired regularization. On
synthetic, semi-synthetic and real data, we demonstrate that our method can
improve inference accuracy and regularize for a broad range of goals including
transfer learning, sparsity, interpretability, and orthogonality of the learned
parameters.
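Concretely, the pseudo-data trades off exactly the two terms named above. A minimal sketch of the objective being optimized over the pseudo-points, assuming the decomposition pipeline is differentiable end-to-end; `fit_params` and `regularizer` are illustrative placeholders:

```python
import numpy as np

def pseudo_data_objective(pseudo, data, fit_params, regularizer, lam):
    """Objective for the pseudo-data: stay close to the true data while
    steering the learned parameters toward the desired regularizer.
    fit_params:  callable running tensor decomposition on the augmented
                 dataset and returning the learned parameters.
    regularizer: any differentiable penalty on those parameters.
    In practice this would be minimized over `pseudo` with autodiff."""
    augmented = np.concatenate([data, pseudo], axis=0)
    theta = fit_params(augmented)
    # Closeness term: squared distance from each pseudo-point to its
    # nearest true data point.
    sq_dists = np.sum((pseudo[:, None, :] - data[None, :, :]) ** 2, axis=2)
    closeness = np.sum(sq_dists.min(axis=1))
    return closeness + lam * regularizer(theta)
```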
Learning Qualitatively Diverse and Interpretable Rules for Classification
There has been growing interest in developing accurate models that can also
be explained to humans. Unfortunately, if there exist multiple distinct but
accurate models for some dataset, current machine learning methods are unlikely
to find them: standard techniques will likely recover a complex model that
combines them. In this work, we introduce a way to identify a maximal set of
distinct but accurate models for a dataset. We demonstrate empirically that, in
situations where the data supports multiple accurate classifiers, we tend to
recover simpler, more interpretable classifiers rather than more complex ones.
Comment: Presented at the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden (revision fixes minor issues).
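The abstract leaves the selection mechanism open; one crude reading of "a maximal set of distinct but accurate models" is a greedy filter that keeps a model only if it is accurate and disagrees sufficiently with everything kept so far. A sketch with illustrative thresholds:

```python
import numpy as np

def distinct_accurate_models(models, X, y, min_acc=0.9, min_disagree=0.1):
    """Greedily select models that are accurate on (X, y) yet pairwise
    distinct, measuring distinctness as the fraction of points on which two
    models' predictions differ. Thresholds and the greedy order are
    illustrative; the paper's actual criterion may differ."""
    kept, kept_preds = [], []
    for m in models:
        preds = m.predict(X)
        if np.mean(preds == y) < min_acc:
            continue                     # not accurate enough
        if all(np.mean(preds != p) >= min_disagree for p in kept_preds):
            kept.append(m)
            kept_preds.append(preds)
    return kept
```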
Analyzing the Utility of a Support Pin in Sequential Robotic Manipulation
Pick-and-place regrasp is an important manipulation skill for a robot. It
helps a robot accomplish tasks that cannot be achieved within a single grasp,
due to constraints such as kinematics or collisions between the robot and the
environment. Previous work on pick-and-place regrasp only leveraged flat
surfaces for intermediate placements, and thus is limited in the capability to
reorient an object.
In this paper, we extend the reorientation capability of a pick-and-place
regrasp by adding a vertical pin on the working surface and using it as the
intermediate location for regrasping. In particular, our method automatically
computes the stable placements of an object leaning against a vertical pin,
finds several force-closure grasps, generates a graph of regrasp actions, and
searches for the regrasp sequence. To compare the regrasping performance with
and without using pins, we evaluate the success rate and the length of regrasp
sequences while performing tasks on various models. Experiments on
reorientation and assembly tasks validate the benefit of using support pins for
regrasping.
Comment: 14 pages, 20 figures.
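The planning step described above reduces to search over a regrasp graph. A minimal breadth-first sketch, where nodes are (placement, grasp) pairs, including placements leaning against the support pin, and `feasible` stands in for the paper's kinematic and collision checks:

```python
from collections import deque

def shortest_regrasp_sequence(nodes, feasible, start, goal):
    """Breadth-first search over a regrasp graph. `feasible(a, b)` says
    whether one pick-and-place can move the object from configuration a
    to configuration b. Returns the shortest node sequence, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path                       # shortest regrasp sequence
        for nxt in nodes:
            if nxt not in visited and feasible(path[-1], nxt):
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None                               # goal unreachable
```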
Microstructure and magnetic anisotropy of electrospun CuZnFeO nanofibers: A local probe study
Understanding the phenomena at the nanometer scale is of fundamental
importance for future improvements of desired properties of nanomaterials. We
report a detailed investigation of the microstructure and the resulting
magnetic anisotropy by magnetic, transmission electron microscopy (TEM), and Mössbauer measurements of the electrospun CuZnFeO
nanofibers. Our results show that the electrospun CuZnFeO
nanofibers exhibit nearly isotropic magnetic anisotropy. TEM measurements
indicate that the nanofibers are composed of loosely connected and randomly
aligned nanograins. As revealed by the Henkel plot, these nanofibers and the
nanograins within the nanofibers are dipolar coupled, which reduces the
effective shape anisotropy leading to a nearly random configuration of the
magnetic moments inside the nanofibers; hence, the observed nearly isotropic magnetic anisotropy can be easily understood.
Comment: 5 pages, 5 figures; to be published in J. Phys. D: Appl. Phys.
Deep Learning Scooping Motion using Bilateral Teleoperations
We present a bilateral teleoperation system for task learning and robot motion generation. Our system includes a bilateral teleoperation platform and deep learning software. The deep learning software learns from human demonstrations performed through the bilateral teleoperation platform, which collects visual images and robot encoder values. It leverages these datasets of images and encoder information to learn the inter-modal correspondence between visual images and robot motion. In detail, the deep learning software uses a combination of Deep Convolutional Auto-Encoders (DCAE) over image regions and a Recurrent Neural Network with Long Short-Term Memory units (LSTM-RNN) over robot motor angles to learn motion taught by human teleoperation. The learnt models are used to predict new motion trajectories for similar tasks. Experimental results show that our system can adapt to generate motion for similar scooping tasks. A detailed analysis is performed on the failure cases observed in the experiments, and some insights about what the system can and cannot do are summarized.
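As a rough illustration of the DCAE + LSTM-RNN pipeline, here is a minimal PyTorch sketch: a convolutional encoder compresses each camera image into a small feature vector, and an LSTM over [image features, motor angles] predicts the next motor angles. Layer sizes are illustrative, and only the encoder half of the DCAE is shown (the decoder half would be trained for image reconstruction):

```python
import torch
import torch.nn as nn

class ScoopingPolicy(nn.Module):
    """Sketch of the DCAE + LSTM-RNN pipeline described above; all layer
    sizes and names are illustrative, not from the paper."""
    def __init__(self, n_joints=7, img_feat=32, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(            # DCAE encoder half
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, img_feat),
        )
        self.lstm = nn.LSTM(img_feat + n_joints, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_joints)  # next-step motor angles

    def forward(self, images, angles):
        # images: (B, T, 3, H, W); angles: (B, T, n_joints)
        B, T = images.shape[:2]
        feats = self.encoder(images.flatten(0, 1)).view(B, T, -1)
        out, _ = self.lstm(torch.cat([feats, angles], dim=-1))
        return self.head(out)                    # predicted angles at t+1
```

Training would minimize the error between predicted and demonstrated next-step angles over the teleoperated trajectories.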
Failure Modes of Variational Autoencoders and Their Effects on Downstream Tasks
Variational Auto-encoders (VAEs) are deep generative latent variable models
that are widely used for a number of downstream tasks. While it has been
demonstrated that VAE training can suffer from a number of pathologies,
existing literature lacks characterizations of exactly when these pathologies
occur and how they impact downstream task performance. In this paper, we
concretely characterize conditions under which VAE training exhibits
pathologies and connect these failure modes to undesirable effects on specific
downstream tasks, such as learning compressed and disentangled representations,
adversarial robustness, and semi-supervised learning.
Comment: Accepted at the International Conference on Machine Learning (ICML) Workshop on Uncertainty and Robustness in Deep Learning (UDL) 2020.
Characterizing and Avoiding Problematic Global Optima of Variational Autoencoders
Variational Auto-encoders (VAEs) are deep generative latent variable models
consisting of two components: a generative model that captures a data
distribution p(x) by transforming a distribution p(z) over latent space, and an
inference model that infers likely latent codes for each data point (Kingma and
Welling, 2013). Recent work shows that traditional training methods tend to
yield solutions that violate modeling desiderata: (1) the learned generative
model captures the observed data distribution but does so while ignoring the
latent codes, resulting in codes that do not represent the data (e.g. van den
Oord et al. (2017); Kim et al. (2018)); (2) the aggregate of the learned latent
codes does not match the prior p(z). This mismatch means that the learned
generative model will be unable to generate realistic data with samples from
p(z) (e.g. Makhzani et al. (2015); Tomczak and Welling (2017)). In this paper,
we demonstrate that both issues stem from the fact that the global optima of
the VAE training objective often correspond to undesirable solutions. Our
analysis builds on two observations: (1) the generative model is unidentifiable: there exist many generative models that explain the data equally well, each with different (and potentially unwanted) properties; and (2) the VAE objective is biased: it may prefer generative models that explain the data poorly but have posteriors that are easy to approximate. We present a novel inference method, LiBI, that mitigates the problems identified in our
analysis. On synthetic datasets, we show that LiBI can learn generative models
that capture the data distribution and inference models that better satisfy
modeling assumptions when traditional methods struggle to do so.
Comment: Accepted at the Proceedings of the 2nd Symposium on Advances in Approximate Bayesian Inference, 2019.
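Issue (2), the mismatch between the aggregate of the learned latent codes and the prior, is straightforward to measure. The following diagnostic (not the paper's LiBI method) moment-matches a Gaussian to the aggregate posterior and reports its KL divergence to p(z) = N(0, I):

```python
import numpy as np

def aggregate_posterior_prior_gap(mu, log_var):
    """Diagnostic for the prior/aggregate-posterior mismatch described above.
    mu, log_var: (n, d) encoder outputs q(z|x_i) over a dataset. Fit a single
    diagonal Gaussian to the aggregate posterior by moment matching, then
    report its KL divergence to the standard normal prior. Zero means the
    aggregate matches the prior; large values flag regions of p(z) the
    decoder was never trained on."""
    var = np.exp(log_var)
    agg_mu = mu.mean(axis=0)                     # mean of the aggregate
    agg_var = var.mean(axis=0) + mu.var(axis=0)  # law of total variance
    kl = 0.5 * (agg_var + agg_mu ** 2 - 1.0 - np.log(agg_var))
    return float(kl.sum())
```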