582 research outputs found
Deep Variational Transfer: Transfer Learning through Semi-supervised Deep Generative Models
In real-world applications, it is often expensive and time-consuming to obtain labeled examples. In such cases, knowledge transfer from related domains, where labels are abundant, can greatly reduce the need for extensive labeling efforts; this is where transfer learning comes in handy. In this paper, we propose Deep Variational Transfer (DVT), a variational autoencoder that transfers knowledge across domains using a shared latent Gaussian mixture model. Thanks to the combination of a semi-supervised ELBO and parameter sharing across domains, we are able to simultaneously: (i) align all supervised examples of the same class into the same latent Gaussian mixture component, independently of their domain; (ii) predict the class of unsupervised examples from different domains and use them to better model the occurring shifts. We perform tests on the MNIST and USPS digit datasets, showing DVT's ability to perform transfer learning across heterogeneous datasets. Additionally, we present DVT's top classification performance on the MNIST semi-supervised learning challenge. We further validate DVT on astronomical datasets, where it achieves state-of-the-art classification performance while transferring knowledge across real star survey datasets: EROS, MACHO, and HiTS. Even in the worst case, we double the achieved F1-score for rare classes. These experiments show DVT's ability to tackle all major challenges posed by transfer learning: different covariate distributions, different and highly imbalanced class distributions, and different feature spaces.
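The key quantity here is the semi-supervised ELBO with a class-structured Gaussian mixture prior. The abstract does not spell out the objective, so the following is a minimal NumPy sketch of one plausible form for a labeled example, assuming diagonal-Gaussian encoders and one mixture component per class; `encode`, `decode_log_lik`, and all variable names are illustrative placeholders, not the paper's API.

```python
import numpy as np

def gaussian_log_pdf(z, mu, log_var):
    """Log density of a diagonal Gaussian N(mu, diag(exp(log_var))) at z."""
    return -0.5 * np.sum(log_var + np.log(2 * np.pi)
                         + (z - mu) ** 2 / np.exp(log_var))

def labeled_elbo(x, y, encode, decode_log_lik, mix_mu, mix_log_var, n_classes):
    """Single-sample ELBO for a labeled example under a class-conditional
    Gaussian mixture prior:
        E_q[log p(x|z)] + log p(z|y) + log p(y) - log q(z|x).
    `encode` maps x to the mean/log-variance of q(z|x); `decode_log_lik`
    returns log p(x|z) under the shared decoder (both placeholders)."""
    mu, log_var = encode(x)
    # Reparameterized sample from q(z|x).
    z = mu + np.exp(0.5 * log_var) * np.random.randn(*mu.shape)
    log_px_z = decode_log_lik(x, z)
    log_pz_y = gaussian_log_pdf(z, mix_mu[y], mix_log_var[y])  # component for class y
    log_py = -np.log(n_classes)                                # uniform class prior
    log_qz_x = gaussian_log_pdf(z, mu, log_var)
    return log_px_z + log_pz_y + log_py - log_qz_x
```

For an unlabeled example, the same bound would be averaged over classes under a classifier q(y|x), which is how unsupervised examples from each domain can inform the shared mixture.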
An Empirical Comparison of Sampling Quality Metrics: A Case Study for Bayesian Nonnegative Matrix Factorization
In this work, we empirically explore the question: how can we assess the
quality of samples from some target distribution? We assume that the samples
are provided by some valid Monte Carlo procedure, so we are guaranteed that the
collection of samples will asymptotically approximate the true distribution.
Most current evaluation approaches focus on two questions: (1) Has the chain
mixed, that is, is it sampling from the distribution? and (2) How independent
are the samples (as MCMC procedures produce correlated samples)? Focusing on the case of Bayesian nonnegative matrix factorization, we empirically evaluate standard metrics of sampler quality and propose new metrics to capture
aspects that these measures fail to expose. The aspect of sampling that is of
particular interest to us is the ability (or inability) of sampling methods to
move between multiple optima in NMF problems. As a proxy, we propose and study a number of metrics that might quantify the diversity of a set of NMF factorizations obtained by a sampler, and thereby the coverage of the posterior distribution. We compare the performance of a number of standard sampling methods for NMF in terms of these new metrics.
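The abstract does not name the proposed diversity metrics, but one crude proxy in this spirit is the mean pairwise distance between posterior samples of the factor matrix W, after resolving NMF's permutation and scale ambiguities. A minimal sketch, with illustrative function names:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def _normalize_cols(W):
    """Remove NMF's scale ambiguity by normalizing each column of W."""
    return W / (np.linalg.norm(W, axis=0, keepdims=True) + 1e-12)

def aligned_distance(W_a, W_b):
    """Distance between two factor matrices, invariant to column permutation:
    match columns with the Hungarian algorithm, then take the residual norm."""
    A, B = _normalize_cols(W_a), _normalize_cols(W_b)
    cost = np.linalg.norm(A[:, :, None] - B[:, None, :], axis=0)  # (k, k) column distances
    rows, cols = linear_sum_assignment(cost)
    return np.linalg.norm(A[:, rows] - B[:, cols])

def sample_diversity(W_samples):
    """Mean pairwise aligned distance across posterior samples of W --
    a rough measure of how widely the sampler moves between NMF optima."""
    n = len(W_samples)
    dists = [aligned_distance(W_samples[i], W_samples[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))
```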
Weighted Tensor Decomposition for Learning Latent Variables with Partial Data
Tensor decomposition methods are popular tools for learning latent variables
given only lower-order moments of the data. However, the standard assumption is
that we have sufficient data to estimate these moments to high accuracy. In
this work, we consider the case in which certain dimensions of the data are not
always observed---common in applied settings, where not all measurements may be
taken for all observations---resulting in moment estimates of varying quality.
We derive a weighted tensor decomposition approach that is computationally as
efficient as the non-weighted approach, and demonstrate that it outperforms
methods that do not appropriately leverage these less-observed dimensions.
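The abstract does not give the weighting scheme, but the basic ingredient is clear: moment entries estimated from fewer observations deserve less weight. As a rough illustration for the second-order moment, assuming missingness is recorded in a binary mask (all names illustrative):

```python
import numpy as np

def weighted_second_moment(X, mask):
    """Estimate E[x x^T] from partially observed data.
    X:    (n, d) array, with arbitrary values where mask == 0.
    mask: (n, d) binary observation indicators.
    Returns the moment estimate M and per-entry observation counts, which can
    serve as confidence weights in a weighted decomposition loss such as
    || C * (M - sum_k lambda_k a_k a_k^T) ||_F^2 (elementwise product)."""
    Xm = X * mask
    counts = mask.T @ mask      # how many samples observed each pair (i, j)
    sums = Xm.T @ Xm            # sums over jointly observed pairs
    M = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return M, counts
```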
A general method for regularizing tensor decomposition methods via pseudo-data
Tensor decomposition methods allow us to learn the parameters of latent
variable models through decomposition of low-order moments of data. A
significant limitation of these algorithms is that there exists no general
method to regularize them, and in the past regularization has mostly been
performed using bespoke modifications to the algorithms, tailored for the
particular form of the desired regularizer. We present a general method of
regularizing tensor decomposition methods which can be used for any likelihood
model that is learnable using tensor decomposition methods and any
differentiable regularization function by supplementing the training data with
pseudo-data. The pseudo-data is optimized to balance two terms: being as close
as possible to the true data and enforcing the desired regularization. On
synthetic, semi-synthetic and real data, we demonstrate that our method can
improve inference accuracy and regularize for a broad range of goals including
transfer learning, sparsity, interpretability, and orthogonality of the learned
parameters.
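Concretely, the pseudo-data trades off exactly the two terms named above. A minimal sketch of the objective being optimized over the pseudo-points, assuming the decomposition pipeline is differentiable end-to-end; `fit_params` and `regularizer` are illustrative placeholders:

```python
import numpy as np

def pseudo_data_objective(pseudo, data, fit_params, regularizer, lam):
    """Objective for the pseudo-data: stay close to the true data while
    steering the learned parameters toward the desired regularizer.
    fit_params:  callable running tensor decomposition on the augmented
                 dataset and returning the learned parameters.
    regularizer: any differentiable penalty on those parameters.
    In practice this would be minimized over `pseudo` with autodiff."""
    augmented = np.concatenate([data, pseudo], axis=0)
    theta = fit_params(augmented)
    # Closeness term: squared distance from each pseudo-point to its
    # nearest true data point.
    sq_dists = np.sum((pseudo[:, None, :] - data[None, :, :]) ** 2, axis=2)
    closeness = np.sum(sq_dists.min(axis=1))
    return closeness + lam * regularizer(theta)
```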
Learning Qualitatively Diverse and Interpretable Rules for Classification
There has been growing interest in developing accurate models that can also
be explained to humans. Unfortunately, if there exist multiple distinct but
accurate models for some dataset, current machine learning methods are unlikely
to find them: standard techniques will likely recover a complex model that
combines them. In this work, we introduce a way to identify a maximal set of
distinct but accurate models for a dataset. We demonstrate empirically that, in
situations where the data supports multiple accurate classifiers, we tend to
recover simpler, more interpretable classifiers rather than more complex ones.
Comment: Presented at the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden (revision fixes minor issues).
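The abstract leaves the selection mechanism open; one crude reading of "a maximal set of distinct but accurate models" is a greedy filter that keeps a model only if it is accurate and disagrees sufficiently with everything kept so far. A sketch with illustrative thresholds:

```python
import numpy as np

def distinct_accurate_models(models, X, y, min_acc=0.9, min_disagree=0.1):
    """Greedily select models that are accurate on (X, y) yet pairwise
    distinct, measuring distinctness as the fraction of points on which two
    models' predictions differ. Thresholds and the greedy order are
    illustrative; the paper's actual criterion may differ."""
    kept, kept_preds = [], []
    for m in models:
        preds = m.predict(X)
        if np.mean(preds == y) < min_acc:
            continue                     # not accurate enough
        if all(np.mean(preds != p) >= min_disagree for p in kept_preds):
            kept.append(m)
            kept_preds.append(preds)
    return kept
```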
Analyzing the Utility of a Support Pin in Sequential Robotic Manipulation
Pick-and-place regrasp is an important manipulation skill for a robot. It
helps a robot accomplish tasks that cannot be achieved within a single grasp,
due to constraints such as kinematics or collisions between the robot and the
environment. Previous work on pick-and-place regrasp only leveraged flat
surfaces for intermediate placements, and thus is limited in the capability to
reorient an object.
In this paper, we extend the reorientation capability of a pick-and-place
regrasp by adding a vertical pin on the working surface and using it as the
intermediate location for regrasping. In particular, our method automatically
computes the stable placements of an object leaning against a vertical pin,
finds several force-closure grasps, generates a graph of regrasp actions, and
searches for the regrasp sequence. To compare the regrasping performance with
and without using pins, we evaluate the success rate and the length of regrasp
sequences while performing tasks on various models. Experiments on
reorientation and assembly tasks validate the benefit of using support pins for
regrasping.
Comment: 14 pages, 20 figures.
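The planning step described above reduces to search over a regrasp graph. A minimal breadth-first sketch, where nodes are (placement, grasp) pairs, including placements leaning against the support pin, and `feasible` stands in for the paper's kinematic and collision checks:

```python
from collections import deque

def shortest_regrasp_sequence(nodes, feasible, start, goal):
    """Breadth-first search over a regrasp graph. `feasible(a, b)` says
    whether one pick-and-place can move the object from configuration a
    to configuration b. Returns the shortest node sequence, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path                       # shortest regrasp sequence
        for nxt in nodes:
            if nxt not in visited and feasible(path[-1], nxt):
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None                               # goal unreachable
```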
Microstructure and magnetic anisotropy of electrospun CuZnFeO nanofibers: A local probe study
Understanding the phenomena at the nanometer scale is of fundamental
importance for future improvements of desired properties of nanomaterials. We
report a detailed investigation of the microstructure and the resulting
magnetic anisotropy by magnetic, transmission electron microscopy (TEM), and Mössbauer measurements of the electrospun CuZnFeO
nanofibers. Our results show that the electrospun CuZnFeO
nanofibers exhibit nearly isotropic magnetic anisotropy. TEM measurements
indicate that the nanofibers are composed of loosely connected and randomly
aligned nanograins. As revealed by the Henkel plot, these nanofibers and the
nanograins within the nanofibers are dipolar coupled, which reduces the
effective shape anisotropy leading to a nearly random configuration of the
magnetic moments inside the nanofibers; hence, the observed nearly isotropic magnetic anisotropy can be easily understood.
Comment: 5 pages, 5 figures; to be published in J. Phys. D: Appl. Phys.
Deep Learning Scooping Motion using Bilateral Teleoperations
We present a bilateral teleoperation system for task learning and robot motion generation. Our system includes a bilateral teleoperation platform and deep learning software. The deep learning software learns from human demonstrations performed through the bilateral teleoperation platform, which collects visual images and robot encoder values. It leverages these datasets of images and encoder information to learn the inter-modal correspondence between visual images and robot motion. In detail, the deep learning software uses a combination of Deep Convolutional Auto-Encoders (DCAE) over image regions and a Recurrent Neural Network with Long Short-Term Memory units (LSTM-RNN) over robot motor angles to learn motion taught by human teleoperation. The learnt models are used to predict new motion trajectories for similar tasks. Experimental results show that our system can adapt to generate motion for similar scooping tasks. A detailed analysis is performed on the failure cases observed in the experiments, and some insights about what the system can and cannot do are summarized.
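As a rough illustration of the DCAE + LSTM-RNN pipeline, here is a minimal PyTorch sketch: a convolutional encoder compresses each camera image into a small feature vector, and an LSTM over [image features, motor angles] predicts the next motor angles. Layer sizes are illustrative, and only the encoder half of the DCAE is shown (the decoder half would be trained for image reconstruction):

```python
import torch
import torch.nn as nn

class ScoopingPolicy(nn.Module):
    """Sketch of the DCAE + LSTM-RNN pipeline described above; all layer
    sizes and names are illustrative, not from the paper."""
    def __init__(self, n_joints=7, img_feat=32, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(            # DCAE encoder half
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, img_feat),
        )
        self.lstm = nn.LSTM(img_feat + n_joints, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_joints)  # next-step motor angles

    def forward(self, images, angles):
        # images: (B, T, 3, H, W); angles: (B, T, n_joints)
        B, T = images.shape[:2]
        feats = self.encoder(images.flatten(0, 1)).view(B, T, -1)
        out, _ = self.lstm(torch.cat([feats, angles], dim=-1))
        return self.head(out)                    # predicted angles at t+1
```

Training would minimize the error between predicted and demonstrated next-step angles over the teleoperated trajectories.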
Failure Modes of Variational Autoencoders and Their Effects on Downstream Tasks
Variational Auto-encoders (VAEs) are deep generative latent variable models
that are widely used for a number of downstream tasks. While it has been
demonstrated that VAE training can suffer from a number of pathologies,
existing literature lacks characterizations of exactly when these pathologies
occur and how they impact downstream task performance. In this paper, we
concretely characterize conditions under which VAE training exhibits
pathologies and connect these failure modes to undesirable effects on specific
downstream tasks, such as learning compressed and disentangled representations,
adversarial robustness, and semi-supervised learning.
Comment: Accepted at the International Conference on Machine Learning (ICML) Workshop on Uncertainty and Robustness in Deep Learning (UDL) 2020.
Characterizing and Avoiding Problematic Global Optima of Variational Autoencoders
Variational Auto-encoders (VAEs) are deep generative latent variable models
consisting of two components: a generative model that captures a data
distribution p(x) by transforming a distribution p(z) over latent space, and an
inference model that infers likely latent codes for each data point (Kingma and
Welling, 2013). Recent work shows that traditional training methods tend to
yield solutions that violate modeling desiderata: (1) the learned generative
model captures the observed data distribution but does so while ignoring the
latent codes, resulting in codes that do not represent the data (e.g. van den
Oord et al. (2017); Kim et al. (2018)); (2) the aggregate of the learned latent
codes does not match the prior p(z). This mismatch means that the learned
generative model will be unable to generate realistic data with samples from
p(z) (e.g. Makhzani et al. (2015); Tomczak and Welling (2017)). In this paper,
we demonstrate that both issues stem from the fact that the global optima of
the VAE training objective often correspond to undesirable solutions. Our
analysis builds on two observations: (1) the generative model is unidentifiable: there exist many generative models that explain the data equally well, each with different (and potentially unwanted) properties; and (2) the VAE objective is biased: it may prefer generative models that explain the data poorly but have posteriors that are easy to approximate. We present a novel inference method, LiBI, that mitigates the problems identified in our
analysis. On synthetic datasets, we show that LiBI can learn generative models
that capture the data distribution and inference models that better satisfy
modeling assumptions when traditional methods struggle to do so.
Comment: Accepted at the Proceedings of the 2nd Symposium on Advances in Approximate Bayesian Inference, 2019.
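Issue (2), the mismatch between the aggregate of the learned latent codes and the prior, is straightforward to measure. The following diagnostic (not the paper's LiBI method) moment-matches a Gaussian to the aggregate posterior and reports its KL divergence to p(z) = N(0, I):

```python
import numpy as np

def aggregate_posterior_prior_gap(mu, log_var):
    """Diagnostic for the prior/aggregate-posterior mismatch described above.
    mu, log_var: (n, d) encoder outputs q(z|x_i) over a dataset. Fit a single
    diagonal Gaussian to the aggregate posterior by moment matching, then
    report its KL divergence to the standard normal prior. Zero means the
    aggregate matches the prior; large values flag regions of p(z) the
    decoder was never trained on."""
    var = np.exp(log_var)
    agg_mu = mu.mean(axis=0)                     # mean of the aggregate
    agg_var = var.mean(axis=0) + mu.var(axis=0)  # law of total variance
    kl = 0.5 * (agg_var + agg_mu ** 2 - 1.0 - np.log(agg_var))
    return float(kl.sum())
```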