9 research outputs found
Variational Dropout and the Local Reparameterization Trick
We investigate a local reparameterization technique for greatly reducing the
variance of stochastic gradients for variational Bayesian inference (SGVB) of a
posterior over model parameters, while retaining parallelizability. This local
reparameterization translates uncertainty about global parameters into local
noise that is independent across datapoints in the minibatch. Such
parameterizations can be trivially parallelized and have variance that is
inversely proportional to the minibatch size, generally leading to much faster
convergence. Additionally, we explore a connection with dropout: Gaussian
dropout objectives correspond to SGVB with local reparameterization, a
scale-invariant prior and proportionally fixed posterior variance. Our method
allows inference of more flexibly parameterized posteriors; specifically, we
propose variational dropout, a generalization of Gaussian dropout where the
dropout rates are learned, often leading to better models. The method is
demonstrated through several experiments.
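The core idea of the local reparameterization described above can be sketched in a few lines of NumPy. Rather than sampling a weight matrix from the posterior and sharing it across the minibatch, one samples the induced Gaussian noise on the output activations directly, independently per datapoint. The dimensions and variable names below are illustrative assumptions, not taken from the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: minibatch of 4 datapoints, 3 inputs, 2 output units.
A = rng.normal(size=(4, 3))                   # minibatch of input activations
mu = rng.normal(size=(3, 2))                  # posterior means of the weights
sigma2 = rng.uniform(0.1, 0.5, size=(3, 2))   # posterior variances of the weights

# Naive approach: sample one weight matrix, shared across the minibatch.
# The resulting gradient noise is correlated across datapoints.
W = mu + np.sqrt(sigma2) * rng.normal(size=mu.shape)
B_naive = A @ W

# Local reparameterization: the marginal distribution of each output
# activation is Gaussian, so sample it directly from its moments.
gamma = A @ mu                 # mean of the output activations
delta = (A ** 2) @ sigma2      # variance of the output activations
B_local = gamma + np.sqrt(delta) * rng.normal(size=gamma.shape)
# The noise is now independent across datapoints, which is what makes
# the gradient variance shrink with the minibatch size.
```

Both estimators have the same expectation, but `B_local` draws fresh noise per datapoint, which is the source of the variance reduction the abstract describes.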
Analyzing and Improving Generative Adversarial Training for Generative Modeling and Out-of-Distribution Detection
Generative adversarial training (GAT) is a recently introduced adversarial
defense method. Previous works have focused on empirical evaluations of its
application to training robust predictive models. In this paper we focus on
theoretical understanding of the GAT method and extending its application to
generative modeling and out-of-distribution detection. We analyze the optimal
solutions of the maximin formulation employed by the GAT objective, and make a
comparative analysis of the minimax formulation employed by GANs. We use
theoretical analysis and 2D simulations to understand the convergence property
of the training algorithm. Based on these results, we develop an incremental
generative training algorithm, and conduct comprehensive evaluations of the
algorithm's application to image generation and adversarial out-of-distribution
detection. Our results suggest that generative adversarial training is a
promising new direction for the above applications.
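For readers unfamiliar with the distinction the abstract draws, the standard GAN objective is a minimax game (this is the well-known formulation; the GAT objective itself is not reproduced here):

\[
\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],
\]

whereas a maximin formulation swaps the order of optimization, \(\max_D \min_G V(D, G)\). In general \(\max_D \min_G V \le \min_G \max_D V\), and the two need not share optimal solutions, which is why analyzing the maximin objective separately, as the paper does, is a meaningful theoretical question.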