Improved Gradient-Based Optimization Over Discrete Distributions
In many applications we seek to maximize an expectation with respect to a
distribution over discrete variables. Estimating gradients of such objectives
with respect to the distribution parameters is a challenging problem. We
analyze existing solutions including finite-difference (FD) estimators and
continuous relaxation (CR) estimators in terms of bias and variance. We show
that the commonly used Gumbel-Softmax estimator is biased and propose a simple
method to reduce its bias. We also derive a simpler piece-wise linear continuous
relaxation that likewise has reduced bias. We demonstrate empirically that
reduced bias leads to better performance in variational inference and on
binary optimization tasks.
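
As a rough illustration of where the bias discussed in this abstract can come from, the sketch below compares an objective averaged over exact one-hot samples with the same objective averaged over Gumbel-Softmax relaxed samples. It is not the paper's bias-reduction method. The function `objective`, the logits, the temperature of 1.0, and the sample count are all assumptions chosen for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed one-hot samples: softmax((logits + Gumbel noise) / temperature)."""
    gumbel = rng.gumbel(loc=0.0, scale=1.0, size=logits.shape)
    return softmax((logits + gumbel) / temperature)

def objective(z):
    """Hypothetical objective: negative squared distance to the third vertex."""
    target = np.array([0.0, 0.0, 1.0])
    return -np.sum((z - target) ** 2, axis=-1)

rng = np.random.default_rng(0)
logits = np.array([1.0, 0.5, -0.5])  # assumed example parameters
probs = softmax(logits)

# Exact discrete expectation: weighted sum over the three one-hot outcomes.
exact_value = float(np.sum(probs * objective(np.eye(3))))

# Monte Carlo expectation under the relaxation (assumed temperature 1.0).
relaxed = gumbel_softmax_sample(np.broadcast_to(logits, (20000, 3)), 1.0, rng)
relaxed_value = float(objective(relaxed).mean())

print(f"exact discrete objective: {exact_value:.3f}")
print(f"relaxed objective:        {relaxed_value:.3f}")
```

Because the relaxed samples lie in the interior of the simplex rather than at its vertices, a nonlinear objective generally takes a different expected value under the relaxation, so gradients of the relaxed objective need not match gradients of the discrete one.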
Undirected Graphical Models as Approximate Posteriors
The representation of the approximate posterior is a critical aspect of
effective variational autoencoders (VAEs). Poor choices for the approximate
posterior have a detrimental impact on the generative performance of VAEs due
to the mismatch with the true posterior. We extend the class of posterior
models that may be learned by using undirected graphical models. We develop an
efficient method to train undirected approximate posteriors by showing that the
gradient of the training objective with respect to the parameters of the
undirected posterior can be computed by backpropagation through Markov chain
Monte Carlo updates. We apply these gradient estimators for training discrete
VAEs with Boltzmann machines as approximate posteriors and demonstrate that
undirected models outperform previous results obtained using directed graphical
models. Our implementation is available at https://github.com/QuadrantAI/dvaess.
Comment: Accepted to ICML 202
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables
To address the challenge of backpropagating the gradient through categorical
variables, we propose the augment-REINFORCE-swap-merge (ARSM) gradient
estimator that is unbiased and has low variance. ARSM first uses variable
augmentation, REINFORCE, and Rao-Blackwellization to re-express the gradient as
an expectation under the Dirichlet distribution, then uses variable swapping to
construct differently expressed but equivalent expectations, and finally shares
common random numbers between these expectations to achieve significant
variance reduction. Experimental results show that ARSM closely matches the
performance of the true gradient for optimization in univariate settings;
outperforms existing estimators by a large margin when applied to categorical
variational auto-encoders; and provides a "try-and-see self-critic" variance
reduction method for discrete-action policy gradient, which removes the need to
estimate baselines by generating a random number of pseudo actions and
estimating their action-value functions.
Comment: Published in ICML 2019. We have updated Section 4.2 and the Appendix
to reflect the improvements brought by fixing some bugs hidden in our
original code. Please find the Errata on the authors' websites and check the
updated code in Githu
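
The ARSM abstract names REINFORCE as one of its building blocks. For orientation, here is a minimal sketch of the plain REINFORCE (score-function) estimator for a softmax-parameterized categorical variable, checked against the closed-form gradient. It is not ARSM: it includes no augmentation, swapping, or merging, and no variance reduction. The names `phi` and `f_values`, the sample count, and the numbers used are assumptions for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax for a 1-D logit vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def reinforce_gradient(phi, f_values, num_samples, rng):
    """Monte Carlo estimate of d/dphi E_{z ~ Cat(softmax(phi))}[f(z)]."""
    probs = softmax(phi)
    z = rng.choice(len(phi), size=num_samples, p=probs)
    one_hot = np.eye(len(phi))[z]
    # For softmax logits, grad_phi log p(z) = one_hot(z) - probs.
    score = one_hot - probs
    return (f_values[z][:, None] * score).mean(axis=0)

def exact_gradient(phi, f_values):
    """Closed form: d/dphi_j sum_k p_k f_k = p_j * (f_j - sum_k p_k f_k)."""
    probs = softmax(phi)
    return probs * (f_values - np.dot(probs, f_values))

rng = np.random.default_rng(0)
phi = np.array([0.2, -0.3, 0.1])      # assumed logits
f_values = np.array([1.0, 2.0, 4.0])  # assumed objective value per category

estimate = reinforce_gradient(phi, f_values, 200_000, rng)
exact = exact_gradient(phi, f_values)
print("REINFORCE estimate:", estimate)
print("exact gradient:    ", exact)
```

The estimate is unbiased but noisy, which is why this sketch uses a large sample count; reducing that noise without introducing bias is the problem the abstract says ARSM addresses.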