This paper introduces Wasserstein variational inference, a new form of
approximate Bayesian inference based on optimal transport theory. Wasserstein
variational inference uses a new family of divergences that includes both
f-divergences and the Wasserstein distance as special cases. The gradients of
the Wasserstein variational loss are obtained by backpropagating through the
Sinkhorn iterations. This technique results in a very stable likelihood-free
training method that can be used with implicit distributions and probabilistic
programs. Using the Wasserstein variational inference framework, we introduce
several new forms of autoencoders and test their robustness and performance
against existing variational autoencoding techniques.
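To make the key mechanism concrete, below is a minimal sketch of backpropagating through Sinkhorn iterations, the technique the abstract describes. It assumes PyTorch and the standard entropy-regularized optimal transport formulation; the function name `sinkhorn_cost` and all parameter choices (regularization `eps`, iteration count, sample sizes) are illustrative assumptions, not the paper's actual code.

```python
import torch

def sinkhorn_cost(C, a, b, eps=0.1, n_iters=50):
    """Entropy-regularized OT cost via Sinkhorn iterations.

    Every operation is differentiable, so autograd can backpropagate
    the loss through the unrolled fixed-point updates -- the mechanism
    the abstract refers to. (Illustrative sketch, not the paper's code.)
    """
    K = torch.exp(-C / eps)                  # Gibbs kernel
    v = torch.ones_like(b)
    for _ in range(n_iters):                 # alternating scaling updates
        u = a / (K @ v)
        v = b / (K.t() @ u)
    P = u.unsqueeze(1) * K * v.unsqueeze(0)  # transport plan diag(u) K diag(v)
    return (P * C).sum()                     # regularized transport cost

# Usage: gradients flow from the OT cost back to the generated samples,
# without ever evaluating a likelihood (hence "likelihood-free").
x = torch.randn(64, 2, requires_grad=True)   # e.g., samples from a generator
y = torch.randn(64, 2)                       # data samples
C = torch.cdist(x, y) ** 2                   # squared Euclidean cost matrix
w = torch.full((64,), 1.0 / 64)              # uniform empirical marginals
loss = sinkhorn_cost(C, w, w)
loss.backward()                              # x.grad now holds the gradient
```

Because the loss is computed purely from samples and a cost matrix, the same recipe applies to implicit distributions and probabilistic programs, where densities are unavailable but sampling is cheap.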