Bridging the Gap Between Variational Inference and Wasserstein Gradient Flows
Variational inference is a technique that approximates a target distribution
by optimizing within the parameter space of variational families. On the other
hand, Wasserstein gradient flows describe optimization within the space of
probability measures, whose elements need not admit a parametric density
function. In this paper, we bridge the gap between these two methods. We
demonstrate that, under certain conditions, the Bures-Wasserstein gradient flow
can be recast as a Euclidean gradient flow whose forward Euler scheme is
the standard black-box variational inference algorithm. Specifically, the
vector field of the gradient flow is generated via the path-derivative gradient
estimator. We also offer an alternative perspective on the path-derivative
gradient, framing it as a distillation procedure to the Wasserstein gradient
flow. Distillations can be extended to encompass f-divergences and
non-Gaussian variational families. This extension yields a new gradient
estimator for f-divergences, readily implementable using contemporary machine
learning libraries like PyTorch or TensorFlow.
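
As a point of reference, here is a minimal PyTorch sketch of the path-derivative gradient estimator named in the abstract, applied to reverse KL with a mean-field Gaussian variational family. The toy target log_p and all hyperparameters are illustrative assumptions, not taken from the paper.

```python
# Path-derivative ("sticking the landing") gradient estimator sketch.
# Assumed toy target: a standard Gaussian; all settings are illustrative.
import torch

def log_p(z):
    # log p(z) of the assumed standard-Gaussian target, up to a constant.
    return -0.5 * (z ** 2).sum(dim=-1)

mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=1e-2)

for step in range(1000):
    eps = torch.randn(128, 2)
    z = mu + log_sigma.exp() * eps        # reparameterised sample; depends on the parameters
    # Path-derivative trick: evaluate log q at *detached* parameters, so the
    # gradient flows only through the sample path z (the zero-mean score term
    # is dropped from the estimator).
    mu_d, sigma_d = mu.detach(), log_sigma.exp().detach()
    log_q = (-0.5 * ((z - mu_d) / sigma_d) ** 2 - sigma_d.log()).sum(dim=-1)
    loss = (log_q - log_p(z)).mean()      # Monte Carlo KL(q || p), up to constants
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Detaching the parameters inside log q is the entire trick: the resulting gradient is exactly the vector field that the abstract identifies with the Bures-Wasserstein gradient flow.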
Particle-based Variational Inference with Preconditioned Functional Gradient Flow
Particle-based variational inference (VI) minimizes the KL divergence between
the particle approximation and the target posterior by following gradient flow
estimates. With the
popularity of Stein variational gradient descent (SVGD), the focus of
particle-based VI algorithms has been on the properties of functions in
Reproducing Kernel Hilbert Space (RKHS) to approximate the gradient flow.
However, the requirement of RKHS restricts the function class and algorithmic
flexibility. This paper remedies the problem by proposing a general framework
to obtain tractable functional gradient flow estimates. The functional gradient
flow in our framework can be defined by a general functional regularization
term that includes the RKHS norm as a special case. We use our framework to
propose a new particle-based VI algorithm: preconditioned functional gradient
flow (PFG). Compared with SVGD, the proposed method has several advantages:
larger function class; greater scalability in large particle-size scenarios;
better adaptation to ill-conditioned distributions; provable continuous-time
convergence in KL divergence. Non-linear function classes such as neural
networks can be incorporated to estimate the gradient flow. Both theory and
experiments have shown the effectiveness of our framework.
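
The following PyTorch sketch illustrates the generic idea behind such methods: fit a neural-network velocity field by maximizing an L2-regularised Stein-type objective, whose maximiser is grad log p - grad log q (the KL gradient-flow velocity), then move the particles by forward Euler. This is not the paper's PFG algorithm; the toy target, network size, and step sizes are assumptions made for the example.

```python
# Sketch of a particle-based VI step with a neural velocity field, in the
# spirit of functional gradient flow methods (not the paper's exact PFG).
import torch
import torch.nn as nn

d = 2
def target_score(x):
    # Score of the assumed standard-Gaussian target: grad log p(x) = -x.
    return -x

particles = 3.0 * torch.randn(256, d)     # initial particle cloud
f = nn.Sequential(nn.Linear(d, 64), nn.Tanh(), nn.Linear(64, d))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

for t in range(200):                      # outer (forward Euler) flow steps
    for _ in range(10):                   # inner fit of the velocity field
        x = particles.detach().clone().requires_grad_(True)
        v = f(x)
        # Exact divergence of f via autograd, coordinate by coordinate
        # (fine for small d; a Hutchinson estimator scales better).
        div = torch.zeros(x.shape[0])
        for i in range(d):
            div = div + torch.autograd.grad(v[:, i].sum(), x, create_graph=True)[0][:, i]
        # Maximising E[v . grad log p + div v - |v|^2 / 2] over vector fields
        # is solved by v = grad log p - grad log q, the KL flow velocity.
        obj = (v * target_score(x)).sum(dim=1) + div - 0.5 * (v ** 2).sum(dim=1)
        loss = -obj.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        particles = particles + 0.05 * f(particles)   # move particles along the fitted flow
```

Swapping the L2 penalty for an RKHS norm recovers kernel-based updates such as SVGD; the paper's framework is precisely about making this regularisation term a design choice.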
Operator-splitting schemes for degenerate, non-local, conservative-dissipative systems
In this paper, we develop a natural operator-splitting variational scheme for
a general class of non-local, degenerate conservative-dissipative evolutionary
equations. The splitting scheme consists of two phases: a conservative
(transport) phase and a dissipative (diffusion) phase. The first phase is
solved exactly using the method of characteristics and DiPerna-Lions theory,
while the second phase is solved approximately using a JKO-type variational
scheme that minimizes an energy functional with respect to a certain
Kantorovich optimal transport cost functional. In addition, we introduce
an entropic regularisation of the scheme. We prove the convergence of both
schemes to a weak solution of the evolutionary equation. We illustrate the
generality of our work by providing a number of examples, including the kinetic
Fokker-Planck equation and the (regularized) Vlasov-Poisson-Fokker-Planck
equation.
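
For orientation, the classical JKO minimizing-movement step that such variational schemes build on reads as follows; the paper's dissipative phase replaces the squared Wasserstein distance by a more general Kantorovich optimal transport cost.

```latex
% One step of the classical JKO scheme with time step \tau:
\rho^{k+1} \in \operatorname*{arg\,min}_{\rho}
  \left\{ \frac{1}{2\tau}\, W_2^2\!\left(\rho, \rho^{k}\right)
          + \mathcal{F}(\rho) \right\}
```

Here $W_2$ is the 2-Wasserstein distance and $\mathcal{F}$ the driving energy functional of the dissipative phase.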