Manifold Diffusion Fields
We present Manifold Diffusion Fields (MDF), an approach that unlocks learning
of diffusion models of data in general non-Euclidean geometries. Leveraging
insights from spectral geometry analysis, we define an intrinsic coordinate
system on the manifold via the eigen-functions of the Laplace-Beltrami
Operator. MDF represents functions using an explicit parametrization formed by
a set of multiple input-output pairs. Our approach allows sampling continuous
functions on manifolds and is invariant with respect to rigid and isometric
transformations of the manifold. In addition, we show that MDF generalizes to
the case where the training set contains functions on different manifolds.
Empirical results on multiple datasets and manifolds including challenging
scientific problems like weather prediction or molecular conformation show that
MDF can capture distributions of such functions with better diversity and
fidelity than previous approaches. Comment: ICLR24 paper
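To make the idea of an intrinsic, spectrally defined coordinate system concrete, here is a minimal sketch on a discrete stand-in for a manifold. It is not the MDF implementation: the cycle graph (a sampled circle), the graph Laplacian used in place of the Laplace-Beltrami Operator, and all function names are assumptions chosen for illustration. Because the Laplacian is built from connectivity alone, the resulting coordinates do not change when the circle is rotated or translated in space.

```python
import math

def cycle_laplacian(n):
    """Graph Laplacian L = D - A of a cycle with n >= 3 nodes (a sampled circle)."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lap[i][i] = 2.0                 # each node has two neighbours
        lap[i][(i + 1) % n] = -1.0
        lap[i][(i - 1) % n] = -1.0
    return lap

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def eigenpair(n, k):
    """Analytic k-th cosine eigenfunction of the cycle Laplacian and its eigenvalue."""
    vec = [math.cos(2.0 * math.pi * k * i / n) for i in range(n)]
    val = 2.0 - 2.0 * math.cos(2.0 * math.pi * k / n)
    return vec, val

def spectral_coordinates(n, num_modes):
    """Assign node i the coordinates (phi_1(i), ..., phi_num_modes(i))."""
    modes = [eigenpair(n, k)[0] for k in range(1, num_modes + 1)]
    return [[mode[i] for mode in modes] for i in range(n)]
```

A function on the circle can then be stored as pairs of (spectral coordinate, function value), which loosely mirrors the input-output parametrization the abstract describes.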
Efficient Optimization of Loops and Limits with Randomized Telescoping Sums
We consider optimization problems in which the objective requires an inner
loop with many steps or is the limit of a sequence of increasingly costly
approximations. Meta-learning, training recurrent neural networks, and
optimization of the solutions to differential equations are all examples of
optimization problems with this character. In such problems, it can be
expensive to compute the objective function value and its gradient, but
truncating the loop or using less accurate approximations can induce biases
that damage the overall solution. We propose randomized telescope (RT) gradient
estimators, which represent the objective as the sum of a telescoping series
and sample linear combinations of terms to provide cheap unbiased gradient
estimates. We identify conditions under which RT estimators achieve
optimization convergence rates independent of the length of the loop or the
required accuracy of the approximation. We also derive a method for tuning RT
estimators online to maximize a lower bound on the expected decrease in loss
per unit of computation. We evaluate our adaptive RT estimators on a range of
applications including meta-optimization of learning rates, variational
inference of ODE parameters, and training an LSTM to model long sequences.
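The telescoping idea can be sketched with a toy scalar sequence. The code below is an illustrative single-sample estimator under stated assumptions, not the authors' implementation: the sequence `approx`, the truncated geometric sampling distribution, and all names are hypothetical. It writes the final approximation as a sum of differences, samples one term, and reweights it by the inverse of its sampling probability, so the estimate is unbiased while only two neighbouring approximations are evaluated per draw. The same weighting applies to gradients by linearity.

```python
import random

def approx(n):
    """Toy stand-in for the n-th (increasingly costly) approximation: L_n = 1 - 2^(-n)."""
    return 1.0 - 0.5 ** n

def truncated_geometric(horizon, ratio):
    """Normalized sampling probabilities q(1..horizon), with q(n) proportional to ratio^(n-1)."""
    weights = [ratio ** n for n in range(horizon)]
    total = sum(weights)
    return [w / total for w in weights]

def rt_single_sample(approx_fn, probs, rng):
    """Unbiased single-sample estimate of approx_fn(len(probs)).

    Uses L_H = sum_{n=1}^{H} (L_n - L_{n-1}) with L_0 = approx_fn(0),
    samples N ~ probs and returns (L_N - L_{N-1}) / q(N).
    Assumes approx_fn(0) == 0 and that probs sums to one.
    """
    horizon = len(probs)
    n = rng.choices(range(1, horizon + 1), weights=probs, k=1)[0]
    delta = approx_fn(n) - approx_fn(n - 1)
    return delta / probs[n - 1]

def average_estimate(num_samples, horizon=10, ratio=0.7, seed=0):
    """Average many single-sample estimates to check them against approx(horizon)."""
    rng = random.Random(seed)
    probs = truncated_geometric(horizon, ratio)
    draws = [rt_single_sample(approx, probs, rng) for _ in range(num_samples)]
    return sum(draws) / num_samples
```

The choice of sampling distribution trades compute against variance: putting more mass on early terms is cheaper, but inflates the weights on late terms.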
De-randomizing MCMC dynamics with the diffusion Stein operator
Publisher Copyright: © 2021 Neural information processing systems foundation. All rights reserved. Approximate Bayesian inference estimates descriptors of an intractable target distribution - in essence, an optimization problem within a family of distributions. For example, Langevin dynamics (LD) extracts asymptotically exact samples from a diffusion process because the time evolution of its marginal distributions constitutes a curve that minimizes the KL-divergence via steepest descent in the Wasserstein space. Parallel to LD, Stein variational gradient descent (SVGD) similarly minimizes the KL, albeit endowed with a novel Stein-Wasserstein distance, by deterministically transporting a set of particle samples, thereby de-randomizing the stochastic diffusion process. We propose de-randomized kernel-based particle samplers for all diffusion-based samplers known as MCMC dynamics. Following previous work in interpreting MCMC dynamics, we equip the Stein-Wasserstein metric with a fiber-Riemannian Poisson structure, with the capacity of characterizing a fiber-gradient Hamiltonian flow that simulates MCMC dynamics. Such dynamics discretize into generalized SVGD (GSVGD), a Stein-type deterministic particle sampler, with particle updates coinciding with applying the diffusion Stein operator to a kernel function. We demonstrate empirically that GSVGD can de-randomize complicated MCMC dynamics, which combine the advantages of auxiliary momentum variables and Riemannian structure, while maintaining the high sample quality from an interacting particle system. Peer reviewed
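For readers unfamiliar with deterministic particle transport, the following is a minimal sketch of plain SVGD in one dimension. It is not GSVGD and includes no momentum or Riemannian structure. The Gaussian target, the RBF kernel with a fixed bandwidth, the step size, and all function names are assumptions made for illustration.

```python
import math

def score_gaussian(x, mean=2.0, std=1.0):
    """Gradient of the log-density of N(mean, std^2), the assumed toy target."""
    return -(x - mean) / (std * std)

def svgd_step(particles, score, step_size=0.1, bandwidth=1.0):
    """One deterministic SVGD update with kernel k(a, b) = exp(-(a - b)^2 / (2 h^2))."""
    n = len(particles)
    h2 = bandwidth * bandwidth
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * h2))
            drift = k * score(xj)            # kernel-smoothed pull toward high density
            repulsion = -(xj - xi) / h2 * k  # d k / d xj: pushes xi away from xj
            phi += drift + repulsion
        updated.append(xi + step_size * phi / n)
    return updated

def run_svgd(particles, score, num_steps=1000, step_size=0.1, bandwidth=1.0):
    """Iterate the update; no random numbers are drawn at any point."""
    for _ in range(num_steps):
        particles = svgd_step(particles, score, step_size, bandwidth)
    return particles
```

The drift term moves particles toward regions of high target density, while the repulsion term keeps them from collapsing onto a single mode. This interaction is what lets a finite particle set stand in for samples.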
- …