Denoising Diffusion Samplers
Denoising diffusion models are a popular class of generative models providing state-of-the-art results in many domains. Noise is gradually added to the data through a diffusion process that transforms the data distribution into a Gaussian distribution. Samples from the generative model are then obtained by simulating an approximation of the time-reversal of this diffusion, initialized with Gaussian samples. In practice, the intractable score terms appearing in the time-reversed process are approximated using score matching techniques. We explore here a similar idea to sample approximately from unnormalized probability density functions and estimate their normalizing constants. We consider a process where the target density diffuses towards a Gaussian. Denoising Diffusion Samplers (DDS) are obtained by approximating the corresponding time-reversal. While score matching is not applicable in this context, we can leverage many of the ideas introduced in generative modeling for Monte Carlo sampling. Existing theoretical results for denoising diffusion models also provide theoretical guarantees for DDS. We discuss the connections between DDS, optimal control, and Schrödinger bridges, and finally demonstrate DDS experimentally on a variety of challenging sampling tasks.
Comment: In The Eleventh International Conference on Learning Representations, 2023
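
As a rough illustration of the time-reversal idea behind samplers of this kind, the sketch below integrates the reverse of an Ornstein-Uhlenbeck diffusion with Euler-Maruyama steps, starting from Gaussian noise. The toy Gaussian-mixture target, the schedule, and all names are assumptions chosen so that the diffused score has a closed form; in DDS that score would instead be approximated, so this is not the paper's implementation and it does not cover the normalizing-constant estimate.

```python
# Illustrative sketch (not the DDS implementation): sample a 1-D Gaussian-mixture
# "target" by simulating the time-reversal of an Ornstein-Uhlenbeck diffusion.
# Here the diffused score is available in closed form; DDS would replace it with
# a learned approximation.
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([0.3, 0.7])       # mixture weights (toy target)
means   = np.array([-2.0, 2.0])      # component means
stds    = np.array([0.5, 0.5])       # component standard deviations

def diffused_score(x, t):
    """Score d/dx log p_t(x) of the target pushed through dX = -X dt + sqrt(2) dW."""
    m = means * np.exp(-t)                              # diffused means
    v = stds**2 * np.exp(-2 * t) + (1 - np.exp(-2 * t)) # diffused variances
    comp = weights * np.exp(-(x[:, None] - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
    dens = comp.sum(axis=1)
    grad = (comp * (m - x[:, None]) / v).sum(axis=1)
    return grad / dens

T, n_steps, n_samples = 5.0, 500, 10_000
dt = T / n_steps
x = rng.standard_normal(n_samples)            # start from N(0, 1) at time T
for k in range(n_steps):                      # integrate the reverse SDE back to t = 0
    t = T - k * dt
    drift = x + 2.0 * diffused_score(x, t)    # -f(x) + g^2 * score with f(x) = -x, g = sqrt(2)
    x = x + drift * dt + np.sqrt(2 * dt) * rng.standard_normal(n_samples)

print("sample mean/std:", x.mean(), x.std())  # should resemble the mixture's moments
```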
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Since their introduction, diffusion models have quickly become the prevailing
approach to generative modeling in many domains. They can be interpreted as
learning the gradients of a time-varying sequence of log-probability density
functions. This interpretation has motivated classifier-based and
classifier-free guidance as methods for post-hoc control of diffusion models.
In this work, we build upon these ideas using the score-based interpretation of
diffusion models, and explore alternative ways to condition, modify, and reuse
diffusion models for tasks involving compositional generation and guidance. In
particular, we investigate why certain types of composition fail using current
techniques and present a number of solutions. We conclude that the sampler (not
the model) is responsible for this failure and propose new samplers, inspired
by MCMC, which enable successful compositional generation. Further, we propose
an energy-based parameterization of diffusion models which enables the use of
new compositional operators and more sophisticated, Metropolis-corrected
samplers. Intriguingly, we find these samplers lead to notable improvements in
compositional generation across a wide set of problems, such as
classifier-guided ImageNet modeling and compositional text-to-image generation.
Comment: ICML 2023. Project webpage: https://energy-based-model.github.io/reduce-reuse-recycle
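
To make the composition-plus-sampler point concrete, here is a minimal sketch, assuming two toy quadratic energies in place of learned energy-based diffusion models: their densities are composed as a product by summing energies, and the composition is sampled with a Metropolis-adjusted Langevin (MALA) sampler. It illustrates the general recipe of Metropolis-corrected sampling from composed energies, not the specific samplers proposed in the paper.

```python
# Illustrative sketch (not the paper's code): composing two "models" by summing
# their energies (product of densities) and sampling the composition with MALA,
# a Metropolis-corrected Langevin sampler. The quadratic energies stand in for
# learned energy-based models.
import numpy as np

rng = np.random.default_rng(0)

def energy_a(x):   # E_a(x) = -log p_a(x) up to a constant: Gaussian centred at (-1, 0)
    return 0.5 * np.sum((x - np.array([-1.0, 0.0])) ** 2)

def energy_b(x):   # E_b(x): Gaussian centred at (+1, 0)
    return 0.5 * np.sum((x - np.array([1.0, 0.0])) ** 2)

def composed_energy(x):        # product of densities <=> sum of energies
    return energy_a(x) + energy_b(x)

def grad_energy(x, eps=1e-5):  # finite-difference gradient, adequate for a 2-D toy
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (composed_energy(x + e) - composed_energy(x - e)) / (2 * eps)
    return g

def mala_step(x, step=0.1):
    """One Metropolis-adjusted Langevin step targeting exp(-composed_energy)."""
    gx = grad_energy(x)
    prop = x - step * gx + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    gp = grad_energy(prop)
    # log densities of the asymmetric Gaussian proposals q(prop | x) and q(x | prop)
    log_q_fwd = -np.sum((prop - (x - step * gx)) ** 2) / (4 * step)
    log_q_bwd = -np.sum((x - (prop - step * gp)) ** 2) / (4 * step)
    log_alpha = (composed_energy(x) - composed_energy(prop)) + (log_q_bwd - log_q_fwd)
    return prop if np.log(rng.uniform()) < log_alpha else x

x = rng.standard_normal(2)
samples = []
for i in range(5000):
    x = mala_step(x)
    if i > 1000:               # discard burn-in
        samples.append(x.copy())
print("composition mean:", np.mean(samples, axis=0))   # near (0, 0) for this toy product
```

The Metropolis correction is what removes the bias of the discretized Langevin proposal, which is the role such corrections play when sampling composed energies more generally.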
Self-conditioned Embedding Diffusion for Text Generation
Can continuous diffusion models bring to natural language the same performance breakthrough they delivered for image generation? To circumvent the discrete nature of text data, we can simply project tokens into a continuous space of embeddings, as is standard in language modeling. We propose Self-conditioned Embedding Diffusion, a continuous diffusion mechanism that operates on token embeddings and allows us to learn flexible and scalable diffusion models for both conditional and unconditional text generation. Through qualitative and quantitative evaluation, we show that our text diffusion models generate samples comparable to those produced by standard autoregressive language models, while in theory being more efficient on accelerator hardware at inference time. Our work paves the way for scaling up diffusion models for text, as has been done for autoregressive models, and for improving performance with recent refinements to continuous diffusion.
Comment: 15 pages
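
Below is a minimal sketch of the data path such an embedding-diffusion approach relies on, under illustrative assumptions (toy vocabulary, random embedding table, a simple linear noise schedule, and no learned denoiser): tokens are embedded, Gaussian noise is added as in a forward diffusion, and continuous vectors are rounded back to the nearest token embedding. The denoising network that would predict clean embeddings is deliberately omitted, so this is not the paper's model.

```python
# Illustrative sketch (not the paper's model): the data path of a continuous
# "embedding diffusion" for text -- tokens map to embeddings, Gaussian noise is
# added according to a schedule, and continuous vectors map back to tokens by
# nearest-neighbour rounding. The denoiser that would predict clean embeddings
# is omitted here and would be a learned network in practice.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "<pad>"]
dim = 64
E = rng.standard_normal((len(vocab), dim))      # toy embedding table (would be learned)
E /= np.linalg.norm(E, axis=1, keepdims=True)   # unit-norm embeddings

def embed(tokens):
    ids = np.array([vocab.index(t) for t in tokens])
    return E[ids]

def add_noise(x0, t, T=1000):
    """Forward diffusion q(x_t | x_0) with an assumed linear alpha-bar schedule."""
    alpha_bar = 1.0 - t / T
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1 - alpha_bar) * rng.standard_normal(x0.shape)

def round_to_tokens(x):
    """Map continuous vectors back to the nearest token embedding (inner product)."""
    ids = np.argmax(x @ E.T, axis=-1)
    return [vocab[i] for i in ids]

x0 = embed(["the", "cat", "sat"])
x_t = add_noise(x0, t=50)          # lightly noised embeddings
print(round_to_tokens(x_t))        # at this low noise level, rounding typically recovers the input
```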
Applications and Methods for Energy-based Models at Scale
Energy-Based Models (EBMs) are a class of generative models, alongside Variational Autoencoders, Normalizing Flows, and Autoregressive Models. It is a commonly held belief that generative models like these can help improve downstream discriminative machine learning applications. Generative models offer an avenue for learning the low-dimensional structure hidden within high-dimensional datasets, and they can be trained on unlabeled data, providing a pathway toward more label-efficient learning systems. Unfortunately, this promise has not been fully realized, as most classes of generative models perform poorly on discriminative applications.

EBMs parameterize probability distributions in a fundamentally different way than other generative models, which allows them to be more expressive and gives them greater architectural flexibility. We demonstrate that we can take advantage of this additional freedom to apply EBMs successfully to downstream discriminative tasks, notably improving performance over alternative classes of generative models and many other baselines. This freedom comes at a price, however: in practice, EBMs are notoriously difficult to train, scale, evaluate, and work with.

The remainder of the thesis covers my work to address these issues. In particular, we explore the use of alternative (non-KL) divergences for EBM training and evaluation. Next, we explore the use of generators to improve EBM training and reduce their dependency on MCMC sampling. Finally, we present a new approach to sampling from discrete distributions which enables recently developed methods for training EBMs on continuous data to be applied to discrete data as well.
Ph.D. thesis
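
As a hedged illustration of why EBM training leans on MCMC, the sketch below uses the standard maximum-likelihood gradient for an EBM (positive phase on data, negative phase on approximate model samples drawn with unadjusted Langevin dynamics) on a one-parameter toy energy. The energy, parameters, and schedule are assumptions for illustration only; they are not taken from the thesis.

```python
# Illustrative sketch (not from the thesis): the standard maximum-likelihood
# gradient for an EBM, which is what creates the dependency on MCMC sampling.
# The "model" is a toy energy E(x) = 0.5 * (x - mu)^2 with a single learnable
# parameter mu; a neural energy would be used in practice.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=2000)     # toy dataset

mu = 0.0                                             # EBM parameter
def energy(x, mu):        return 0.5 * (x - mu) ** 2
def d_energy_d_mu(x, mu): return -(x - mu)           # dE/dmu, used by the learning rule
def d_energy_d_x(x, mu):  return x - mu              # dE/dx, used by Langevin

def langevin_samples(mu, n=256, steps=50, step=0.1):
    """Draw approximate model samples with unadjusted Langevin dynamics."""
    x = rng.standard_normal(n)
    for _ in range(steps):
        x = x - step * d_energy_d_x(x, mu) + np.sqrt(2 * step) * rng.standard_normal(n)
    return x

lr = 0.1
for it in range(200):
    pos = rng.choice(data, size=256)                 # positive phase: data samples
    neg = langevin_samples(mu)                       # negative phase: model samples via MCMC
    # gradient of -log p: dE/dmu on data minus dE/dmu on model samples
    grad = d_energy_d_mu(pos, mu).mean() - d_energy_d_mu(neg, mu).mean()
    mu -= lr * grad

print("learned mu:", mu)                             # should approach the data mean (about 2.0)
```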