17 research outputs found

    Denoising Diffusion Samplers

    Full text link
    Denoising diffusion models are a popular class of generative models providing state-of-the-art results in many domains. One gradually adds noise to the data using a diffusion process, transforming the data distribution into a Gaussian distribution. Samples from the generative model are then obtained by simulating an approximation of the time-reversal of this diffusion, initialized with Gaussian samples. In practice, the intractable score terms appearing in the time-reversed process are approximated using score matching techniques. Here we explore a similar idea to sample approximately from unnormalized probability density functions and estimate their normalizing constants. We consider a process where the target density diffuses towards a Gaussian. Denoising Diffusion Samplers (DDS) are obtained by approximating the corresponding time-reversal. While score matching is not applicable in this context, we can leverage many of the ideas introduced in generative modeling for Monte Carlo sampling. Existing theoretical results for denoising diffusion models also provide theoretical guarantees for DDS. We discuss the connections between DDS, optimal control, and Schrödinger bridges, and finally demonstrate DDS experimentally on a variety of challenging sampling tasks. Comment: In The Eleventh International Conference on Learning Representations, 2023
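
    To make the reverse-time mechanics concrete, the following is a minimal numerical sketch, not taken from the paper: it uses a 1-D Gaussian target so that the score of the diffused marginal is available in closed form and can stand in for the approximation DDS would learn; the target, noise schedule, step counts, and all names are illustrative assumptions.

        # Sketch of a denoising-diffusion-style sampler on a 1-D Gaussian target.
        # The closed-form score below stands in for the approximation DDS learns.
        import numpy as np

        rng = np.random.default_rng(0)
        mu0, sigma0 = 3.0, 0.5                 # assumed target N(mu0, sigma0^2)
        beta, T, n_steps, n_samples = 1.0, 5.0, 500, 10_000
        dt = T / n_steps

        def score_pt(x, t):
            # Score of the forward OU marginal p_t when the target is Gaussian:
            # X_t | X_0 ~ N(X_0 * exp(-beta*t/2), 1 - exp(-beta*t)).
            m_t = mu0 * np.exp(-0.5 * beta * t)
            v_t = sigma0**2 * np.exp(-beta * t) + (1.0 - np.exp(-beta * t))
            return -(x - m_t) / v_t

        # Start from the reference Gaussian and integrate the time-reversed SDE
        # with Euler-Maruyama steps.
        x = rng.standard_normal(n_samples)
        for k in range(n_steps):
            t = T - k * dt                     # forward time of the current state
            drift = 0.5 * beta * x + beta * score_pt(x, t)
            x = x + drift * dt + np.sqrt(beta * dt) * rng.standard_normal(n_samples)

        print(f"sample mean {x.mean():.3f} (target {mu0}), std {x.std():.3f} (target {sigma0})")

    In DDS itself the score term is intractable and must be approximated, which is exactly the role the closed-form function plays in this toy setting.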

    Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

    Full text link
    Since their introduction, diffusion models have quickly become the prevailing approach to generative modeling in many domains. They can be interpreted as learning the gradients of a time-varying sequence of log-probability density functions. This interpretation has motivated classifier-based and classifier-free guidance as methods for post-hoc control of diffusion models. In this work, we build upon these ideas using the score-based interpretation of diffusion models, and explore alternative ways to condition, modify, and reuse diffusion models for tasks involving compositional generation and guidance. In particular, we investigate why certain types of composition fail using current techniques and present a number of solutions. We conclude that the sampler (not the model) is responsible for this failure and propose new samplers, inspired by MCMC, which enable successful compositional generation. Further, we propose an energy-based parameterization of diffusion models which enables the use of new compositional operators and more sophisticated, Metropolis-corrected samplers. Intriguingly, we find these samplers lead to notable improvements in compositional generation across a wide set of problems, such as classifier-guided ImageNet modeling and compositional text-to-image generation. Comment: ICML 2023, Project Webpage: https://energy-based-model.github.io/reduce-reuse-recycle
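
    As a loose illustration of the kind of composition and Metropolis correction described above (not the paper's code), the sketch below sums two hand-written toy energies, which corresponds to a product of the two densities, and samples the result with a Metropolis-adjusted Langevin corrector; the energies, step sizes, and chain counts are all assumptions.

        # Compose two toy energies (product of experts = sum of energies) and
        # sample the composition with Metropolis-adjusted Langevin (MALA).
        import numpy as np

        rng = np.random.default_rng(0)

        def energy_a(x):      # toy "concept A": Gaussian centred at -2
            return 0.5 * np.sum((x + 2.0) ** 2, axis=-1)

        def energy_b(x):      # toy "concept B": Gaussian centred at +1
            return 0.5 * np.sum((x - 1.0) ** 2, axis=-1)

        def energy(x):        # composed model: exp(-E_a) * exp(-E_b)
            return energy_a(x) + energy_b(x)

        def grad_energy(x, eps=1e-4):
            # Finite-difference gradient keeps the sketch model-agnostic.
            g = np.zeros_like(x)
            for i in range(x.shape[-1]):
                dx = np.zeros_like(x)
                dx[..., i] = eps
                g[..., i] = (energy(x + dx) - energy(x - dx)) / (2 * eps)
            return g

        def mala_step(x, step=0.1):
            # One Metropolis-adjusted Langevin step targeting exp(-energy).
            gx = grad_energy(x)
            prop = x - step * gx + np.sqrt(2 * step) * rng.standard_normal(x.shape)
            gp = grad_energy(prop)
            log_fwd = -np.sum((prop - (x - step * gx)) ** 2, axis=-1) / (4 * step)
            log_bwd = -np.sum((x - (prop - step * gp)) ** 2, axis=-1) / (4 * step)
            log_alpha = energy(x) - energy(prop) + log_bwd - log_fwd
            accept = np.log(rng.uniform(size=log_alpha.shape)) < log_alpha
            return np.where(accept[..., None], prop, x)

        x = rng.standard_normal((1000, 1))     # 1000 chains in one dimension
        for _ in range(500):
            x = mala_step(x)
        print(f"composed mean {x.mean():.2f} (the product of the two Gaussians is centred at -0.5)")

    The Metropolis correction is only available because the composition is expressed through explicit energies rather than scores alone, which is the point of the energy-based parameterization mentioned in the abstract.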

    Self-conditioned Embedding Diffusion for Text Generation

    Full text link
    Can continuous diffusion models bring to natural language the same performance breakthrough they delivered for image generation? To circumvent the discrete nature of text data, we can simply project tokens into a continuous space of embeddings, as is standard in language modeling. We propose Self-conditioned Embedding Diffusion, a continuous diffusion mechanism that operates on token embeddings and makes it possible to learn flexible and scalable diffusion models for both conditional and unconditional text generation. Through qualitative and quantitative evaluation, we show that our text diffusion models generate samples comparable to those produced by standard autoregressive language models, while being, in theory, more efficient on accelerator hardware at inference time. Our work paves the way for scaling up diffusion models for text, similarly to autoregressive models, and for improving performance with recent refinements to continuous diffusion. Comment: 15 pages
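
    The following structural sketch (placeholder code, not the paper's architecture) shows how a self-conditioned sampling loop over token embeddings can be organized: the denoiser receives the noisy embeddings, the timestep, and its own previous clean-embedding estimate, and the final embeddings are rounded back to tokens by nearest neighbour; the toy denoiser, schedule, and dimensions are assumptions.

        # Skeleton of self-conditioned diffusion sampling over token embeddings.
        import numpy as np

        rng = np.random.default_rng(0)
        vocab, dim, seq_len, n_steps = 100, 16, 8, 50
        embedding = rng.standard_normal((vocab, dim)) / np.sqrt(dim)   # token embedding table

        def alpha_bar(t):
            # Simple cosine-style noise schedule on t in [0, 1].
            return np.cos(0.5 * np.pi * t) ** 2

        def denoiser(z_t, t, x0_prev):
            # Placeholder for a trained network that predicts clean embeddings from
            # (noisy embeddings, timestep, previous clean-embedding estimate).
            # It just blends its inputs so the loop runs end to end.
            return 0.5 * z_t + 0.5 * x0_prev

        def sample():
            z = rng.standard_normal((seq_len, dim))       # start from pure noise
            x0_prev = np.zeros((seq_len, dim))            # self-conditioning starts at zero
            for i in reversed(range(1, n_steps + 1)):
                t, t_prev = i / n_steps, (i - 1) / n_steps
                x0_hat = denoiser(z, t, x0_prev)          # predict clean embeddings
                eps_hat = (z - np.sqrt(alpha_bar(t)) * x0_hat) / np.sqrt(1 - alpha_bar(t))
                z = np.sqrt(alpha_bar(t_prev)) * x0_hat + np.sqrt(1 - alpha_bar(t_prev)) * eps_hat
                x0_prev = x0_hat                          # feed the estimate back in
            dists = ((z[:, None, :] - embedding[None, :, :]) ** 2).sum(-1)
            return dists.argmin(axis=1)                   # nearest-neighbour rounding to tokens

        print(sample())   # a sequence of token ids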

    Applications and Methods for Energy-based Models at Scale

    No full text
    Energy-Based Models (EBMs) are a class of generative models, alongside Variational Autoencoders, Normalizing Flows, and Autoregressive Models. It is a commonly held belief that generative models like these can help to improve downstream discriminative machine learning applications. Generative models offer a way to learn the low-dimensional structure hidden within high-dimensional datasets, and they can be trained on unlabeled data, providing a pathway toward more label-efficient learning systems. Unfortunately, this dream has not been fully realized, as most classes of generative models perform poorly at discriminative applications. EBMs parameterize probability distributions in a fundamentally different way than other generative models, which allows them to be more expressive and architecturally flexible. We demonstrate that we can take advantage of this additional freedom to apply EBMs successfully to downstream discriminative tasks, notably improving performance over alternative classes of generative models and many other baselines. Unfortunately, this freedom comes at a price, and in practice EBMs are notoriously difficult to work with, train, scale, and evaluate. The remainder of the thesis covers my work to address these issues. In particular, we explore the use of alternative (non-KL) divergences for EBM training and evaluation. Next, we explore the use of generators to improve EBM training and reduce its dependence on MCMC sampling. Finally, we present a new approach to sampling from discrete distributions which enables recently developed methods for training EBMs on continuous data to be applied to discrete data as well. Ph.D. thesis
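
    To make the MCMC dependence mentioned above concrete, here is a toy sketch (not drawn from the thesis): a tiny 1-D energy-based model is fit by maximum likelihood, with persistent Langevin chains supplying the negative samples; the data, model family, and hyperparameters are assumptions.

        # Fit E_theta(x) = -(theta1*x + theta2*x^2) by maximum likelihood,
        # using persistent Langevin MCMC chains for the negative phase.
        import numpy as np

        rng = np.random.default_rng(0)
        data = rng.normal(1.0, 1.0, size=5000)     # toy data; the MLE is theta = (1.0, -0.5)
        theta = np.array([0.0, -0.5])              # start at a standard normal model

        def grad_x_energy(x, theta):
            # d/dx of E(x) = -(theta1*x + theta2*x^2)
            return -(theta[0] + 2.0 * theta[1] * x)

        chains = rng.standard_normal(256)          # persistent negative-sample chains
        lr, langevin_step = 0.01, 0.05

        for it in range(2000):
            # Refresh negative samples with a few Langevin steps under the current model.
            for _ in range(10):
                noise = rng.standard_normal(chains.shape)
                chains = (chains - langevin_step * grad_x_energy(chains, theta)
                          + np.sqrt(2 * langevin_step) * noise)
            # Log-likelihood gradient: E_data[phi(x)] - E_model[phi(x)] with phi = (x, x^2).
            pos = np.array([data.mean(), (data ** 2).mean()])
            neg = np.array([chains.mean(), (chains ** 2).mean()])
            theta += lr * (pos - neg)
            theta[1] = min(theta[1], -0.05)        # keep the density normalizable

        print("learned theta:", theta, "(MLE is [1.0, -0.5])")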