A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models
Score matching provides an effective approach to learning flexible
unnormalized models, but its scalability is limited by the need to evaluate a
second-order derivative. In this paper, we present a scalable approximation to
a general family of learning objectives including score matching, by observing
a new connection between these objectives and Wasserstein gradient flows. We
present applications with promise in learning neural density estimators on
manifolds, and training implicit variational and Wasserstein auto-encoders with
a manifold-valued prior.
Comment: AISTATS 2020
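
For reference, the second-order derivative in question arises from the
standard (Hyvärinen) score matching objective; this equation is textbook
background rather than a quotation from the paper:

\[
J_{\mathrm{SM}}(\theta) = \mathbb{E}_{p_{\mathrm{data}}(x)}\left[
\operatorname{tr}\big(\nabla_x^2 \log p_\theta(x)\big)
+ \tfrac{1}{2}\big\|\nabla_x \log p_\theta(x)\big\|^2 \right]
\]

The trace-of-Hessian term is what makes naive evaluation scale poorly with
dimension.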
How to Train Your Energy-Based Models
Energy-Based Models (EBMs), also known as non-normalized probabilistic
models, specify probability density or mass functions up to an unknown
normalizing constant. Unlike most other probabilistic models, EBMs do not place
a restriction on the tractability of the normalizing constant, and are thus more
flexible to parameterize and can model a more expressive family of probability
distributions. However, the unknown normalizing constant of EBMs makes training
particularly difficult. Our goal is to provide a friendly introduction to
modern approaches for EBM training. We start by explaining maximum likelihood
training with Markov chain Monte Carlo (MCMC), and proceed to elaborate on
MCMC-free approaches, including Score Matching (SM) and Noise Contrastive
Estimation (NCE). We highlight theoretical connections among these three
approaches, and end with a brief survey on alternative training methods, which
are still under active research. Our tutorial is targeted at an audience with
a basic understanding of generative models who want to apply EBMs or start a
research project in this direction.
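
As a concrete illustration of the MCMC-based maximum likelihood training the
tutorial opens with, here is a minimal PyTorch sketch, assuming the usual
energy parameterization p_theta(x) ∝ exp(-E_theta(x)); the network
architecture, Langevin hyperparameters, and toy data are illustrative
choices, not taken from the paper.

import torch

# Illustrative energy network E_theta: R^2 -> R (hypothetical architecture).
energy = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.SiLU(), torch.nn.Linear(64, 1)
)

def langevin_sample(x, steps=50, step_size=1e-2):
    """Approximate model samples via unadjusted Langevin dynamics on E_theta."""
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - 0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(x)
    return x.detach()

def ml_loss(x_data):
    """Surrogate loss whose theta-gradient is the contrastive ML update:
    E_data[grad E_theta] - E_model[grad E_theta]."""
    x_model = langevin_sample(torch.randn_like(x_data))
    return energy(x_data).mean() - energy(x_model).mean()

# One illustrative optimization step on toy 2-D data.
opt = torch.optim.Adam(energy.parameters(), lr=1e-3)
loss = ml_loss(torch.randn(128, 2))
opt.zero_grad(); loss.backward(); opt.step()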
Efficient Learning of Generative Models via Finite-Difference Score Matching
Several machine learning applications involve the optimization of
higher-order derivatives (e.g., gradients of gradients) during training, which
can be expensive with respect to memory and computation even with automatic
differentiation. As a typical example in generative modeling, score matching
(SM) involves the optimization of the trace of a Hessian. To improve computing
efficiency, we rewrite the SM objective and its variants in terms of
directional derivatives, and present a generic strategy to efficiently
approximate any-order directional derivative with finite difference (FD). Our
approximation involves only function evaluations, which can be executed in
parallel, and requires no gradient computations. Thus, it reduces the total
computational cost while also improving numerical stability. We provide two
instantiations by reformulating variants of SM objectives into the FD forms.
Empirically, we demonstrate that our methods produce results comparable to the
gradient-based counterparts while being much more computationally efficient.
Comment: NeurIPS 2020
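
The core finite-difference trick is easy to sketch; the following NumPy toy
(our own illustration, not the paper's implementation) recovers both
directional-derivative terms used in sliced score matching from three
function evaluations of an unnormalized log-density.

import numpy as np

def fd_directional(f, x, v, eps=1e-3):
    """Central finite-difference estimates of the first- and second-order
    directional derivatives of f at x along v (function evaluations only)."""
    f_plus, f_minus, f_0 = f(x + eps * v), f(x - eps * v), f(x)
    first = (f_plus - f_minus) / (2 * eps)          # ~ v . grad f(x)
    second = (f_plus + f_minus - 2 * f_0) / eps**2  # ~ v^T Hessian f(x) v
    return first, second

# Toy check: log of an unnormalized standard Gaussian, log p(x) = -||x||^2/2,
# so v . grad = -(v . x) and v^T H v = -||v||^2.
log_p = lambda x: -0.5 * np.dot(x, x)
x, v = np.array([1.0, -2.0]), np.array([0.5, 1.0])
print(fd_directional(log_p, x, v))  # approx. (1.5, -1.25)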