Exponential Family Estimation via Adversarial Dynamics Embedding
We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks. We exploit the primal-dual view of the MLE with a kinetics-augmented model to obtain an estimate associated with an adversarial dual sampler. To represent this sampler, we introduce a novel neural architecture, dynamics embedding, that generalizes Hamiltonian Monte Carlo (HMC). The proposed approach inherits the flexibility of HMC while enabling tractable entropy estimation for the augmented model. By learning the dual sampler and the primal model simultaneously, and sharing parameters between them, we obviate the need to design a separate sampling procedure once the model has been trained, leading to more effective learning. We show that many existing estimators, such as contrastive divergence, pseudo/composite-likelihood, score matching, the minimum Stein discrepancy estimator, non-local contrastive objectives, noise-contrastive estimation, and minimum probability flow, are special cases of the proposed approach, each expressed by a different (fixed) dual sampler. An empirical investigation shows that adapting the sampler during MLE can significantly improve on state-of-the-art estimators.
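For readers unfamiliar with the HMC dynamics that the abstract says dynamics embedding generalizes, the following is a minimal sketch of a standard HMC transition with a leapfrog integrator. It is not the paper's dynamics embedding or its adversarial training procedure; the function names (`leapfrog`, `hmc_step`), the unit-mass kinetic energy, and the callables `energy` and `grad_energy` are assumptions made for illustration.

```python
import numpy as np

def leapfrog(x, v, grad_energy, step_size, n_steps):
    """Leapfrog integrator for H(x, v) = E(x) + 0.5 * ||v||^2 (unit mass assumed)."""
    x = np.array(x, dtype=float)
    v = np.array(v, dtype=float)
    v = v - 0.5 * step_size * grad_energy(x)       # initial half step in momentum
    for i in range(n_steps):
        x = x + step_size * v                      # full step in position
        if i < n_steps - 1:
            v = v - step_size * grad_energy(x)     # full step in momentum
    v = v - 0.5 * step_size * grad_energy(x)       # final half step in momentum
    return x, v

def hmc_step(x, energy, grad_energy, step_size, n_steps, rng):
    """One HMC transition: resample momentum, integrate, Metropolis accept/reject."""
    x = np.asarray(x, dtype=float)
    v0 = rng.standard_normal(x.shape)
    x_new, v_new = leapfrog(x, v0, grad_energy, step_size, n_steps)
    h_old = energy(x) + 0.5 * np.dot(v0, v0)
    h_new = energy(x_new) + 0.5 * np.dot(v_new, v_new)
    # Accept with probability min(1, exp(h_old - h_new)).
    if rng.random() < np.exp(min(0.0, h_old - h_new)):
        return x_new
    return x

# Hypothetical usage: sample from a standard Gaussian, E(x) = 0.5 * ||x||^2.
energy = lambda x: 0.5 * np.dot(x, x)
grad_energy = lambda x: x
rng = np.random.default_rng(0)
sample = hmc_step(np.zeros(2), energy, grad_energy, step_size=0.1, n_steps=10, rng=rng)
```

In this sketch the step size and number of leapfrog steps are fixed by hand; the abstract's point is that the sampler is instead learned jointly with the model.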
Self-Adversarially Learned Bayesian Sampling
Scalable Bayesian sampling plays an important role in modern machine
learning, especially in fast-developing unsupervised (deep) learning models.
While tremendous progress has been achieved via scalable Bayesian sampling
methods such as stochastic gradient MCMC (SG-MCMC) and Stein variational gradient
descent (SVGD), the generated samples are typically highly correlated.
Moreover, their sample-generation processes are often criticized as
inefficient. In this paper, we propose a novel self-adversarial learning
framework that automatically learns a conditional generator to mimic the
behavior of a Markov kernel (transition kernel). High-quality samples can be
efficiently generated by direct forward passes through a learned generator. Most
importantly, the learning process adopts a self-learning paradigm, requiring no
information on existing Markov kernels, e.g., knowledge of how to draw samples
from them. Specifically, our framework learns to use current samples, either
from the generator or pre-provided training data, to update the generator such
that the generated samples progressively approach a target distribution; hence
the name self-learning. Experiments on both synthetic and real datasets
verify advantages of our framework, outperforming related methods in terms of
both sampling efficiency and sample quality.
Comment: AAAI 201
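As background for the SVGD baseline named in the abstract, here is a minimal sketch of one standard SVGD particle update with an RBF kernel. It illustrates the existing method the abstract compares against, not the proposed self-adversarial generator; the function name `svgd_update`, the fixed `bandwidth`, and the callable `grad_log_p` are assumptions for illustration.

```python
import numpy as np

def svgd_update(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """One SVGD step with RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    n = particles.shape[0]
    diffs = particles[:, None, :] - particles[None, :, :]   # diffs[i, j] = x_i - x_j
    sq_dists = np.sum(diffs ** 2, axis=-1)                   # shape (n, n)
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))           # kernel matrix, shape (n, n)
    scores = grad_log_p(particles)                            # shape (n, d)
    # Attractive term: sum_j k(x_j, x_i) * grad log p(x_j)
    attract = k @ scores
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) = sum_j k_ij * (x_i - x_j) / bandwidth^2
    repulse = np.sum(k[:, :, None] * diffs, axis=1) / bandwidth ** 2
    return particles + step_size * (attract + repulse) / n

# Hypothetical usage: move particles toward a standard Gaussian target.
rng = np.random.default_rng(0)
particles = rng.standard_normal((50, 1)) + 5.0
for _ in range(200):
    particles = svgd_update(particles, lambda x: -x)
```

Every particle's update depends on all the others through the kernel, which is one way to see why samples produced this way are not independent.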