1,441 research outputs found
Marginal Weighted Maximum Log-likelihood for Efficient Learning of Perturb-and-Map models
International audienceWe consider the structured-output prediction problem through probabilistic approaches and generalize the “perturb-and-MAP” framework to more challenging weighted Hamming losses, which are crucial in applications. While in principle our approach is a straightforward marginalization, it requires solving many related MAP inference problems. We show that for log-supermodular pairwise models these operations can be performed efficiently using the machinery of dynamic graph cuts. We also propose to use double stochastic gradient descent, both on the data and on the perturbations, for efficient learning. Our framework can naturally take weak supervision (e.g., partial labels) into account. We conduct a set of experiments on medium-scale character recognition and image segmentation, showing the benefits of our algorithms
On Sampling from the Gibbs Distribution with Random Maximum A-Posteriori Perturbations
In this paper we describe how MAP inference can be used to sample efficiently
from Gibbs distributions. Specifically, we provide means for drawing either
approximate or unbiased samples from Gibbs' distributions by introducing low
dimensional perturbations and solving the corresponding MAP assignments. Our
approach also leads to new ways to derive lower bounds on partition functions.
We demonstrate empirically that our method excels in the typical "high signal -
high coupling" regime. The setting results in ragged energy landscapes that are
challenging for alternative approaches to sampling and/or lower bounds
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It
generates a sequence of model states and parameters by alternating parameter
perturbations with state maximizations, where the sequence of states can be
interpreted as "samples" from an associated MRF model. Herding differs from
maximum likelihood estimation in that the sequence of parameters does not
converge to a fixed point and differs from an MCMC posterior sampling approach
in that the sequence of states is generated deterministically. Herding may be
interpreted as a"perturb and map" method where the parameter perturbations are
generated using a deterministic nonlinear dynamical system rather than randomly
from a Gumbel distribution. This chapter studies the distinct statistical
characteristics of the herding algorithm and shows that the fast convergence
rate of the controlled moments may be attributed to edge of chaos dynamics. The
herding algorithm can also be generalized to models with latent variables and
to a discriminative learning setting. The perceptron cycling theorem ensures
that the fast moment matching property is preserved in the more general
framework
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions
Combining discrete probability distributions and combinatorial optimization
problems with neural network components has numerous applications but poses
several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE),
a framework for end-to-end learning of models combining discrete exponential
family distributions and differentiable neural components. I-MLE is widely
applicable as it only requires the ability to compute the most probable states
and does not rely on smooth relaxations. The framework encompasses several
approaches such as perturbation-based implicit differentiation and recent
methods to differentiate through black-box combinatorial solvers. We introduce
a novel class of noise distributions for approximating marginals via
perturb-and-MAP. Moreover, we show that I-MLE simplifies to maximum likelihood
estimation when used in some recently studied learning settings that involve
combinatorial solvers. Experiments on several datasets suggest that I-MLE is
competitive with and often outperforms existing approaches which rely on
problem-specific relaxations.Comment: NeurIPS 2021 camera-ready; repo:
https://github.com/nec-research/tf-iml
- …