6 research outputs found
Pattern reconstruction with restricted Boltzmann machines
Restricted Boltzmann machines are energy models made of a visible and a
hidden layer. We identify an effective energy function describing the
zero-temperature landscape on the visible units and depending only on the tail
behaviour of the hidden layer prior distribution. Studying the location of the
local minima of such an energy function, we show that the ability of a
restricted Boltzmann machine to reconstruct a random pattern depends indeed
only on the tail of the hidden prior distribution. We find that hidden priors
with strictly super-Gaussian tails give only a logarithmic loss in pattern
retrieval, while an efficient retrieval is much harder with hidden units with
strictly sub-Gaussian tails; if the hidden prior has Gaussian tails, the
retrieval capability is determined by the number of hidden units (as in the
Hopfield model).
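The Gaussian case mentioned above can be illustrated with a small sketch. It assumes ±1 visible units and a standard Gaussian prior on the hidden units, for which integrating out the hidden layer gives the familiar Hopfield-type quadratic energy E(v) = -1/2 Σ_μ (Σ_i W_μi v_i)^2. The effective energy the abstract describes for other hidden-prior tails is not reproduced here, and the function names, the weight scaling, and the demo sizes are illustrative assumptions rather than the authors' setup.

```python
import random


def effective_energy(W, v):
    """E(v) = -1/2 * sum_mu (W[mu] . v)^2, assuming a Gaussian hidden prior."""
    return -0.5 * sum(sum(w_i * v_i for w_i, v_i in zip(row, v)) ** 2 for row in W)


def zero_temperature_descent(W, v, max_sweeps=20):
    """Asynchronous sign updates on the visible units.

    With the self-coupling removed from the local field, no single update
    can increase effective_energy.
    """
    v = list(v)
    for _ in range(max_sweeps):
        changed = False
        # Hidden pre-activations m[mu] = W[mu] . v, updated incrementally below.
        m = [sum(w_i * v_i for w_i, v_i in zip(row, v)) for row in W]
        for i in range(len(v)):
            # Local field on unit i, excluding its own contribution.
            field = sum(row[i] * (m_mu - row[i] * v[i]) for row, m_mu in zip(W, m))
            new = 1 if field >= 0 else -1
            if new != v[i]:
                for mu, row in enumerate(W):
                    m[mu] += row[i] * (new - v[i])
                v[i] = new
                changed = True
        if not changed:
            break
    return v


def demo(n_visible=100, n_hidden=3, n_flips=10, seed=0):
    """Store random patterns as weight rows, corrupt one, and try to recover it."""
    rng = random.Random(seed)
    patterns = [[rng.choice([-1, 1]) for _ in range(n_visible)] for _ in range(n_hidden)]
    # Hypothetical choice: one hidden unit per stored pattern, scaled by 1/sqrt(N).
    W = [[x / n_visible ** 0.5 for x in p] for p in patterns]
    corrupted = list(patterns[0])
    for i in rng.sample(range(n_visible), n_flips):
        corrupted[i] = -corrupted[i]
    recovered = zero_temperature_descent(W, corrupted)
    overlap = sum(a * b for a, b in zip(recovered, patterns[0])) / n_visible
    return overlap, effective_energy(W, corrupted), effective_energy(W, recovered)
```

Calling `demo()` returns the overlap between the recovered and the original pattern together with the energy before and after the descent. Raising `n_hidden` relative to `n_visible` is one way to probe how retrieval degrades with the number of hidden units in this Gaussian setting.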
Learning Exponential Family Graphical Models with Latent Variables using Regularized Conditional Likelihood
Fitting a graphical model to a collection of random variables given sample observations is a challenging task if the observed variables are influenced by latent variables, which can induce significant confounding statistical dependencies among the observed variables. We present a new convex relaxation framework based on regularized conditional likelihood for latent-variable graphical modeling in which the conditional distribution of the observed variables conditioned on the latent variables is given by an exponential family graphical model. In comparison to previously proposed tractable methods that proceed by characterizing the marginal distribution of the observed variables, our approach is applicable in a broader range of settings as it does not require knowledge about the specific form of distribution of the latent variables and it can be specialized to yield tractable approaches to problems in which the observed data are not well-modeled as Gaussian. We demonstrate the utility and flexibility of our framework via a series of numerical experiments on synthetic as well as real data.
A Unified Approach to Learning Ising Models: Beyond Independence and Bounded Width
We revisit the problem of efficiently learning the underlying parameters of
Ising models from data. Current algorithmic approaches achieve essentially
optimal sample complexity when given i.i.d. samples from the stationary measure
and the underlying model satisfies "width" bounds on the total
interaction involving each node. We show that a simple existing approach based
on node-wise logistic regression provably succeeds at recovering the underlying
model in several new settings where these assumptions are violated:
(1) Given dynamically generated data from a wide variety of local Markov
chains, like block or round-robin dynamics, logistic regression recovers the
parameters with optimal sample complexity up to factors. This
generalizes the specialized algorithm of Bresler, Gamarnik, and Shah [IEEE
Trans. Inf. Theory'18] for structure recovery in bounded degree graphs from
Glauber dynamics.
(2) For the Sherrington-Kirkpatrick model of spin glasses, given
independent samples, logistic regression recovers the
parameters in most of the known high-temperature regime via a simple reduction
to weaker structural properties of the measure. This improves on recent work of
Anari, Jain, Koehler, Pham, and Vuong [ArXiv'23] which gives distribution
learning at higher temperature.
(3) As a simple byproduct of our techniques, logistic regression achieves an
exponential improvement in learning from samples in the M-regime of data
considered by Dutt, Lokhov, Vuffray, and Misra [ICML'21] as well as novel
guarantees for learning from the adversarial Glauber dynamics of Chin, Moitra,
Mossel, and Sandon [ArXiv'23].
Our approach thus significantly generalizes the elegant analysis of Wu,
Sanghavi, and Dimakis [Neurips'19] without any algorithmic modification.
Comment: 51 pages
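The node-wise logistic regression idea referred to in this abstract can be sketched in a few lines. The sketch below assumes a zero-field Ising model P(x) ∝ exp(Σ_{i<j} J_ij x_i x_j) on ±1 spins, for which P(x_i = +1 | rest) = sigmoid(2 Σ_j J_ij x_j), and it fits each node by plain unregularized gradient descent on i.i.d. samples. It is not the constrained estimator analyzed in the cited works, it does not cover the dynamical or adversarial data settings, and the function names, step size, and exact-enumeration sampler (feasible only for very small n) are illustrative assumptions.

```python
import itertools
import math
import random
from collections import Counter


def sample_ising(J, num_samples, seed=0):
    """Draw exact i.i.d. samples from P(x) ∝ exp(sum_{i<j} J[i][j] x_i x_j).

    Enumerates all 2^n states, so this is only usable for tiny n.
    """
    n = len(J)
    states = list(itertools.product([-1, 1], repeat=n))
    weights = [
        math.exp(sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n)))
        for x in states
    ]
    return random.Random(seed).choices(states, weights=weights, k=num_samples)


def fit_node(counts, i, n, steps=500, lr=0.2):
    """Logistic regression of spin i on all other spins.

    The loss is the mean of log(1 + exp(-2 x_i * sum_j w[j] x_j)), so the
    fitted weight w[j] directly estimates the coupling J[i][j].
    """
    others = [j for j in range(n) if j != i]
    w = {j: 0.0 for j in others}
    total = sum(counts.values())
    for _ in range(steps):
        grad = {j: 0.0 for j in others}
        for x, c in counts.items():
            margin = 2.0 * x[i] * sum(w[j] * x[j] for j in others)
            s = 1.0 / (1.0 + math.exp(margin))  # sigmoid(-margin)
            for j in others:
                grad[j] += c * (-2.0 * x[i] * x[j] * s)
        for j in others:
            w[j] -= lr * grad[j] / total
    return w


def fit_couplings(samples):
    """Fit every node separately, then average the two estimates of each J[i][j]."""
    n = len(samples[0])
    counts = Counter(samples)  # identical samples share one gradient term
    per_node = [fit_node(counts, i, n) for i in range(n)]
    J_hat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                J_hat[i][j] = 0.5 * (per_node[i][j] + per_node[j][i])
    return J_hat
```

For example, `fit_couplings(sample_ising(J, 4000))` with a small symmetric coupling matrix `J` returns a symmetric matrix of estimates that can be compared entry by entry with `J`. Averaging the two per-node estimates is one simple symmetrization choice; taking either estimate alone would also be a valid variant.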