Approximate Message Passing with Restricted Boltzmann Machine Priors
Approximate Message Passing (AMP) has been shown to be an excellent
statistical approach to signal inference and compressed sensing problems. The
AMP framework provides modularity in the choice of signal prior; here we
propose a hierarchical form of the Gauss-Bernoulli prior which utilizes a
Restricted Boltzmann Machine (RBM) trained on the signal support to push
reconstruction performance beyond that of simple iid priors for signals whose
support can be well represented by a trained binary RBM. We present and analyze
two methods of RBM factorization and demonstrate how these affect signal
reconstruction performance within our proposed algorithm. Finally, using the
MNIST handwritten digit dataset, we show experimentally that using an RBM
allows AMP to approach oracle-support performance.
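For context, here is a minimal sketch of the plain AMP loop that such an algorithm builds on, written for a factorized Gauss-Bernoulli prior; in the proposed method the factorized denoiser would be replaced by marginals informed by the trained RBM. Function names, initializations, and parameter values are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gb_denoiser(r, tau2, rho, v):
    """Posterior mean and variance of x under the spike-and-slab prior
    p(x) = (1 - rho)*delta(x) + rho*N(0, v), given pseudo-data
    r = x + N(0, tau2). The 1/sqrt(2*pi) factors cancel in the ratio."""
    slab = rho * np.exp(-r**2 / (2 * (v + tau2))) / np.sqrt(v + tau2)
    spike = (1 - rho) * np.exp(-r**2 / (2 * tau2)) / np.sqrt(tau2)
    pi = slab / (slab + spike)               # posterior support probability
    m, s2 = r * v / (v + tau2), v * tau2 / (v + tau2)
    mean = pi * m
    var = pi * (m**2 + s2) - mean**2
    return mean, var

def amp(y, A, rho=0.1, v=1.0, sigma2=1e-2, n_iter=30):
    """Baseline AMP for y = A x + noise, with A i.i.d. of variance 1/M."""
    M, N = A.shape
    x, var = np.zeros(N), np.full(N, rho * v)
    z = y.copy()
    tau2 = sigma2 + (N / M) * var.mean()     # pseudo-noise variance
    for _ in range(n_iter):
        r = x + A.T @ z                      # effective Gaussian channel
        x, var = gb_denoiser(r, tau2, rho, v)
        # Onsager correction keeps the pseudo-noise asymptotically Gaussian.
        z = y - A @ x + z * (N / M) * var.mean() / tau2
        tau2 = sigma2 + (N / M) * var.mean()
    return x
```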
Neural Networks retrieving Boolean patterns in a sea of Gaussian ones
Restricted Boltzmann Machines are key tools in Machine Learning and are
described by the energy function of bipartite spin-glasses. From a statistical
mechanical perspective, they share the same Gibbs measure of Hopfield networks
for associative memory. In this equivalence, the weights of the former play the
role of the patterns of the latter. Since Boltzmann machines usually require
real-valued weights to be trained with gradient-descent-like methods, while
Hopfield networks typically store binary patterns in order to retrieve them,
the investigation of a
mixed Hebbian network, equipped with both real (e.g., Gaussian) and discrete
(e.g., Boolean) patterns naturally arises. We prove that, in the challenging
regime of a high storage of real patterns, where retrieval is forbidden, an
extra load of Boolean patterns can still be retrieved, as long as the ratio
between the overall load and the network size does not exceed a critical
threshold, which turns out to coincide with that of the standard
Amit-Gutfreund-Sompolinsky theory. Assuming replica symmetry, we study the case
of a low load of Boolean patterns by combining the stochastic-stability and
Hamilton-Jacobi interpolation techniques. The result can be extended to the
high-load regime by a non-rigorous but standard replica computation.
Comment: 16 pages, 1 figure
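For orientation, the mixed Hebbian setup described above can be sketched (in our notation, which is an assumption rather than a quotation from the paper) as a Hopfield-type Hamiltonian storing both pattern families on the same couplings:

```latex
% N spins, P Gaussian patterns xi (extensive load), K Boolean patterns eta
H_N(\sigma) \;=\; -\frac{1}{2N} \sum_{i,j=1}^{N}
  \Big( \sum_{\mu=1}^{P} \xi_i^{\mu}\xi_j^{\mu}
      + \sum_{\nu=1}^{K} \eta_i^{\nu}\eta_j^{\nu} \Big)\, \sigma_i \sigma_j ,
\qquad \xi_i^{\mu} \sim \mathcal{N}(0,1), \quad \eta_i^{\nu} \in \{-1,+1\}.
```

The statement is then that, even when P is so large that the Gaussian patterns cannot be retrieved, the Boolean patterns remain retrievable as long as the total load (P + K)/N stays below the AGS critical threshold.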
Free energies of Boltzmann Machines: self-averaging, annealed and replica symmetric approximations in the thermodynamic limit
Restricted Boltzmann machines (RBMs) constitute one of the main models for
statistical inference in machine learning and are widely employed in Artificial
Intelligence as powerful tools for (deep) learning. However, in contrast with
countless remarkable practical successes, their mathematical formalization has
been largely elusive: from a statistical-mechanics perspective these systems
display the same (random) Gibbs measure of bi-partite spin-glasses, whose
rigorous treatment is notoriously difficult. In this work, beyond providing a
brief review on RBMs from both the learning and the retrieval perspectives, we
aim to contribute to their analytical investigation, by considering two
distinct realizations of their weights (i.e., Boolean and Gaussian) and
studying the properties of their related free energies. More precisely,
focusing on an RBM characterized by digital couplings, we first extend the
Pastur-Shcherbina-Tirozzi method (originally developed for the Hopfield model)
to prove the self-averaging property for the free energy, over its quenched
expectation, in the infinite volume limit, then we explicitly calculate its
simplest approximation, namely its annealed bound. Next, focusing on an RBM
characterized by analogical weights, we extend Guerra's interpolating scheme to
obtain a control of the quenched free-energy under the assumption of replica
symmetry: we get self-consistencies for the order parameters (in full agreement
with the existing literature) as well as the critical line for ergodicity
breaking, which turns out to be the same as that obtained in AGS theory. As we discuss,
this analogy stems from the slow-noise universality. Finally, glancing beyond
replica symmetry, we analyze the fluctuations of the overlaps for an estimate
of the (slow) noise affecting the retrieval of the signal, and by a stability
analysis we recover the Aizenman-Contucci identities typical of glassy systems.
Comment: 21 pages, 1 figure
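As a reference point (our notation and normalization are assumptions, not copied from the paper), the bipartite spin-glass Hamiltonian and the quenched free energy under study take the form:

```latex
% N visible spins sigma, M hidden spins tau, weights W Boolean or Gaussian
H_{N,M}(\sigma,\tau) \;=\; -\frac{1}{\sqrt{N}} \sum_{i=1}^{N}\sum_{\mu=1}^{M}
  W_{i\mu}\, \sigma_i \tau_{\mu} ,
\qquad
f(\beta) \;=\; -\lim_{N\to\infty} \frac{1}{\beta N}\,
  \mathbb{E}_{W} \log \sum_{\sigma,\tau} e^{-\beta H_{N,M}(\sigma,\tau)} ,
```

with M/N kept finite; self-averaging means the random log-partition function concentrates around this quenched expectation in the infinite-volume limit.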
A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines
Restricted Boltzmann machines (RBMs) are energy-based neural networks which
are commonly used as the building blocks of deep neural
architectures. In this work, we derive a deterministic framework for the
training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer
(TAP) mean-field approximation of widely-connected systems with weak
interactions coming from spin-glass theory. While the TAP approach has been
extensively studied for fully-visible binary spin systems, our construction is
generalized to latent-variable models, as well as to arbitrarily distributed
real-valued spin systems with bounded support. In our numerical experiments, we
demonstrate the effective deterministic training of our proposed models and are
able to show interesting features of unsupervised learning which could not be
directly observed with sampling. Additionally, we demonstrate how to utilize
our TAP-based framework for leveraging trained RBMs as joint priors in
denoising problems.
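To make the construction concrete, here is a minimal sketch of the Onsager-corrected mean-field fixed point on which such a TAP framework rests, written for a binary {0,1} RBM; the update schedule, damping, and all names are our illustrative choices, not the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tap_magnetizations(W, a, b, n_iter=100, damping=0.5):
    """Iterate the TAP self-consistency equations for a binary {0,1} RBM
    with weights W (n_vis x n_hid), visible biases a, hidden biases b.
    Each field is the naive mean-field term plus the Onsager reaction
    term built from the second moment of the couplings."""
    n_vis, n_hid = W.shape
    mv, mh = np.full(n_vis, 0.5), np.full(n_hid, 0.5)
    W2 = W**2
    for _ in range(n_iter):
        field_h = b + W.T @ mv - (mh - 0.5) * (W2.T @ (mv * (1 - mv)))
        mh = damping * mh + (1 - damping) * sigmoid(field_h)
        field_v = a + W @ mh - (mv - 0.5) * (W2 @ (mh * (1 - mh)))
        mv = damping * mv + (1 - damping) * sigmoid(field_v)
    return mv, mh
```

These magnetizations can then be used in place of Monte Carlo estimates of the model statistics during training and evaluation.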
Boosting Monte Carlo simulations of spin glasses using autoregressive neural networks
Autoregressive neural networks are emerging as a powerful computational
tool to solve relevant problems in classical and quantum mechanics. One of
their appealing functionalities is that, after they have learned a probability
distribution from a dataset, they allow exact and efficient sampling of typical
system configurations. Here we employ a neural autoregressive distribution
estimator (NADE) to boost Markov chain Monte Carlo (MCMC) simulations of a
paradigmatic classical model of spin-glass theory, namely the two-dimensional
Edwards-Anderson Hamiltonian. We show that a NADE can be trained to accurately
mimic the Boltzmann distribution using unsupervised learning from system
configurations generated using standard MCMC algorithms. The trained NADE is
then employed as a smart proposal distribution for the Metropolis-Hastings
algorithm. This allows us to perform efficient MCMC simulations, which provide
unbiased results even when the probability distribution learned by the NADE
deviates from the exact Boltzmann distribution. Notably, we implement a
sequential tempering procedure, whereby a NADE trained at a higher temperature
is iteratively employed as the proposal distribution in an MCMC simulation run at a
slightly lower temperature. This allows one to efficiently simulate the
spin-glass model even in the low-temperature regime, avoiding the divergent
correlation times that plague MCMC simulations driven by local-update
algorithms. Furthermore, we show that the NADE-driven simulations quickly
sample ground-state configurations, paving the way to their future utilization
to tackle binary optimization problems.
Comment: 13 pages, 14 figures
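The proposal scheme described above is ordinary Metropolis-Hastings with an independent, globally-updating proposal, which is easy to state in a few lines. In this sketch `nade_sample` and `nade_logp` stand in for a trained network's exact sampler and log-probability; they are illustrative names, not an actual library API.

```python
import numpy as np

def nade_mh(energy, nade_sample, nade_logp, beta, x0, n_steps, rng):
    """Metropolis-Hastings with a trained autoregressive model as an
    independent global proposal. The acceptance test enforces detailed
    balance w.r.t. the Boltzmann weight exp(-beta * energy), so the chain
    is unbiased even if the learned distribution is only approximate."""
    x, e, lp = x0, energy(x0), nade_logp(x0)
    chain = []
    for _ in range(n_steps):
        xp = nade_sample(rng)                    # global move, exact sampling
        ep, lpp = energy(xp), nade_logp(xp)
        # log of pi(x') q(x) / (pi(x) q(x')) for the independence sampler
        log_acc = -beta * (ep - e) + (lp - lpp)
        if np.log(rng.random()) < log_acc:
            x, e, lp = xp, ep, lpp
        chain.append(x)
    return chain
```

Sequential tempering then amounts to retraining the network on the chain produced at one temperature and reusing it as the proposal at a slightly lower one.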
Quantum-Assisted Learning of Hardware-Embedded Probabilistic Graphical Models
Mainstream machine-learning techniques such as deep learning and
probabilistic programming rely heavily on sampling from generally intractable
probability distributions. There is increasing interest in the potential
advantages of using quantum computing technologies as sampling engines to speed
up these tasks or to make them more effective. However, some pressing
challenges in state-of-the-art quantum annealers have to be overcome before we
can assess their actual performance. The sparse connectivity, resulting from
the local interactions between quantum bits in physical hardware
implementations, is considered the most severe limitation on the construction
of powerful generative unsupervised machine-learning models. Here we
use embedding techniques to add redundancy to data sets, allowing us to
increase the modeling capacity of quantum annealers. We illustrate our findings
by training hardware-embedded graphical models on a binarized data set of
handwritten digits and two synthetic data sets in experiments with up to 940
quantum bits. Our model can be trained in quantum hardware without full
knowledge of the effective parameters specifying the corresponding quantum
Gibbs-like distribution; therefore, this approach avoids the need to infer the
effective temperature at each iteration, speeding up learning; it also
mitigates the effect of noise in the control parameters, making it robust to
deviations from the reference Gibbs distribution. Our approach demonstrates the
feasibility of using quantum annealers for implementing generative models, and
it provides a suitable framework for benchmarking these quantum technologies on
machine-learning-related tasks.
Comment: 17 pages, 8 figures. Minor further revisions. As published in Phys. Rev.
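Very schematically, the training this enables is ordinary Boltzmann-machine likelihood ascent in which the intractable model average is estimated from hardware samples; the sketch below is our reading under that simplification (a fully visible model, and a hypothetical black-box `annealer_sample`), not the paper's hardware-embedded update rule.

```python
import numpy as np

def qa_assisted_step(W, data_batch, annealer_sample, lr=0.01):
    """One likelihood-ascent step, dL/dW_ij = <s_i s_j>_data - <s_i s_j>_model,
    for a fully visible Boltzmann machine. The model expectation comes from
    spin configurations returned by the annealer, treated as a black-box
    sampler from a Gibbs-like distribution; no effective temperature is
    estimated. `annealer_sample(W)` is assumed to return an array of shape
    (n_samples, n_spins), like `data_batch`."""
    samples = annealer_sample(W)
    pos = data_batch.T @ data_batch / len(data_batch)   # data correlations
    neg = samples.T @ samples / len(samples)            # model correlations
    W = W + lr * (pos - neg)
    np.fill_diagonal(W, 0.0)                            # no self-couplings
    return W
```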