Approximate Message Passing with Restricted Boltzmann Machine Priors
Approximate Message Passing (AMP) has been shown to be an excellent
statistical approach to signal inference and compressed sensing problems. The
AMP framework provides modularity in the choice of signal prior; here we
propose a hierarchical form of the Gauss-Bernoulli prior which utilizes a
Restricted Boltzmann Machine (RBM) trained on the signal support to push
reconstruction performance beyond that of simple iid priors for signals whose
support can be well represented by a trained binary RBM. We present and analyze
two methods of RBM factorization and demonstrate how these affect signal
reconstruction performance within our proposed algorithm. Finally, using the
MNIST handwritten digit dataset, we show experimentally that using an RBM
allows AMP to approach oracle-support performance.
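For context, here is a minimal sketch of the plain AMP loop that such an algorithm builds on, written for a factorized Gauss-Bernoulli prior; in the proposed method the factorized denoiser would be replaced by marginals informed by the trained RBM. Function names, initializations, and parameter values are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gb_denoiser(r, tau2, rho, v):
    """Posterior mean and variance of x under the spike-and-slab prior
    p(x) = (1 - rho)*delta(x) + rho*N(0, v), given pseudo-data
    r = x + N(0, tau2). The 1/sqrt(2*pi) factors cancel in the ratio."""
    slab = rho * np.exp(-r**2 / (2 * (v + tau2))) / np.sqrt(v + tau2)
    spike = (1 - rho) * np.exp(-r**2 / (2 * tau2)) / np.sqrt(tau2)
    pi = slab / (slab + spike)               # posterior support probability
    m, s2 = r * v / (v + tau2), v * tau2 / (v + tau2)
    mean = pi * m
    var = pi * (m**2 + s2) - mean**2
    return mean, var

def amp(y, A, rho=0.1, v=1.0, sigma2=1e-2, n_iter=30):
    """Baseline AMP for y = A x + noise, with A i.i.d. of variance 1/M."""
    M, N = A.shape
    x, var = np.zeros(N), np.full(N, rho * v)
    z = y.copy()
    tau2 = sigma2 + (N / M) * var.mean()     # pseudo-noise variance
    for _ in range(n_iter):
        r = x + A.T @ z                      # effective Gaussian channel
        x, var = gb_denoiser(r, tau2, rho, v)
        # Onsager correction keeps the pseudo-noise asymptotically Gaussian.
        z = y - A @ x + z * (N / M) * var.mean() / tau2
        tau2 = sigma2 + (N / M) * var.mean()
    return x
```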
Neural Networks retrieving Boolean patterns in a sea of Gaussian ones
Restricted Boltzmann Machines are key tools in Machine Learning and are
described by the energy function of bipartite spin-glasses. From a statistical
mechanical perspective, they share the same Gibbs measure of Hopfield networks
for associative memory. In this equivalence, the weights of the former play the
role of the patterns of the latter. Since Boltzmann machines usually require
real-valued weights to be trained with gradient-descent-like methods, while
Hopfield networks typically store binary patterns in order to retrieve them,
the investigation of a
mixed Hebbian network, equipped with both real (e.g., Gaussian) and discrete
(e.g., Boolean) patterns naturally arises. We prove that, in the challenging
regime of a high storage of real patterns, where retrieval is forbidden, an
extra load of Boolean patterns can still be retrieved, as long as the ratio
between the overall load and the network size does not exceed a critical
threshold, which turns out to coincide with that of the standard
Amit-Gutfreund-Sompolinsky theory. Assuming replica symmetry, we study the case
of a low load of Boolean patterns by combining the stochastic-stability and
Hamilton-Jacobi interpolation techniques. The result can be extended to the
high-load regime by a non-rigorous but standard replica computation.
Comment: 16 pages, 1 figure
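For orientation, the mixed Hebbian setup described above can be sketched (in our notation, which is an assumption rather than a quotation from the paper) as a Hopfield-type Hamiltonian storing both pattern families on the same couplings:

```latex
% N spins, P Gaussian patterns xi (extensive load), K Boolean patterns eta
H_N(\sigma) \;=\; -\frac{1}{2N} \sum_{i,j=1}^{N}
  \Big( \sum_{\mu=1}^{P} \xi_i^{\mu}\xi_j^{\mu}
      + \sum_{\nu=1}^{K} \eta_i^{\nu}\eta_j^{\nu} \Big)\, \sigma_i \sigma_j ,
\qquad \xi_i^{\mu} \sim \mathcal{N}(0,1), \quad \eta_i^{\nu} \in \{-1,+1\}.
```

The statement is then that, even when P is so large that the Gaussian patterns cannot be retrieved, the Boolean patterns remain retrievable as long as the total load (P + K)/N stays below the AGS critical threshold.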
Free energies of Boltzmann Machines: self-averaging, annealed and replica symmetric approximations in the thermodynamic limit
Restricted Boltzmann machines (RBMs) constitute one of the main models for
statistical inference in machine learning and are widely employed in Artificial
Intelligence as powerful tools for (deep) learning. However, in contrast with
countless remarkable practical successes, their mathematical formalization has
been largely elusive: from a statistical-mechanics perspective these systems
display the same (random) Gibbs measure of bi-partite spin-glasses, whose
rigorous treatment is notoriously difficult. In this work, beyond providing a
brief review on RBMs from both the learning and the retrieval perspectives, we
aim to contribute to their analytical investigation, by considering two
distinct realizations of their weights (i.e., Boolean and Gaussian) and
studying the properties of their related free energies. More precisely,
focusing on an RBM characterized by digital couplings, we first extend the
Pastur-Shcherbina-Tirozzi method (originally developed for the Hopfield model)
to prove the self-averaging property for the free energy, over its quenched
expectation, in the infinite volume limit, then we explicitly calculate its
simplest approximation, namely its annealed bound. Next, focusing on an RBM
characterized by analogical weights, we extend Guerra's interpolating scheme to
obtain a control of the quenched free-energy under the assumption of replica
symmetry: we get self-consistencies for the order parameters (in full agreement
with the existing literature) as well as the critical line for ergodicity
breaking, which turns out to be the same as that obtained in AGS theory. As we discuss,
this analogy stems from the slow-noise universality. Finally, glancing beyond
replica symmetry, we analyze the fluctuations of the overlaps for an estimate
of the (slow) noise affecting the retrieval of the signal, and by a stability
analysis we recover the Aizenman-Contucci identities typical of glassy systems.
Comment: 21 pages, 1 figure
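As a reference point (our notation and normalization are assumptions, not copied from the paper), the bipartite spin-glass Hamiltonian and the quenched free energy under study take the form:

```latex
% N visible spins sigma, M hidden spins tau, weights W Boolean or Gaussian
H_{N,M}(\sigma,\tau) \;=\; -\frac{1}{\sqrt{N}} \sum_{i=1}^{N}\sum_{\mu=1}^{M}
  W_{i\mu}\, \sigma_i \tau_{\mu} ,
\qquad
f(\beta) \;=\; -\lim_{N\to\infty} \frac{1}{\beta N}\,
  \mathbb{E}_{W} \log \sum_{\sigma,\tau} e^{-\beta H_{N,M}(\sigma,\tau)} ,
```

with M/N kept finite; self-averaging means the random log-partition function concentrates around this quenched expectation in the infinite-volume limit.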
A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines
Restricted Boltzmann machines (RBMs) are energy-based neural networks which
are commonly used as the building blocks of deep neural
architectures. In this work, we derive a deterministic framework for the
training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer
(TAP) mean-field approximation of widely-connected systems with weak
interactions coming from spin-glass theory. While the TAP approach has been
extensively studied for fully-visible binary spin systems, our construction is
generalized to latent-variable models, as well as to arbitrarily distributed
real-valued spin systems with bounded support. In our numerical experiments, we
demonstrate the effective deterministic training of our proposed models and are
able to show interesting features of unsupervised learning which could not be
directly observed with sampling. Additionally, we demonstrate how to utilize
our TAP-based framework for leveraging trained RBMs as joint priors in
denoising problems.
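To make the construction concrete, here is a minimal sketch of the Onsager-corrected mean-field fixed point on which such a TAP framework rests, written for a binary {0,1} RBM; the update schedule, damping, and all names are our illustrative choices, not the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tap_magnetizations(W, a, b, n_iter=100, damping=0.5):
    """Iterate the TAP self-consistency equations for a binary {0,1} RBM
    with weights W (n_vis x n_hid), visible biases a, hidden biases b.
    Each field is the naive mean-field term plus the Onsager reaction
    term built from the second moment of the couplings."""
    n_vis, n_hid = W.shape
    mv, mh = np.full(n_vis, 0.5), np.full(n_hid, 0.5)
    W2 = W**2
    for _ in range(n_iter):
        field_h = b + W.T @ mv - (mh - 0.5) * (W2.T @ (mv * (1 - mv)))
        mh = damping * mh + (1 - damping) * sigmoid(field_h)
        field_v = a + W @ mh - (mv - 0.5) * (W2 @ (mh * (1 - mh)))
        mv = damping * mv + (1 - damping) * sigmoid(field_v)
    return mv, mh
```

These magnetizations can then be used in place of Monte Carlo estimates of the model statistics during training and evaluation.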
Boosting Monte Carlo simulations of spin glasses using autoregressive neural networks
Autoregressive neural networks are emerging as a powerful computational
tool to solve relevant problems in classical and quantum mechanics. One of
their appealing functionalities is that, after they have learned a probability
distribution from a dataset, they allow exact and efficient sampling of typical
system configurations. Here we employ a neural autoregressive distribution
estimator (NADE) to boost Markov chain Monte Carlo (MCMC) simulations of a
paradigmatic classical model of spin-glass theory, namely the two-dimensional
Edwards-Anderson Hamiltonian. We show that a NADE can be trained to accurately
mimic the Boltzmann distribution using unsupervised learning from system
configurations generated using standard MCMC algorithms. The trained NADE is
then employed as a smart proposal distribution for the Metropolis-Hastings
algorithm. This allows us to perform efficient MCMC simulations, which provide
unbiased results even when the probability distribution learned by the NADE
deviates from the exact Boltzmann distribution. Notably, we implement a
sequential tempering procedure, whereby a NADE trained at a higher temperature
is iteratively employed as the proposal distribution in an MCMC simulation run at a
slightly lower temperature. This allows one to efficiently simulate the
spin-glass model even in the low-temperature regime, avoiding the divergent
correlation times that plague MCMC simulations driven by local-update
algorithms. Furthermore, we show that the NADE-driven simulations quickly
sample ground-state configurations, paving the way to their future utilization
to tackle binary optimization problems.
Comment: 13 pages, 14 figures
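The proposal scheme described above is ordinary Metropolis-Hastings with an independent, globally-updating proposal, which is easy to state in a few lines. In this sketch `nade_sample` and `nade_logp` stand in for a trained network's exact sampler and log-probability; they are illustrative names, not an actual library API.

```python
import numpy as np

def nade_mh(energy, nade_sample, nade_logp, beta, x0, n_steps, rng):
    """Metropolis-Hastings with a trained autoregressive model as an
    independent global proposal. The acceptance test enforces detailed
    balance w.r.t. the Boltzmann weight exp(-beta * energy), so the chain
    is unbiased even if the learned distribution is only approximate."""
    x, e, lp = x0, energy(x0), nade_logp(x0)
    chain = []
    for _ in range(n_steps):
        xp = nade_sample(rng)                    # global move, exact sampling
        ep, lpp = energy(xp), nade_logp(xp)
        # log of pi(x') q(x) / (pi(x) q(x')) for the independence sampler
        log_acc = -beta * (ep - e) + (lp - lpp)
        if np.log(rng.random()) < log_acc:
            x, e, lp = xp, ep, lpp
        chain.append(x)
    return chain
```

Sequential tempering then amounts to retraining the network on the chain produced at one temperature and reusing it as the proposal at a slightly lower one.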
Quantum-Assisted Learning of Hardware-Embedded Probabilistic Graphical Models
Mainstream machine-learning techniques such as deep learning and
probabilistic programming rely heavily on sampling from generally intractable
probability distributions. There is increasing interest in the potential
advantages of using quantum computing technologies as sampling engines to speed
up these tasks or to make them more effective. However, some pressing
challenges in state-of-the-art quantum annealers have to be overcome before we
can assess their actual performance. The sparse connectivity, resulting from
the local interactions between quantum bits in physical hardware
implementations, is considered the most severe limitation on the construction
of powerful generative unsupervised machine-learning models. Here we
use embedding techniques to add redundancy to data sets, allowing us to
increase the modeling capacity of quantum annealers. We illustrate our findings
by training hardware-embedded graphical models on a binarized data set of
handwritten digits and two synthetic data sets in experiments with up to 940
quantum bits. Our model can be trained in quantum hardware without full
knowledge of the effective parameters specifying the corresponding quantum
Gibbs-like distribution; therefore, this approach avoids the need to infer the
effective temperature at each iteration, speeding up learning; it also
mitigates the effect of noise in the control parameters, making it robust to
deviations from the reference Gibbs distribution. Our approach demonstrates the
feasibility of using quantum annealers for implementing generative models, and
it provides a suitable framework for benchmarking these quantum technologies on
machine-learning-related tasks.
Comment: 17 pages, 8 figures. Minor further revisions. As published in Phys. Rev.
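Very schematically, the training this enables is ordinary Boltzmann-machine likelihood ascent in which the intractable model average is estimated from hardware samples; the sketch below is our reading under that simplification (a fully visible model, and a hypothetical black-box `annealer_sample`), not the paper's hardware-embedded update rule.

```python
import numpy as np

def qa_assisted_step(W, data_batch, annealer_sample, lr=0.01):
    """One likelihood-ascent step, dL/dW_ij = <s_i s_j>_data - <s_i s_j>_model,
    for a fully visible Boltzmann machine. The model expectation comes from
    spin configurations returned by the annealer, treated as a black-box
    sampler from a Gibbs-like distribution; no effective temperature is
    estimated. `annealer_sample(W)` is assumed to return an array of shape
    (n_samples, n_spins), like `data_batch`."""
    samples = annealer_sample(W)
    pos = data_batch.T @ data_batch / len(data_batch)   # data correlations
    neg = samples.T @ samples / len(samples)            # model correlations
    W = W + lr * (pos - neg)
    np.fill_diagonal(W, 0.0)                            # no self-couplings
    return W
```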