
5,583 research outputs found

Emergence of Compositional Representations in Restricted Boltzmann Machines

Author: Monasson Rémi, Tubiana Jérôme
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2017

Automatically extracting the complex set of features composing real high-dimensional data is crucial for achieving high performance in machine-learning tasks. Restricted Boltzmann Machines (RBM) are empirically known to be efficient for this purpose, and to be able to generate distributed and graded representations of the data. We characterize the structural conditions (sparsity of the weights, low effective temperature, nonlinearities in the activation functions of hidden units, and adaptation of fields maintaining the activity in the visible layer) allowing RBM to operate in such a compositional phase. Evidence is provided by the replica analysis of an adequate statistical ensemble of random RBMs and by RBM trained on the handwritten digits dataset MNIST. Comment: Supplementary material available at the authors' webpage

arXiv.org e-Print Archive

Dreaming of atmospheres

Author: Waldmann I. P.
Publication venue: 'American Astronomical Society'
Publication date: 11/02/2016

Here we introduce the RobERt (Robotic Exoplanet Recognition) algorithm for the classification of exoplanetary emission spectra. Spectral retrievals of exoplanetary atmospheres frequently require the preselection of molecular/atomic opacities to be defined by the user. In the era of open-source, automated and self-sufficient retrieval algorithms, manual input should be avoided. User-dependent input could, in worst-case scenarios, lead to incomplete models and biases in the retrieval. The RobERt algorithm is based on deep belief neural networks (DBN) trained to accurately recognise molecular signatures for a wide range of planets, atmospheric thermal profiles and compositions. Reconstructions of the learned features, also referred to as 'dreams' of the network, indicate good convergence and an accurate representation of molecular features in the DBN. Using these deep neural networks, we work towards retrieval algorithms that themselves understand the nature of the observed spectra, are able to learn from current and past data and make sensible qualitative preselections of atmospheric opacities to be used for the quantitative stage of the retrieval process. Comment: ApJ accepted

arXiv.org e-Print Archive

UCL Discovery

Practical recommendations for gradient-based training of deep architectures

Author: Bengio Yoshua
Publication date: 16/09/2012

Learning algorithms related to artificial neural networks, and in particular those for Deep Learning, may seem to involve many bells and whistles, called hyper-parameters. This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradient and gradient-based optimization. It also discusses how to deal with the fact that more interesting results can be obtained when allowing one to adjust many hyper-parameters. Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks. It closes with open questions about the training difficulties observed with deeper architectures.

arXiv.org e-Print Archive

CiteSeerX

A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

Author: Caltagirone Francesco, Gabrié Marylou, Krzakala Florent, Manoel Andre, Tramel Eric W.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/10/2018

Restricted Boltzmann machines (RBMs) are energy-based neural networks which are commonly used as the building blocks for deep neural architectures. In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely connected systems with weak interactions coming from spin-glass theory. While the TAP approach has been extensively studied for fully-visible binary spin systems, our construction is generalized to latent-variable models, as well as to arbitrarily distributed real-valued spin systems with bounded support. In our numerical experiments, we demonstrate the effective deterministic training of our proposed models and are able to show interesting features of unsupervised learning which could not be directly observed with sampling. Additionally, we demonstrate how to utilize our TAP-based framework for leveraging trained RBMs as joint priors in denoising problems.

arXiv.org e-Print Archive

Directory of Open Access Journals

Hal-Diderot