Emergence of Compositional Representations in Restricted Boltzmann Machines
Automatically extracting the complex set of features composing real
high-dimensional data is crucial for achieving high performance in
machine-learning tasks. Restricted Boltzmann Machines (RBMs) are empirically
known to be efficient for this purpose, and to be able to generate distributed
and graded representations of the data. We characterize the structural
conditions (sparsity of the weights, low effective temperature, nonlinearities
in the activation functions of hidden units, and adaptation of fields
maintaining the activity in the visible layer) allowing RBMs to operate in such
a compositional phase. Evidence is provided by the replica analysis of an
adequate statistical ensemble of random RBMs and by RBMs trained on the
handwritten digits dataset MNIST.
Comment: Supplementary material available at the authors' webpage
Dreaming of atmospheres
Here we introduce the RobERt (Robotic Exoplanet Recognition) algorithm for
the classification of exoplanetary emission spectra. Spectral retrievals of
exoplanetary atmospheres frequently require the preselection of
molecular/atomic opacities to be defined by the user. In the era of
open-source, automated and self-sufficient retrieval algorithms, manual input
should be avoided. User-dependent input could, in worst-case scenarios, lead to
incomplete models and biases in the retrieval. The RobERt algorithm is based on
deep belief neural networks (DBNs) trained to accurately recognise molecular
signatures for a wide range of planets, atmospheric thermal profiles and
compositions. Reconstructions of the learned features, also referred to as
`dreams' of the network, indicate good convergence and an accurate
representation of molecular features in the DBN. Using these deep neural
networks, we work towards retrieval algorithms that themselves understand the
nature of the observed spectra, are able to learn from current and past data
and make sensible qualitative preselections of atmospheric opacities to be used
for the quantitative stage of the retrieval process.
Comment: ApJ accepted
Practical recommendations for gradient-based training of deep architectures
Learning algorithms related to artificial neural networks and in particular
for Deep Learning may seem to involve many bells and whistles, called
hyper-parameters. This chapter is meant as a practical guide with
recommendations for some of the most commonly used hyper-parameters, in
particular in the context of learning algorithms based on back-propagated
gradient and gradient-based optimization. It also discusses how to deal with
the fact that more interesting results can be obtained when allowing one to
adjust many hyper-parameters. Overall, it describes elements of the practice
used to successfully and efficiently train and debug large-scale and often deep
multi-layer neural networks. It closes with open questions about the training
difficulties observed with deeper architectures.
A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines
Restricted Boltzmann machines (RBMs) are energy-based neural networks that
are commonly used as the building blocks for deep neural architectures.
In this work, we derive a deterministic framework for the
training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer
(TAP) mean-field approximation, from spin-glass theory, of widely connected
systems with weak interactions. While the TAP approach has been
extensively studied for fully-visible binary spin systems, our construction is
generalized to latent-variable models, as well as to arbitrarily distributed
real-valued spin systems with bounded support. In our numerical experiments, we
demonstrate the effective deterministic training of our proposed models and are
able to show interesting features of unsupervised learning which could not be
directly observed with sampling. Additionally, we demonstrate how to utilize
our TAP-based framework for leveraging trained RBMs as joint priors in
denoising problems.
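
As a rough illustration of what replacing sampling with a deterministic procedure can look like, here is a minimal sketch of a naive (first-order) mean-field fixed-point iteration for a binary {0, 1} RBM. This is a simpler approximation than TAP, which adds a second-order correction term, and it is not the authors' implementation. The function name `naive_mean_field`, the damping scheme, and the toy weights are all assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def naive_mean_field(W, a, b, n_iter=50, damping=0.5):
    """Iterate naive mean-field magnetizations for a binary RBM.

    W[i][j] couples visible unit i to hidden unit j; a and b are the
    visible and hidden biases. Hypothetical helper, for illustration only.
    """
    n_visible, n_hidden = len(a), len(b)
    m_v = [0.5] * n_visible  # mean activation of each visible unit
    m_h = [0.5] * n_hidden   # mean activation of each hidden unit
    for _ in range(n_iter):
        # Update hidden means from the current visible means.
        for j in range(n_hidden):
            field = b[j] + sum(W[i][j] * m_v[i] for i in range(n_visible))
            m_h[j] = damping * m_h[j] + (1 - damping) * sigmoid(field)
        # Update visible means from the new hidden means.
        for i in range(n_visible):
            field = a[i] + sum(W[i][j] * m_h[j] for j in range(n_hidden))
            m_v[i] = damping * m_v[i] + (1 - damping) * sigmoid(field)
    return m_v, m_h

# Toy model: 3 visible units, 2 hidden units (values chosen arbitrarily).
W = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]
a = [0.1, -0.2, 0.0]
b = [0.0, 0.5]
m_v, m_h = naive_mean_field(W, a, b)
print(m_v, m_h)
```

No random numbers are drawn, so repeated calls with the same parameters return the same magnetizations. The damping factor is a common stabilizing choice for fixed-point iterations and is not taken from the abstract.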