Search CORE

12,142 research outputs found

Continuous Restricted Boltzmann Machines

Author: Harrison Robert W.
Publication venue: ScholarWorks @ Georgia State University
Publication date: 01/01/2018
Field of study

Restricted Boltzmann machines are a generative neural network. They summarize their input data to build a probabilistic model that can then be used to reconstruct missing data or to classify new data. Unlike discrete Boltzmann machines, where the data are mapped to the space of integers or bitstrings, continuous Boltzmann machines directly use floating point numbers and therefore represent the data with higher fidelity. The primary limitation in using Boltzmann machines for big-data problems is the efficiency of the training algorithm. This paper describes an efficient deterministic algorithm for training continuous machines

ScholarWorks @ Georgia State University

Neural Networks retrieving Boolean patterns in a sea of Gaussian ones

Author: Agliari Elena
Barra Adriano
Longo Chiara
Tantari Daniele
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Restricted Boltzmann Machines are key tools in Machine Learning and are described by the energy function of bipartite spin-glasses. From a statistical mechanical perspective, they share the same Gibbs measure of Hopfield networks for associative memory. In this equivalence, weights in the former play as patterns in the latter. As Boltzmann machines usually require real weights to be trained with gradient descent like methods, while Hopfield networks typically store binary patterns to be able to retrieve, the investigation of a mixed Hebbian network, equipped with both real (e.g., Gaussian) and discrete (e.g., Boolean) patterns naturally arises. We prove that, in the challenging regime of a high storage of real patterns, where retrieval is forbidden, an extra load of Boolean patterns can still be retrieved, as long as the ratio among the overall load and the network size does not exceed a critical threshold, that turns out to be the same of the standard Amit-Gutfreund-Sompolinsky theory. Assuming replica symmetry, we study the case of a low load of Boolean patterns combining the stochastic stability and Hamilton-Jacobi interpolating techniques. The result can be extended to the high load by a non rigorous but standard replica computation argument.Comment: 16 pages, 1 figur

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio della ricerca- Università di Roma La Sapienza

Archivio Istituzionale della Ricerca- Università del Salento

When Does a Mixture of Products Contain a Product of Mixtures?

Author: Montufar Guido F.
Morton Jason
Publication venue
Publication date: 01/01/2014
Field of study

We derive relations between theoretical properties of restricted Boltzmann machines (RBMs), popular machine learning models which form the building blocks of deep learning models, and several natural notions from discrete mathematics and convex geometry. We give implications and equivalences relating RBM-representable probability distributions, perfectly reconstructible inputs, Hamming modes, zonotopes and zonosets, point configurations in hyperplane arrangements, linear threshold codes, and multi-covering numbers of hypercubes. As a motivating application, we prove results on the relative representational power of mixtures of product distributions and products of mixtures of pairs of product distributions (RBMs) that formally justify widely held intuitions about distributed representations. In particular, we show that a mixture of products requiring an exponentially larger number of parameters is needed to represent the probability distributions which can be obtained as products of mixtures.Comment: 32 pages, 6 figures, 2 table

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems

Author: Cauwenberghs Gert
Das Srinjoy
Kreutz-Delgado Kenneth
Neftci Emre
Pedroni Bruno
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2013
Field of study

Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been demonstrated to perform efficiently in a variety of applications, such as dimensionality reduction, feature learning, and classification. Their implementation on neuromorphic hardware platforms emulating large-scale networks of spiking neurons can have significant advantages from the perspectives of scalability, power dissipation and real-time interfacing with the environment. However the traditional RBM architecture and the commonly used training algorithm known as Contrastive Divergence (CD) are based on discrete updates and exact arithmetics which do not directly map onto a dynamical neural substrate. Here, we present an event-driven variation of CD to train a RBM constructed with Integrate & Fire (I&F) neurons, that is constrained by the limitations of existing and near future neuromorphic hardware platforms. Our strategy is based on neural sampling, which allows us to synthesize a spiking neural network that samples from a target Boltzmann distribution. The recurrent activity of the network replaces the discrete steps of the CD algorithm, while Spike Time Dependent Plasticity (STDP) carries out the weight updates in an online, asynchronous fashion. We demonstrate our approach by training an RBM composed of leaky I&F neurons with STDP synapses to learn a generative model of the MNIST hand-written digit dataset, and by testing it in recognition, generation and cue integration tasks. Our results contribute to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.Comment: (Under review

arXiv.org e-Print Archive

Crossref

Frontiers - Publisher Connector

PubMed Central

eScholarship - University of California

Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units

Author: Montúfar Guido F.
Publication venue
Publication date: 01/01/2014
Field of study

We generalize recent theoretical work on the minimal number of layers of narrow deep belief networks that can approximate any probability distribution on the states of their visible units arbitrarily well. We relax the setting of binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Mont\'ufar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show that a

q

-ary deep belief network with

L\geq 2+\frac{q^{\lceil m-\delta \rceil}-1}{q-1}

layers of width

n \leq m + \log_q(m) + 1

for some

m\in \mathbb{N}

can approximate any probability distribution on

\{0,1,\ldots,q-1\}^n

without exceeding a Kullback-Leibler divergence of

\delta

. Our analysis covers discrete restricted Boltzmann machines and na\"ive Bayes models as special cases.Comment: 19 pages, 5 figures, 1 tabl

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Discriminative conditional restricted Boltzmann machine for discrete choice and latent variable modelling

Author: Bilodeau Guillaume-Alexandre
Farooq Bilal
Wong Melvin
Publication venue: 'Elsevier BV'
Publication date: 01/06/2017
Field of study

Conventional methods of estimating latent behaviour generally use attitudinal questions which are subjective and these survey questions may not always be available. We hypothesize that an alternative approach can be used for latent variable estimation through an undirected graphical models. For instance, non-parametric artificial neural networks. In this study, we explore the use of generative non-parametric modelling methods to estimate latent variables from prior choice distribution without the conventional use of measurement indicators. A restricted Boltzmann machine is used to represent latent behaviour factors by analyzing the relationship information between the observed choices and explanatory variables. The algorithm is adapted for latent behaviour analysis in discrete choice scenario and we use a graphical approach to evaluate and understand the semantic meaning from estimated parameter vector values. We illustrate our methodology on a financial instrument choice dataset and perform statistical analysis on parameter sensitivity and stability. Our findings show that through non-parametric statistical tests, we can extract useful latent information on the behaviour of latent constructs through machine learning methods and present strong and significant influence on the choice process. Furthermore, our modelling framework shows robustness in input variability through sampling and validation

arXiv.org e-Print Archive

PolyPublie