Training Dynamic Exponential Family Models with Causal and Lateral Dependencies for Generalized Neuromorphic Computing
Neuromorphic hardware platforms, such as Intel's Loihi chip, support the
implementation of Spiking Neural Networks (SNNs) as an energy-efficient
alternative to Artificial Neural Networks (ANNs). SNNs are networks of neurons
with internal analogue dynamics that communicate by means of binary time
series. In this work, a probabilistic model is introduced for a generalized
set-up in which the synaptic time series can take values in an arbitrary
alphabet and are characterized by both causal and instantaneous statistical
dependencies. The model, which can be considered as an extension of exponential
family harmoniums to time series, is introduced by means of a hybrid
directed-undirected graphical representation. Furthermore, distributed learning
rules are derived for Maximum Likelihood and Bayesian criteria under the
assumption of fully observed time series in the training set.
Comment: Published in IEEE ICASSP 2019. Author's Accepted Manuscript
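For intuition, here is a minimal LaTeX sketch of the kind of factorization such a hybrid directed-undirected model admits; the sufficient statistics s(\cdot), causal parameters \theta_i, lateral couplings W_{ij}, and memory window \tau are illustrative placeholders, not the paper's notation:

% Hedged sketch: causal (directed) terms depend on past samples,
% instantaneous (undirected) terms couple units laterally within a time step.
p(\mathbf{x}_{1:T}) = \prod_{t=1}^{T} p(\mathbf{x}_t \mid \mathbf{x}_{t-\tau:t-1}),
\qquad
p(\mathbf{x}_t \mid \mathbf{x}_{t-\tau:t-1}) \propto
\exp\Big( \sum_i \theta_i(\mathbf{x}_{t-\tau:t-1})^{\top} s(x_{i,t})
        + \sum_{i<j} s(x_{i,t})^{\top} W_{ij}\, s(x_{j,t}) \Big),

where the first (directed) term captures the causal dependence on past samples and the second (undirected) term captures the instantaneous lateral dependencies among units.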
Attention in a family of Boltzmann machines emerging from modern Hopfield networks
Hopfield networks and Boltzmann machines (BMs) are fundamental energy-based
neural network models. Recent studies on modern Hopfield networks have broadened
the class of energy functions and led to a unified perspective on general
Hopfield networks including an attention module. In this letter, we consider
the BM counterparts of modern Hopfield networks using the associated energy
functions, and study their salient properties from a trainability perspective.
In particular, the energy function corresponding to the attention module
naturally introduces a novel BM, which we refer to as attentional BM (AttnBM).
We verify that AttnBM has a tractable likelihood function and gradient for a
special case and is easy to train. Moreover, we reveal the hidden connections
between AttnBM and some single-layer models, namely the Gaussian--Bernoulli
restricted BM and denoising autoencoder with softmax units. We also investigate
BMs introduced by other energy functions, and in particular, observe that the
energy function of dense associative memory models gives BMs belonging to
Exponential Family Harmoniums.
Comment: 12 pages, 1 figure
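For intuition, here is a minimal numerical sketch of the log-sum-exp "attention" energy from modern Hopfield networks, which the abstract says induces AttnBM. The variable names, the inverse temperature beta, and the dropped constant terms are assumptions drawn from the modern Hopfield network literature, not the letter's exact definitions:

import numpy as np

def attention_energy(v, patterns, beta=1.0):
    # patterns: (d, K) matrix whose columns are the stored patterns.
    # Log-sum-exp of pattern overlaps, computed stably: the "attention" term.
    overlaps = beta * (patterns.T @ v)          # shape (K,)
    m = np.max(overlaps)
    lse = (m + np.log(np.sum(np.exp(overlaps - m)))) / beta
    # Quadratic term keeps the energy bounded below.
    return -lse + 0.5 * float(v @ v)

# Usage: a state near a stored pattern typically has lower energy
# than a random state.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 4))                # four stored patterns
print(attention_energy(X[:, 0], X, beta=2.0))   # near pattern 0
print(attention_energy(rng.standard_normal(16), X, beta=2.0))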
Implementing Bayesian Inference with Neural Networks
Embodied agents, be they animals or robots, acquire information about the world through their senses. They do not simply lose this information once it has passed by, but rather process and store it for future use. The most general theory of how an agent can combine stored knowledge with new observations is Bayesian inference. In this dissertation I present a theory of how embodied agents can learn to implement Bayesian inference with neural networks.
By neural network I mean both artificial and biological neural networks, and in my dissertation I address both kinds. On one hand, I develop theory for implementing Bayesian inference in deep generative models, and I show how to train multilayer perceptrons to compute approximate predictions for Bayesian filtering. On the other hand, I show that several models in computational neuroscience are special cases of the general theory that I develop in this dissertation, and I use this theory to model and explain several phenomena in neuroscience. The key contributions of this dissertation can be summarized as follows:
- I develop a class of graphical models called nth-order harmoniums. An nth-order harmonium is an n-tuple of random variables, where the conditional distribution of each variable given all the others is always an element of the same exponential family. I show that harmoniums have a recursive structure which allows them to be analyzed at coarser and finer levels of detail (a minimal sketch of the n = 2 case follows this list).
- I define a class of harmoniums called rectified harmoniums, which are constrained to have priors that are conjugate to their posteriors. As a consequence, rectified harmoniums afford efficient sampling and learning.
- I develop deep harmoniums, which are harmoniums that can be represented by hierarchical, undirected graphs. I extend the theory of rectification to deep harmoniums, and derive a novel algorithm for training deep generative models.
- I show how to implement a variety of optimal and near-optimal Bayes filters by combining the solution to Bayes' rule provided by rectified harmoniums with predictions computed by a recurrent neural network. I then show how to train a neural network to implement Bayesian filtering when the transition and emission distributions are unknown.
- I show how some well-established models of neural activity are special cases of the theory I present in this dissertation, and how these models can be generalized with the theory of rectification.
- I show how the theory that I present can model several neural phenomena including proprioception and gain-field modulation of tuning curves.
- I introduce a library for the programming language Haskell, within which I have implemented all the simulations presented in this dissertation. This library uses concepts from Riemannian geometry to provide a rigorous and efficient environment for implementing complex numerical simulations.
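As a concrete anchor for these contributions, here is the standard second-order exponential family harmonium of Welling et al. (2004), the n = 2 case of the nth-order harmoniums above, in a minimal LaTeX sketch; the rectification condition is stated only informally, since its precise form is given in the dissertation itself:

% Joint density over observed x and latent z, with sufficient statistics
% s_X, s_Z, biases \theta_X, \theta_Z, and interaction matrix \Theta.
p(x, z) \propto \exp\big( \theta_X^{\top} s_X(x) + \theta_Z^{\top} s_Z(z)
                        + s_X(x)^{\top} \Theta\, s_Z(z) \big),
\qquad
p(z \mid x) \propto \exp\big( (\theta_Z + \Theta^{\top} s_X(x))^{\top} s_Z(z) \big).

Each conditional thus stays in its exponential family, with natural parameters shifted linearly by the other variable's sufficient statistics; rectification additionally constrains the parameters so that the prior p(z) lies in the same family as these posteriors, which is what makes sampling and learning efficient.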
I also use the results presented in this dissertation to argue for the fundamental role of neural computation in embodied cognition. I argue, in other words, that before we can build truly intelligent robots, we will need to truly understand biological brains.
Deep Exponential Families
We describe \textit{deep exponential families} (DEFs), a class of latent
variable models that are inspired by the hidden structures used in deep neural
networks. DEFs capture a hierarchy of dependencies between latent variables,
and are easily generalized to many settings through exponential families. We
perform inference using recent "black box" variational inference techniques. We
then evaluate various DEFs on text and combine multiple DEFs into a model for
pairwise recommendation data. In an extensive study, we show that going beyond
one layer improves predictions for DEFs. We demonstrate that DEFs find
interesting exploratory structure in large data sets, and give better
predictive performance than state-of-the-art models.
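For reference, a LaTeX sketch of the layered generative process the abstract describes, following the construction in the DEF paper; the link function g_\ell, which maps an inner product of weights and the layer above to a natural parameter, and the choice of families at each layer are modeling decisions:

% Top layer drawn from a fixed exponential family; each lower layer's
% natural parameters are set by the layer above through a link function.
z_{L,k} \sim \mathrm{ExpFam}_L(\eta),
\qquad
z_{\ell,k} \sim \mathrm{ExpFam}_{\ell}\big( g_{\ell}( \mathbf{w}_{\ell,k}^{\top} \mathbf{z}_{\ell+1} ) \big),
\quad \ell = L-1, \dots, 1,
\qquad
x \sim p(x \mid \mathbf{z}_1),

where the observation model p(x \mid \mathbf{z}_1) is chosen to match the data (e.g., a Poisson likelihood for word counts in text).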