503 research outputs found
Contrastive Hebbian Learning with Random Feedback Weights
Neural networks are commonly trained to make predictions through learning
algorithms. Contrastive Hebbian learning, which is a powerful rule inspired by
gradient backpropagation, is based on Hebb's rule and the contrastive
divergence algorithm. It operates in two phases, the forward (or free) phase,
where the data are fed to the network, and a backward (or clamped) phase, where
the target signals are clamped to the output layer of the network and the
feedback signals are transformed through the transpose synaptic weight
matrices. This implies symmetries at the synaptic level, for which there is no
evidence in the brain. In this work, we propose a new variant of the algorithm,
called random contrastive Hebbian learning, which does not rely on any synaptic
weight symmetries. Instead, it uses random matrices to transform the feedback
signals during the clamped phase, and the neural dynamics are described by
first order non-linear differential equations. The algorithm is experimentally
verified by solving a Boolean logic task, classification tasks (handwritten
digits and letters), and an autoencoding task. This article also shows how the
parameters affect learning, especially the random matrices. We use
pseudospectra analysis to investigate further how random matrices impact the
learning process. Finally, we discuss the biological plausibility of the
proposed algorithm, and how it can give rise to better computational models for
learning.
What Drives Swap Spreads, Credit or Liquidity?
This paper investigates the determinants of swap spreads. Compared with previous work done in this area, such as the seminal paper by Duffie and Singleton (1997), the paper includes daily credit spreads data in the time series framework. The issue is whether “liquidity” or “credit” (or both) is the main determinant of swap spreads dynamics. Our results agree with the prevailing view among swap traders that swap spreads are mainly an indicator of “market liquidity”. However, the dynamics are influenced significantly by “credit” over longer horizons, although credit is not the main driving force. LIBOR rate dynamics seem to play a relatively minor role in this setting.
Disturbing Extremal Behavior of Spot Rate Dynamics
This paper presents a study of extreme interest rate movements in the U.S. Federal Funds market over almost a half century of daily observations from the mid 1950s through the end of 2000. We analyze the fluctuations of the maximal and minimal changes in short term interest rates and test the significance of time-varying paths followed by the mean and volatility of extremes. We formally determine the relevance of introducing trend and serial correlation in the mean, and of incorporating the level and GARCH effects in the volatility of extreme changes in the federal funds rate. The empirical findings indicate the existence of volatility clustering in the standard deviation of extremes, and a significantly positive relationship between the level and the volatility of extremes. The results point to the presence of an autoregressive process in the means of both local maxima and local minima values. The paper proposes a conditional extreme value approach to calculating value at risk by specifying the location and scale parameters of the generalized Pareto distribution as a function of past information. Based on the estimated VaR thresholds, the statistical theory of extremes is found to provide more accurate estimates of the rate of occurrence and the size of extreme observations.
Keywords: extreme value theory, volatility, interest rates, value at risk
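As a rough illustration of the value-at-risk calculation this abstract refers to, the sketch below evaluates the standard unconditional peaks-over-threshold VaR formula for a generalized Pareto tail. It is not the paper's conditional model, in which the location and scale parameters depend on past information. The function name `gpd_var` and all parameter names are hypothetical.

```python
import math


def gpd_var(u, sigma, xi, n, n_u, q):
    """Value at risk at confidence level q from a GPD tail fit.

    u     : threshold above which exceedances were fitted
    sigma : GPD scale parameter (must be positive)
    xi    : GPD shape parameter
    n     : total number of observations
    n_u   : number of observations exceeding u
    q     : confidence level, e.g. 0.99
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if n_u <= 0 or n < n_u:
        raise ValueError("need 0 < n_u <= n")

    # Ratio of the target tail probability to the empirical exceedance rate.
    tail_ratio = (n / n_u) * (1.0 - q)

    if abs(xi) < 1e-12:
        # Limit xi -> 0: the GPD reduces to an exponential tail.
        return u - sigma * math.log(tail_ratio)
    return u + (sigma / xi) * (tail_ratio ** (-xi) - 1.0)
```

When q equals one minus the empirical exceedance rate, the formula returns the threshold u itself, which is a convenient sanity check on a fitted model.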
On-chip Few-shot Learning with Surrogate Gradient Descent on a Neuromorphic Processor
Recent work suggests that synaptic plasticity dynamics in biological models
of neurons and neuromorphic hardware are compatible with gradient-based
learning (Neftci et al., 2019). Gradient-based learning requires iterating
several times over a dataset, which is both time-consuming and constrains the
training samples to be independently and identically distributed. This is
incompatible with learning systems that do not have boundaries between training
and inference, such as in neuromorphic hardware. One approach to overcome these
constraints is transfer learning, where a portion of the network is pre-trained
and mapped into hardware and the remaining portion is trained online. Transfer
learning has the advantage that pre-training can be accelerated offline if the
task domain is known, and few samples of each class are sufficient for learning
the target task at reasonable accuracies. Here, we demonstrate on-line
surrogate gradient few-shot learning on Intel's Loihi neuromorphic research
processor using features pre-trained with spike-based gradient
backpropagation-through-time. Our experimental results show that the Loihi chip
can learn gestures online using a small number of shots and achieve results
that are comparable to the models simulated on a conventional processor.
Error-triggered Three-Factor Learning Dynamics for Crossbar Arrays
Recent breakthroughs suggest that local, approximate gradient descent
learning is compatible with Spiking Neural Networks (SNNs). Although SNNs can
be scalably implemented using neuromorphic VLSI, an architecture that can learn
in-situ as accurately as conventional processors is still missing. Here, we
propose a subthreshold circuit architecture designed through insights obtained
from machine learning and computational neuroscience that could achieve such
accuracy. Using a surrogate gradient learning framework, we derive local,
error-triggered learning dynamics compatible with crossbar arrays and the
temporal dynamics of SNNs. The derivation reveals that circuits used for
inference and training dynamics can be shared, which simplifies the circuit and
suppresses the effects of fabrication mismatch. We present SPICE simulations on
an XFAB 180 nm process, as well as large-scale simulations of the spiking neural
networks on event-based benchmarks, including a gesture recognition task. Our
results show that the number of updates can be reduced a hundredfold compared to
the standard rule while achieving performance on par with the
state of the art.
Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems
Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been
demonstrated to perform efficiently in a variety of applications, such as
dimensionality reduction, feature learning, and classification. Their
implementation on neuromorphic hardware platforms emulating large-scale
networks of spiking neurons can have significant advantages from the
perspectives of scalability, power dissipation and real-time interfacing with
the environment. However, the traditional RBM architecture and the commonly used
training algorithm known as Contrastive Divergence (CD) are based on discrete
updates and exact arithmetic, which do not directly map onto a dynamical neural
substrate. Here, we present an event-driven variation of CD to train an RBM
constructed with Integrate & Fire (I&F) neurons, that is constrained by the
limitations of existing and near future neuromorphic hardware platforms. Our
strategy is based on neural sampling, which allows us to synthesize a spiking
neural network that samples from a target Boltzmann distribution. The recurrent
activity of the network replaces the discrete steps of the CD algorithm, while
Spike Time Dependent Plasticity (STDP) carries out the weight updates in an
online, asynchronous fashion. We demonstrate our approach by training an RBM
composed of leaky I&F neurons with STDP synapses to learn a generative model of
the MNIST hand-written digit dataset, and by testing it in recognition,
generation and cue integration tasks. Our results contribute to a machine
learning-driven approach for synthesizing networks of spiking neurons capable
of carrying out practical, high-level functionality.
Inherent Weight Normalization in Stochastic Neural Networks
Multiplicative stochasticity such as Dropout improves the robustness and
generalizability of deep neural networks. Here, we further demonstrate that
always-on multiplicative stochasticity combined with simple threshold neurons
are sufficient operations for deep neural networks. We call such models Neural
Sampling Machines (NSM). We find that the probability of activation of the NSM
exhibits a self-normalizing property that mirrors Weight Normalization, a
previously studied mechanism that fulfills many of the features of Batch
Normalization in an online fashion. The normalization of activities during
training speeds up convergence by preventing internal covariate shift caused by
changes in the input distribution. The always-on stochasticity of the NSM
confers the following advantages: the network is identical in the inference and
learning phases, making the NSM suitable for online learning, it can exploit
stochasticity inherent to a physical substrate such as analog non-volatile
memories for in-memory computing, and it is suitable for Monte Carlo sampling,
while requiring almost exclusively addition and comparison operations. We
demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and
event-based classification benchmarks (N-MNIST and DVS Gestures). Our results
show that NSMs perform comparably to or better than conventional artificial neural
networks with the same architecture.
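One simple way to see a normalization property of this kind is sketched below, under assumptions that are not taken from the paper: a single threshold neuron with an independent Bernoulli mask on each input and a threshold at zero. Because only the sign of the masked weighted sum matters, rescaling all weights by a positive constant leaves the activation probability unchanged. The function name `activation_probability` and its parameters are hypothetical.

```python
import numpy as np


def activation_probability(w, x, p=0.5, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(sum_j xi_j * w_j * x_j > 0).

    xi_j are independent Bernoulli(p) variables (an assumed noise model),
    w is the weight vector and x the input vector.
    """
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # One Bernoulli mask per sample and per synapse.
    mask = rng.random((n_samples, w.size)) < p
    # Pre-activation of the threshold neuron for every sampled mask.
    z = (mask * (w * x)).sum(axis=1)
    # Fraction of samples in which the neuron fires.
    return float((z > 0).mean())


rng = np.random.default_rng(1)
w = rng.normal(size=32)
x = rng.normal(size=32)

p_original = activation_probability(w, x)
p_rescaled = activation_probability(4.0 * w, x)  # same seed, same masks
```

With the same seed, `p_original` and `p_rescaled` coincide, since the rescaling changes the magnitude of every pre-activation but not its sign.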
Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines
Recent studies have shown that synaptic unreliability is a robust and
sufficient mechanism for inducing the stochasticity observed in cortex. Here,
we introduce Synaptic Sampling Machines, a class of neural network models that
uses synaptic stochasticity as a means for Monte Carlo sampling and unsupervised
learning. Similar to the original formulation of Boltzmann machines, these
models can be viewed as a stochastic counterpart of Hopfield networks, but
where stochasticity is induced by a random mask over the connections. Synaptic
stochasticity plays the dual role of an efficient mechanism for sampling, and a
regularizer during learning akin to DropConnect. A local synaptic plasticity
rule implementing an event-driven form of contrastive divergence enables the
learning of generative models in an on-line fashion. Synaptic sampling machines
perform equally well using discrete-timed artificial units (as in Hopfield
networks) or continuous-timed leaky integrate & fire neurons. The learned
representations are remarkably sparse and robust to reductions in bit precision
and synapse pruning: removal of more than 75% of the weakest connections
followed by cursory re-learning causes a negligible performance loss on
benchmark classification tasks. The spiking neuron-based synaptic sampling
machines outperform existing spike-based unsupervised learners, while
potentially offering substantial advantages in terms of power and complexity,
and are thus promising models for on-line learning in brain-inspired hardware.
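The pruning step mentioned in this abstract can be sketched as plain magnitude pruning, assuming that "weakest" means smallest absolute weight. The re-learning step that follows pruning is not shown, and the function name `prune_weakest` is hypothetical.

```python
import numpy as np


def prune_weakest(weights, fraction=0.75):
    """Zero out the given fraction of weights with the smallest magnitude.

    Returns the pruned weight array and a boolean mask of kept connections.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must lie in [0, 1)")
    weights = np.asarray(weights, dtype=float)
    flat = np.abs(weights).ravel()
    k = int(fraction * flat.size)  # number of connections to remove
    keep = np.ones(flat.size, dtype=bool)
    if k > 0:
        # Indices of the k smallest magnitudes (order among them is irrelevant).
        weakest = np.argpartition(flat, k - 1)[:k]
        keep[weakest] = False
    keep = keep.reshape(weights.shape)
    return weights * keep, keep
```

The returned mask can be reapplied after each subsequent weight update so that pruned connections stay at zero during re-learning.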
Conversion of Artificial Recurrent Neural Networks to Spiking Neural Networks for Low-power Neuromorphic Hardware
In recent years, the field of neuromorphic low-power systems that consume
orders of magnitude less power has gained significant momentum. However, their
wider use is still hindered by the lack of algorithms that can harness the
strengths of such architectures. While neuromorphic adaptations of
representation learning algorithms are now emerging, efficient processing of
temporal sequences or variable-length inputs remains difficult. Recurrent neural
networks (RNN) are widely used in machine learning to solve a variety of
sequence learning tasks. In this work we present a train-and-constrain
methodology that enables the mapping of machine learned (Elman) RNNs on a
substrate of spiking neurons, while being compatible with the capabilities of
current and near-future neuromorphic systems. This "train-and-constrain" method
consists of first training RNNs using backpropagation through time, then
discretizing the weights and finally converting them to spiking RNNs by
matching the responses of artificial neurons with those of the spiking neurons.
We demonstrate our approach by mapping a natural language processing task
(question classification), where we demonstrate the entire mapping process of
the recurrent layer of the network on IBM's Neurosynaptic System "TrueNorth", a
spike-based digital neuromorphic hardware architecture. TrueNorth imposes
specific constraints on connectivity, neural and synaptic parameters. To
satisfy these constraints, it was necessary to discretize the synaptic weights
and neural activities to 16 levels, and to limit fan-in to 64 inputs. We find
that short synaptic delays are sufficient to implement the dynamical (temporal)
aspect of the RNN in the question classification task. The hardware-constrained
model achieved 74% accuracy in question classification while using less than
0.025% of the cores on one TrueNorth chip, resulting in an estimated power
consumption of ~17 µW.
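The discretization step described in this abstract can be illustrated with a minimal sketch that assumes uniform quantization over the observed value range. The abstract does not state which quantization scheme was used on TrueNorth, so this is only one plausible choice, and the function name `quantize_uniform` is hypothetical.

```python
import numpy as np


def quantize_uniform(values, levels=16):
    """Map each value to the nearest of `levels` evenly spaced points
    spanning the minimum and maximum of the input (assumed scheme)."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values identical: nothing to quantize.
        return values.copy()
    step = (hi - lo) / (levels - 1)
    # Round to the nearest grid index, then map back to the value range.
    return lo + np.round((values - lo) / step) * step
```

Under this scheme the worst-case rounding error is half a step, that is, (max - min) / 30 for 16 levels.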