Alternating Synthetic and Real Gradients for Neural Language Modeling
Training recurrent neural networks (RNNs) with backpropagation through time
(BPTT) has known drawbacks, such as the difficulty of capturing long-term
dependencies in sequences, and no broadly successful alternative to BPTT has
yet been found. Recently, backpropagation with synthetic gradients produced by
a decoupled neural interface (DNI) module has been proposed as a replacement
for BPTT in training RNNs. At the same time, it has been shown that the
representations learned with synthetic and real gradients differ, even though
the resulting models are functionally identical. In
this project, we explore ways of combining synthetic and real gradients with
application to neural language modeling tasks. Empirically, we demonstrate the
effectiveness of alternating training with synthetic and real gradients after
periodic warm restarts on language modeling tasks.
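
As a rough sketch of the alternating scheme described above, assuming PyTorch:
the model, the single-linear-layer synthetic-gradient module, the chunk
length, and the restart period are all illustrative choices, and the loop that
would train the DNI module against bootstrapped gradient targets is omitted
for brevity.

import torch
import torch.nn as nn

class RNNLM(nn.Module):
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, x, h):
        out, h = self.rnn(self.embed(x), h)
        return self.head(out), h

class SyntheticGradient(nn.Module):
    # DNI-style module: predicts dL/dh at a truncation boundary from h alone.
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def forward(self, h):
        return self.net(h)

def train_chunk(model, dni, opt, x, y, h, use_synthetic):
    opt.zero_grad()
    logits, h_next = model(x, h)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), y.reshape(-1))
    if use_synthetic:
        # Inject the predicted gradient at the chunk's last hidden state so
        # credit crosses the truncation boundary without unrolling further.
        loss.backward(retain_graph=True)
        h_next.backward(dni(h_next.detach()).detach())
    else:
        loss.backward()  # plain truncated BPTT ("real" gradients)
    opt.step()
    return h_next.detach()

model, dni = RNNLM(vocab=1000), SyntheticGradient()
opt = torch.optim.SGD(model.parameters(), lr=0.5)
# Cosine schedule with warm restarts every T_0 steps; the gradient phase
# flips at each restart.
sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=100)

h, use_synthetic = torch.zeros(1, 8, 128), False
for step in range(300):
    x = torch.randint(0, 1000, (8, 20))  # toy token ids
    y = torch.randint(0, 1000, (8, 20))
    h = train_chunk(model, dni, opt, x, y, h, use_synthetic)
    sched.step()
    if (step + 1) % 100 == 0:
        use_synthetic = not use_synthetic  # alternate after each restart

Under this reading, a real-gradient phase is plain truncated BPTT and a
synthetic phase additionally injects the predicted boundary gradient, so the
two phases can be swapped at each restart without changing the data pipeline.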
Contrastive Learning for Lifted Networks
In this work we address supervised learning of neural networks via lifted
network formulations. Lifted networks are interesting because they allow
training on massively parallel hardware and assign energy models to
discriminatively trained neural networks. We demonstrate that the training
methods for lifted networks proposed in the literature have significant
limitations and show how to use a contrastive loss to address those
limitations. We further show that this contrastive training approximates
back-propagation in both theory and practice, and that it is superior to the
training objective regularly used for lifted networks.
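
The two-phase contrastive procedure can be sketched as follows, assuming
PyTorch and a one-hidden-layer quadratic lifted energy in the spirit of
contrastive Hebbian learning; the energy, the settling schedule, and the toy
data are illustrative assumptions rather than the paper's exact formulation.

import torch

def energy(W1, W2, x, z1, z2):
    # Lifted energy: z1 and z2 are free copies of the layer activations,
    # coupled to the feedforward predictions of the layers below them.
    return (((z1 - torch.relu(x @ W1)) ** 2).sum()
            + ((z2 - z1 @ W2) ** 2).sum())

def settle(W1, W2, x, y=None, steps=30, lr=0.1):
    # Inner minimization over the lifted activations only; in the clamped
    # phase the output variable is held fixed at the target.
    z1 = torch.relu(x @ W1).detach().requires_grad_()
    z2 = y if y is not None else (z1 @ W2).detach().requires_grad_()
    for _ in range(steps):
        free = [z1] if y is not None else [z1, z2]
        g = torch.autograd.grad(energy(W1, W2, x, z1, z2), free)
        z1 = (z1 - lr * g[0]).detach().requires_grad_()
        if y is None:
            z2 = (z2 - lr * g[1]).detach().requires_grad_()
    return z1.detach(), z2.detach()

W1 = (0.1 * torch.randn(10, 32)).requires_grad_()
W2 = (0.1 * torch.randn(32, 2)).requires_grad_()
opt = torch.optim.SGD([W1, W2], lr=1e-2)

x, y = torch.randn(8, 10), torch.randn(8, 2)  # toy regression data
for _ in range(200):
    z1f, z2f = settle(W1, W2, x)      # free phase: output unconstrained
    z1c, z2c = settle(W1, W2, x, y)   # clamped phase: output fixed at target
    opt.zero_grad()
    # Contrastive loss: lower the clamped energy relative to the free energy.
    (energy(W1, W2, x, z1c, z2c) - energy(W1, W2, x, z1f, z2f)).backward()
    opt.step()

Each settling step touches only adjacent layers, which is what makes lifted
formulations attractive for massively parallel hardware.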
Learning Spiking Neural Systems with the Event-Driven Forward-Forward Process
We develop a novel credit assignment algorithm for information processing
with spiking neurons without requiring feedback synapses. Specifically, we
propose an event-driven generalization of the forward-forward and the
predictive forward-forward learning processes for a spiking neural system that
iteratively processes sensory input over a stimulus window. As a result, the
recurrent circuit computes the membrane potential of each neuron in each layer
as a function of local bottom-up, top-down, and lateral signals, facilitating a
dynamic, layer-wise parallel form of neural computation. Unlike spiking neural
coding, which relies on feedback synapses to adjust neural electrical activity,
our model operates purely online and forward in time, offering a promising way
to learn distributed representations of sensory data patterns with temporal
spike signals. Notably, our experimental results on several pattern datasets
demonstrate that the event-driven forward-forward (ED-FF) framework works well
for training a dynamic recurrent spiking system capable of both classification
and reconstruction.
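
A minimal sketch of a feedback-free, layer-local spiking update in this
spirit, assuming PyTorch: the leaky integrate-and-fire dynamics, the Gaussian
surrogate spike gradient, the squared-rate goodness, and the random stand-in
negative samples are all illustrative assumptions, not the ED-FF
specification.

import torch
import torch.nn.functional as F

class SpikeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 1.0).float()  # fire when the potential crosses threshold

    @staticmethod
    def backward(ctx, g):
        v, = ctx.saved_tensors
        return g * torch.exp(-(v - 1.0) ** 2)  # smooth surrogate gradient

def run_layer(W, x_seq, decay=0.9):
    # Integrate input spikes over the stimulus window; return mean firing
    # rate per neuron. x_seq has shape (time, batch, inputs).
    v = torch.zeros(x_seq.size(1), W.size(1))
    rate = torch.zeros_like(v)
    for x in x_seq:                 # event-driven: one update per time bin
        v = decay * v + x @ W       # leaky membrane integration
        s = SpikeFn.apply(v)
        v = v * (1.0 - s)           # reset neurons that spiked
        rate = rate + s
    return rate / x_seq.size(0)

def local_ff_step(W, opt, x_seq, positive, theta=0.5):
    # Layer-local rule: push goodness above theta for positive data and
    # below it for negative data; no feedback synapses are required.
    opt.zero_grad()
    good = (run_layer(W, x_seq) ** 2).mean(dim=1)  # goodness per sample
    sign = 1.0 if positive else -1.0
    loss = F.softplus(-sign * (good - theta)).mean()
    loss.backward()
    opt.step()

W = (0.1 * torch.randn(784, 256)).requires_grad_()
opt = torch.optim.SGD([W], lr=0.05)
pos = (torch.rand(25, 16, 784) < 0.1).float()  # 25-step window, batch of 16
neg = (torch.rand(25, 16, 784) < 0.1).float()  # stand-in negative samples
local_ff_step(W, opt, pos, positive=True)
local_ff_step(W, opt, neg, positive=False)

Because the loss is defined per layer from local activity, stacked layers of
this kind can be trained in parallel, forward in time, without a backward
pass through the network.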