Short-Term Plasticity Neurons Learning to Learn and Forget
Short-term plasticity (STP) is a mechanism that stores decaying memories in
synapses of the cerebral cortex. In computing practice, STP has been used, but
mostly in the niche of spiking neurons, even though theory predicts that it is
the optimal solution to certain dynamic tasks. Here we present a new type of
recurrent neural unit, the STP Neuron (STPN), which indeed turns out strikingly
powerful. Its key mechanism is that synapses have a state, propagated through
time by a self-recurrent connection-within-the-synapse. This formulation
enables training the plasticity with backpropagation through time, resulting in
a form of learning to learn and forget in the short term. The STPN outperforms
all tested alternatives, i.e. RNNs, LSTMs, other models with fast weights, and
differentiable plasticity. We confirm this in both supervised and reinforcement
learning (RL), and in tasks such as Associative Retrieval, Maze Exploration,
Atari video games, and MuJoCo robotics. Moreover, we calculate that, in
neuromorphic or biological circuits, the STPN minimizes energy consumption
across models, as it depresses individual synapses dynamically. Based on these,
biological STP may have been a strong evolutionary attractor that maximizes
both efficiency and computational power. The STPN now brings these neuromorphic
advantages also to a broad spectrum of machine learning practice. Code is
available at https://github.com/NeuromorphicComputing/stpn
Comment: Accepted at ICML 202
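The synaptic-state mechanism described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at the repository linked above). It assumes a Hebbian-style update for the fast state, a tanh activation, and per-synapse decay and gain parameters; the names `stp_step`, `F`, `lam`, and `gamma` are hypothetical. In the paper these plasticity parameters are trained with backpropagation through time, which this sketch does not do.

```python
import math

def stp_step(W, F, lam, gamma, x):
    """One time step of a layer whose synapses carry a decaying short-term state.

    W     : slow weights (n_out x n_in), fixed within an episode
    F     : fast synaptic state (n_out x n_in), carried through time
    lam   : per-synapse decay factors (assumed parameterization)
    gamma : per-synapse plasticity gains (assumed parameterization)
    x     : input vector of length n_in
    """
    n_out, n_in = len(W), len(x)
    # Effective weight of each synapse = slow weight + current fast state.
    h = [
        math.tanh(sum((W[i][j] + F[i][j]) * x[j] for j in range(n_in)))
        for i in range(n_out)
    ]
    # Self-recurrence inside each synapse: decay the old state and add a
    # Hebbian-like term (post-synaptic output times pre-synaptic input).
    F_new = [
        [lam[i][j] * F[i][j] + gamma[i][j] * h[i] * x[j] for j in range(n_in)]
        for i in range(n_out)
    ]
    return h, F_new

# Usage: a single synapse is driven once, then its state decays with no input.
W, F = [[0.5]], [[0.0]]
lam, gamma = [[0.9]], [[0.1]]
h1, F1 = stp_step(W, F, lam, gamma, [1.0])
h2, F2 = stp_step(W, F1, lam, gamma, [0.0])
```

With zero input the Hebbian term vanishes, so the synaptic state shrinks by the factor `lam` at each step, which is the "forgetting" half of the mechanism.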
Exploring Neuromodulatory Systems for Dynamic Learning
In a continual learning system, the network has to dynamically learn new tasks from few samples throughout its lifetime. Neuromodulation is observed to be a key factor in continual and dynamic learning in the central nervous system. In this work, neuromodulatory plasticity is embedded in dynamic learning architectures. The network has a built-in modulatory unit that regulates learning depending on the context and the internal state of the system, giving the network the ability to self-modify its weights. In one of the proposed architectures, ModNet, a modulatory layer is introduced in a random projection framework. This layer modulates the weights of the output layer neurons in tandem with Hebbian learning.
Moreover, to explore modulatory mechanisms in conjunction with backpropagation in deeper networks, a modulatory trace learning rule is introduced. The proposed learning rule uses a time-dependent trace to automatically modify the synaptic connections as a function of ongoing states and activations. The trace itself is updated via simple plasticity rules, thus reducing the demand on resources. A digital architecture is proposed for ModNet, with on-device learning and resource sharing, to facilitate dynamic learning on the edge.
The proposed modulatory learning architecture and learning rules demonstrate the ability to learn from few samples, train quickly, and perform one-shot image classification in a computationally efficient manner. The ModNet architecture achieves an accuracy of ∼91% for image classification on the MNIST dataset while training for just 2 epochs. The deeper network with modulatory trace achieves an average accuracy of 98.8% ± 1.16 on the Omniglot dataset for the five-way one-shot image classification task. In general, incorporating neuromodulation in deep neural networks shows promise for energy- and resource-efficient lifelong learning systems.
Synaptic plasticity and memory addressing in biological and artificial neural networks
Biological brains are composed of neurons, interconnected by synapses to create large complex networks. Learning and memory occur, in large part, due to synaptic plasticity -- modifications in the efficacy of information transmission through these synaptic connections. Artificial neural networks model these with neural "units" which communicate through synaptic weights. Models of learning and memory propose synaptic plasticity rules that describe and predict the weight modifications. An equally important but under-evaluated question is the selection of which synapses should be updated in response to a memory event. In this work, we attempt to separate the questions of synaptic plasticity from that of memory addressing.
Chapter 1 provides an overview of the problem of memory addressing and a summary of the solutions that have been considered in computational neuroscience and artificial intelligence, as well as those that may exist in biology. Chapter 2 presents in detail a solution to memory addressing and synaptic plasticity in the context of familiarity detection, suggesting strong feedforward weights and anti-Hebbian plasticity as the respective mechanisms. Chapter 3 proposes a model of recall, with storage performed by addressing through local third factors and neo-Hebbian plasticity, and retrieval by content-based addressing. In Chapter 4, we consider the problem of concurrent memory consolidation and memorization. Both storage and retrieval are performed by content-based addressing, but the plasticity rule itself is implemented by gradient descent, modulated according to whether an item should be stored in a distributed manner or memorized verbatim. However, the classical method for computing gradients in recurrent neural networks, backpropagation through time, is generally considered unbiological. In Chapter 5 we suggest a more realistic implementation through an approximation of recurrent backpropagation.
Taken together, these results propose a number of potential mechanisms for memory storage and retrieval, each of which separates the mechanism of synaptic updating -- plasticity -- from that of synapse selection -- addressing. Explicit studies of memory addressing may find applications not only in artificial intelligence but also in biology. In artificial networks, for example, selectively updating memories in large language models can help improve user privacy and security. In biological ones, understanding memory addressing can help with health outcomes and treating memory-based illnesses such as Alzheimer's or PTSD.
Oscillatory neural network learning for pattern recognition: an on-chip learning perspective and implementation
In the human brain, learning is continuous, while in current AI, models are pre-trained, which leaves them static and predetermined. However, even for AI models, the environment and input data change over time. Thus, there is a need to study continual learning algorithms. In particular, there is a need to investigate how to implement such continual learning algorithms on-chip. In this work, we focus on Oscillatory Neural Networks (ONNs), a neuromorphic computing paradigm performing auto-associative memory tasks, like Hopfield Neural Networks (HNNs). We study the adaptability of the HNN unsupervised learning rules to on-chip learning with ONN. In addition, we propose a first solution to implement unsupervised on-chip learning using a digital ONN design. We show that the architecture enables efficient ONN on-chip learning with Hebbian and Storkey learning rules in hundreds of microseconds for networks with up to 35 fully-connected digital oscillators.
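For readers unfamiliar with the two learning rules named in this abstract, the following is a minimal software sketch of the textbook Hebbian and Storkey updates for a Hopfield-style auto-associative memory with ±1 states. It does not model the paper's digital ONN hardware or its oscillator phases; the function names and the synchronous recall loop are illustrative assumptions.

```python
def hebbian_update(W, p):
    """Add one +/-1 pattern p to weight matrix W with the Hebbian rule."""
    n = len(p)
    return [
        [0.0 if i == j else W[i][j] + p[i] * p[j] / n for j in range(n)]
        for i in range(n)
    ]

def storkey_update(W, p):
    """Add one +/-1 pattern p to W with the Storkey rule.

    The local field h(i, j) sums the input to unit i from all units
    except i and j, and is subtracted from the plain Hebbian term.
    """
    n = len(p)

    def h(i, j):
        return sum(W[i][k] * p[k] for k in range(n) if k != i and k != j)

    new_W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                delta = p[i] * p[j] - p[i] * h(j, i) - h(i, j) * p[j]
                new_W[i][j] = W[i][j] + delta / n
    return new_W

def recall(W, state, steps=5):
    """Synchronous sign updates starting from a (possibly corrupted) state."""
    n = len(state)
    s = list(state)
    for _ in range(steps):
        s = [1 if sum(W[i][j] * s[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return s

# Usage: store one pattern, flip one element, and retrieve the original.
pattern = [1, -1, 1, -1, 1]
zero = [[0.0] * 5 for _ in range(5)]
W_hebb = hebbian_update(zero, pattern)
W_stork = storkey_update(zero, pattern)
noisy = [-1, -1, 1, -1, 1]
restored = recall(W_hebb, noisy)
```

Starting from zero weights, the Storkey local fields are all zero, so the first stored pattern gives the same matrix as the Hebbian rule. The two rules differ only once earlier patterns are already present in `W`.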
Bayesian Continual Learning via Spiking Neural Networks
Among the main features of biological intelligence are energy efficiency,
capacity for continual adaptation, and risk management via uncertainty
quantification. Neuromorphic engineering has been thus far mostly driven by the
goal of implementing energy-efficient machines that take inspiration from the
time-based computing paradigm of biological brains. In this paper, we take
steps towards the design of neuromorphic systems that are capable of adaptation
to changing learning tasks, while producing well-calibrated uncertainty
quantification estimates. To this end, we derive online learning rules for
spiking neural networks (SNNs) within a Bayesian continual learning framework.
In it, each synaptic weight is represented by parameters that quantify the
current epistemic uncertainty resulting from prior knowledge and observed data.
The proposed online rules update the distribution parameters in a streaming
fashion as data are observed. We instantiate the proposed approach for both
real-valued and binary synaptic weights. Experimental results using Intel's
Lava platform show the merits of Bayesian over frequentist learning in terms of
capacity for adaptation and uncertainty quantification.
Comment: Accepted for publication in Frontiers in Computational Neuroscience
Dimensions of Timescales in Neuromorphic Computing Systems
This article is a public deliverable of the EU project "Memory technologies
with multi-scale time constants for neuromorphic architectures" (MeMScales,
https://memscales.eu, Call ICT-06-2019 Unconventional Nanoelectronics, project
number 871371). This arXiv version is a verbatim copy of the deliverable
report, with administrative information stripped. It collects a wide and varied
assortment of phenomena, models, research themes and algorithmic techniques
that are connected with timescale phenomena in the fields of computational
neuroscience, mathematics, machine learning and computer science, with a bias
toward aspects that are relevant for neuromorphic engineering. It turns out
that this theme is very rich indeed and spreads out in many directions which
defy a unified treatment. We collected several dozen sub-themes, each of
which has been investigated in specialized settings (in the neurosciences,
mathematics, computer science and machine learning) and has been documented in
its own body of literature. The more we dived into this diversity, the more it
became clear that our first effort to compose a survey must remain sketchy and
partial. We conclude with a list of insights distilled from this survey which
give general guidelines for the design of future neuromorphic systems.