2,400 research outputs found

Eligibility Traces and Plasticity on Behavioral Time Scales: Experimental Support of neoHebbian Three-Factor Learning Rules

Author: Brea Johanni
Corneil Dane
Gerstner Wulfram
Lehmann Marco
Liakoni Vasiliki
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Most elementary behaviors such as moving the arm to grasp an object or walking into the next room to explore a museum evolve on the time scale of seconds; in contrast, neuronal action potentials occur on the time scale of a few milliseconds. Learning rules of the brain must therefore bridge the gap between these two different time scales. Modern theories of synaptic plasticity have postulated that the co-activation of pre- and postsynaptic neurons sets a flag at the synapse, called an eligibility trace, that leads to a weight change only if an additional factor is present while the flag is set. This third factor, signaling reward, punishment, surprise, or novelty, could be implemented by the phasic activity of neuromodulators or specific neuronal inputs signaling special events. While the theoretical framework has been developed over the last decades, experimental evidence in support of eligibility traces on the time scale of seconds has been collected only during the last few years. Here we review, in the context of three-factor rules of synaptic plasticity, four key experiments that support the role of synaptic eligibility traces in combination with a third factor as a biological implementation of neoHebbian three-factor learning rules.
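The mechanism described in this abstract (co-activation sets a decaying flag, and the weight changes only if a third factor arrives while the flag is still set) can be illustrated with a minimal sketch. This is not the authors' model: the function name `simulate_three_factor`, the exponential decay of the trace, and the parameter values `tau_e`, `lr`, and `dt` are all assumptions chosen for illustration.

```python
import math


def simulate_three_factor(coactivations, third_factor, tau_e=2.0, lr=0.1, dt=0.1):
    """Return the final synaptic weight under a toy three-factor rule.

    coactivations: sequence of 0/1 flags, 1 when pre- and postsynaptic
                   neurons are co-active at that time step.
    third_factor:  sequence of floats, the modulatory signal (e.g. reward)
                   at each time step.
    tau_e, lr, dt: hypothetical trace time constant (s), learning rate,
                   and step size (s); illustrative values only.
    """
    decay = math.exp(-dt / tau_e)  # per-step decay of the eligibility trace
    trace = 0.0
    weight = 0.0
    for hebb, modulator in zip(coactivations, third_factor):
        # Co-activation sets the flag; without it the flag fades away.
        trace = trace * decay + hebb
        # The weight moves only if the flag is set AND the third factor is present.
        weight += lr * modulator * trace
    return weight


# Co-activation at t = 0 s, reward arriving 1 s later (step 10 with dt = 0.1 s).
coactive = [1] + [0] * 29
reward = [0.0] * 30
reward[10] = 1.0
delayed_weight = simulate_three_factor(coactive, reward)
```

In this sketch a reward that arrives later finds a smaller trace and therefore produces a smaller weight change, and co-activation without any third factor leaves the weight untouched.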

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Directory of Open Access Journals

Frontiers - Publisher Connector

Reinforcement learning in populations of spiking neurons

Author: A Pouget
BB Averbeck
D Centonze
DE Rumelhart
EM Izhikevich
HS Seung
IR Fiete
JP Pfister
R Gütig
RC Foehring
Robert Urbanczik
RV Florian
S Wirth
Walter Senn
Publication date: 16/06/2008

Population coding is widely regarded as a key mechanism for achieving reliable behavioral responses in the face of neuronal variability. But in standard reinforcement learning a flip-side becomes apparent. Learning slows down with increasing population size since the global reinforcement becomes less and less related to the performance of any single neuron. We show that, in contrast, learning speeds up with increasing population size if feedback about the population response modulates synaptic plasticity in addition to global reinforcement. The two feedback signals (reinforcement and population-response signal) can be encoded by ambient neurotransmitter concentrations which vary slowly, yielding a fully online plasticity rule where the learning of a stimulus is interleaved with the processing of the subsequent one. The assumption of a single additional feedback mechanism therefore reconciles biological plausibility with efficient learning.

Crossref

Bern Open Repository and Information System (BORIS)

Nature Precedings

Short-term plasticity as cause-effect hypothesis testing in distal reward learning

Author: Soltoggio Andrea
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/09/2014

Asynchrony, overlaps and delays in sensory-motor signals introduce ambiguity as to which stimuli, actions, and rewards are causally related. Only the repetition of reward episodes helps distinguish true cause-effect relationships from coincidental occurrences. In the model proposed here, a novel plasticity rule employs short- and long-term changes to evaluate hypotheses on cause-effect relationships. Transient weights represent hypotheses that are consolidated in long-term memory only when they consistently predict or cause future rewards. The main objective of the model is to preserve existing network topologies when learning with ambiguous information flows. Learning is also improved by biasing the exploration of the stimulus-response space towards actions that in the past occurred before rewards. The model indicates under which conditions beliefs can be consolidated in long-term memory, suggests a solution to the plasticity-stability dilemma, and proposes an interpretation of the role of short-term plasticity. Comment: Biological Cybernetics, September 201

arXiv.org e-Print Archive

Loughborough University Institutional Repository

Crossref

Neuromodulation of Spike-Timing-Dependent Plasticity: Past, Present, and Future.

Author: Brzosko Zuzanna
Mierau Susanna B
Paulsen Ole
Publication venue: Neuron
Publication date: 21/08/2019

Spike-timing-dependent synaptic plasticity (STDP) is a leading cellular model for behavioral learning and memory with rich computational properties. However, the relationship between the millisecond-precision spike timing required for STDP and the much slower timescales of behavioral learning is not well understood. Neuromodulation offers an attractive mechanism to connect these different timescales, and there is now strong experimental evidence that STDP is under neuromodulatory control by acetylcholine, monoamines, and other signaling molecules. Here, we review neuromodulation of STDP, the underlying mechanisms, functional implications, and possible involvement in brain disorders.

Apollo (Cambridge)

Demonstrating Advantages of Neuromorphic Computation: A Pilot Study

Author: Akos F. Kungl
Andreas Grübl
Andreas Hartel
Arthur Heimbrecht
Christian Mauch
Christian Pehle
David Stöckel
Eric Müller
Gerd Kiene
Johannes Schemmel
Karlheinz Meier
Korbinian Schreiber
Mihai A. Petrovici
Sebastian Billaudelle
Syed Ahmed Aamir
Timo Wunderlich
Yannik Stradmann
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2019

Neuromorphic devices represent an attempt to mimic aspects of the brain's architecture and dynamics with the aim of replicating its hallmark functional capabilities in terms of computational power, robust learning and energy efficiency. We employ a single-chip prototype of the BrainScaleS 2 neuromorphic system to implement a proof-of-concept demonstration of reward-modulated spike-timing-dependent plasticity in a spiking network that learns to play the Pong video game by smooth pursuit. This system combines an electronic mixed-signal substrate for emulating neuron and synapse dynamics with an embedded digital processor for on-chip learning, which in this work also serves to simulate the virtual environment and learning agent. The analog emulation of neuronal membrane dynamics enables a 1000-fold acceleration with respect to biological real-time, with the entire chip operating on a power budget of 57 mW. Compared to an equivalent simulation using state-of-the-art software, the on-chip emulation is at least one order of magnitude faster and three orders of magnitude more energy-efficient. We demonstrate how on-chip learning can mitigate the effects of fixed-pattern noise, which is unavoidable in analog substrates, while making use of temporal variability for action exploration. Learning compensates for imperfections of the physical substrate, as manifested in neuronal parameter variability, by adapting synaptic weights to match the respective excitability of individual neurons. Comment: Added measurements with noise in NEST simulation, add notice about journal publication. Frontiers in Neuromorphic Engineering (2019)

arXiv.org e-Print Archive

Directory of Open Access Journals

Bern Open Repository and Information System (BORIS)

On-chip Few-shot Learning with Surrogate Gradient Descent on a Neuromorphic Processor

Author: Neftci Emre
Orchard Garrick
Shrestha Sumit Bam
Stewart Kenneth
Publication venue: eScholarship, University of California
Publication date: 10/10/2019

Recent work suggests that synaptic plasticity dynamics in biological models of neurons and neuromorphic hardware are compatible with gradient-based learning (Neftci et al., 2019). Gradient-based learning requires iterating several times over a dataset, which is both time-consuming and constrains the training samples to be independently and identically distributed. This is incompatible with learning systems that do not have boundaries between training and inference, such as in neuromorphic hardware. One approach to overcome these constraints is transfer learning, where a portion of the network is pre-trained and mapped into hardware and the remaining portion is trained online. Transfer learning has the advantage that pre-training can be accelerated offline if the task domain is known, and few samples of each class are sufficient for learning the target task at reasonable accuracies. Here, we demonstrate online surrogate gradient few-shot learning on Intel's Loihi neuromorphic research processor using features pre-trained with spike-based gradient backpropagation-through-time. Our experimental results show that the Loihi chip can learn gestures online using a small number of shots and achieve results that are comparable to the models simulated on a conventional processor.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Learning First-to-Spike Policies for Neuromorphic Control Using Policy Gradients

Author: Rajendran Bipin
Rosenfeld Bleema
Simeone Osvaldo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/02/2019

Artificial Neural Networks (ANNs) are currently being used as function approximators in many state-of-the-art Reinforcement Learning (RL) algorithms. Spiking Neural Networks (SNNs) have been shown to drastically reduce the energy consumption of ANNs by encoding information in sparse temporal binary spike streams, hence emulating the communication mechanism of biological neurons. Due to their low energy consumption, SNNs are considered to be important candidates as co-processors to be implemented in mobile devices. In this work, the use of SNNs as stochastic policies is explored under an energy-efficient first-to-spike action rule, whereby the action taken by the RL agent is determined by the occurrence of the first spike among the output neurons. A policy gradient-based algorithm is derived considering a Generalized Linear Model (GLM) for spiking neurons. Experimental results demonstrate the capability of online trained SNNs as stochastic policies to gracefully trade off energy consumption, as measured by the number of spikes, against control performance. Significant gains are shown as compared to the standard approach of converting an offline trained ANN into an SNN. Comment: Submitted for conference publication
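The first-to-spike action rule mentioned in this abstract can be illustrated with a small sketch of the decoding step only. It does not reproduce the paper's GLM neurons or its policy-gradient update; the function name `first_to_spike`, the binary spike-train representation, and the tie-breaking by lowest neuron index are assumptions made for illustration.

```python
def first_to_spike(spike_trains):
    """Pick the action whose output neuron spikes first.

    spike_trains: list with one entry per output neuron (one neuron per
                  action); each entry is a list of 0/1 values over
                  discrete time steps, all of the same length.
    Returns (action_index, time_step) of the earliest spike, or
    (None, None) if no output neuron spikes at all.
    Ties within a time step go to the lowest neuron index (an assumption).
    """
    n_steps = len(spike_trains[0]) if spike_trains else 0
    for t in range(n_steps):
        for neuron, train in enumerate(spike_trains):
            if train[t]:
                # Decoding can stop at the first spike, so later spikes
                # never need to be generated or counted.
                return neuron, t
    return None, None


# Neuron 1 spikes at step 1, before neuron 0 spikes at step 2.
action, step = first_to_spike([[0, 0, 1], [0, 1, 0]])
```

Because decoding stops at the earliest spike, the number of spikes needed per decision stays small, which is the sense in which the abstract calls the rule energy-efficient.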

arXiv.org e-Print Archive

Crossref

King's Research Portal