Hebbian fast plasticity and working memory
Theories and models of working memory (WM) have, since at least the mid-1990s,
been dominated by the persistent activity hypothesis. The past decade has seen
rising concerns about the shortcomings of sustained activity as the mechanism
for short-term maintenance of WM information, in light of accumulating
experimental evidence for so-called activity-silent WM and the fundamental
difficulty of explaining robust multi-item WM. In consequence, alternative
theories are now being explored, mostly in the direction of fast synaptic
plasticity as the underlying mechanism. The question of non-Hebbian vs. Hebbian
synaptic plasticity arises naturally in this context. In this review we focus
on fast Hebbian plasticity and trace the origins of WM theories and models
built on this form of associative learning.
Benchmarking Hebbian learning rules for associative memory
Associative memory or content addressable memory is an important component
function in computer science and information processing and is a key concept in
cognitive and computational brain science. Many different neural network
architectures and learning rules have been proposed to model associative memory
of the brain while investigating key functions like pattern completion and
rivalry, noise reduction, and storage capacity. A less investigated but
important function is prototype extraction where the training set comprises
pattern instances generated by distorting prototype patterns and the task of
the trained network is to recall the correct prototype pattern given a new
instance. In this paper we characterize these different aspects of associative
memory performance and benchmark six different learning rules on storage
capacity and prototype extraction. We consider only models with Hebbian
plasticity that operate on sparse distributed representations with unit
activities in the interval [0,1]. We evaluate both non-modular and modular
network architectures and compare performance when trained and tested on
different kinds of sparse random binary pattern sets, including correlated
ones. We show that covariance learning has a robust but low storage capacity
under these conditions, and that the Bayesian Confidence Propagation learning
rule (BCPNN) is superior by a good margin in all cases except one, reaching a
composite score three times higher than that of the second-best learning rule tested.
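As a toy illustration of the kind of Hebbian rule benchmarked here, one common form of the BCPNN weight between units i and j is w_ij = log(p_ij / (p_i p_j)) with a bias log p_j, where the probabilities are estimated from the training patterns. The sketch below is our minimal construction, not the paper's benchmark code; the smoothing constant eps and the zero activation threshold in recall are illustrative choices. It stores two sparse binary patterns and recalls a full pattern from a partial cue:

```python
import math

def bcpnn_weights(patterns, eps=1e-3):
    """Estimate BCPNN-style weights w_ij = log(p_ij / (p_i p_j)) and
    biases log(p_j) from a list of binary patterns."""
    n = len(patterns[0])
    m = len(patterns)
    # unit and pairwise activation probabilities, floored at eps to avoid log(0)
    p = [max(eps, sum(x[i] for x in patterns) / m) for i in range(n)]
    pij = [[max(eps, sum(x[i] * x[j] for x in patterns) / m)
            for j in range(n)] for i in range(n)]
    w = [[math.log(pij[i][j] / (p[i] * p[j])) for j in range(n)]
         for i in range(n)]
    bias = [math.log(p[j]) for j in range(n)]
    return w, bias

def recall(w, bias, cue):
    """One synchronous update step: activate units with positive support."""
    n = len(cue)
    s = [bias[j] + sum(w[i][j] * cue[i] for i in range(n)) for j in range(n)]
    return [1 if s[j] > 0.0 else 0 for j in range(n)]
```

With two disjoint sparse patterns stored, cueing with an incomplete pattern recovers the missing active units, since within-pattern weights are positive (log 2) while cross-pattern weights are strongly negative.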
Learning representations in Bayesian Confidence Propagation neural networks
Unsupervised learning of hierarchical representations has been one of the
most vibrant research directions in deep learning during recent years. In this
work we study biologically inspired unsupervised strategies in neural networks
based on local Hebbian learning. We propose new mechanisms to extend the
Bayesian Confidence Propagation Neural Network (BCPNN) architecture, and
demonstrate their capability for unsupervised learning of salient hidden
representations when tested on the MNIST dataset.
Characterizing Deep-Learning I/O Workloads in TensorFlow
The performance of Deep-Learning (DL) computing frameworks relies on the
performance of data ingestion and checkpointing. In fact, during training,
a considerably large number of relatively small files are first loaded and
pre-processed on CPUs and then moved to accelerators for computation. In
addition, checkpointing and restart operations are carried out to allow DL
computing frameworks to restart quickly from a checkpoint. Because of this, I/O
affects the performance of DL applications. In this work, we characterize the
I/O performance and scaling of TensorFlow, an open-source programming framework
developed by Google and specifically designed for solving DL problems. To
measure TensorFlow I/O performance, we first design a micro-benchmark to
measure TensorFlow reads, and then use a TensorFlow mini-application based on
AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow.
To improve the checkpointing performance, we design and implement a burst
buffer. We find that increasing the number of threads increases TensorFlow
bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use
of the TensorFlow prefetcher results in a complete overlap of computation on
the accelerator and the input pipeline on the CPU, eliminating the effective cost of I/O on
the overall performance. The use of a burst buffer to checkpoint to fast,
small-capacity storage and asynchronously copy the checkpoints to slower,
large-capacity storage resulted in a performance improvement of 2.6x with
respect to checkpointing directly to the slower storage on our benchmark
environment.
Comment: Accepted for publication at pdsw-DISCS 201
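The burst-buffer idea described above (checkpoint synchronously to fast, small-capacity storage, then drain the checkpoint asynchronously to slower, large-capacity storage) can be sketched in a few lines. This is a minimal stdlib illustration, not the paper's implementation; the directory layout, file naming, and byte-string checkpoint format are our assumptions:

```python
import os
import shutil
import threading

def checkpoint(state: bytes, step: int, burst_dir: str, slow_dir: str) -> threading.Thread:
    """Write a checkpoint to the fast burst buffer, then drain it to slow
    storage on a background thread so the training loop is not blocked."""
    fast_path = os.path.join(burst_dir, f"ckpt-{step}")
    with open(fast_path, "wb") as f:
        f.write(state)                       # synchronous write to fast storage
    drain = threading.Thread(
        target=shutil.copy,
        args=(fast_path, os.path.join(slow_dir, f"ckpt-{step}")),
    )
    drain.start()                            # asynchronous drain to slow storage
    return drain                             # join() before evicting the buffer slot
```

In a real setting one would also bound the number of in-flight drains and fsync before reporting a checkpoint as durable; the sketch only shows the overlap of training with the slow copy.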
Stimulus detection rate and latency, firing rates and 1–40Hz oscillatory power are modulated by infra-slow fluctuations in a bistable attractor network model
Recordings of membrane and field potentials, firing rates, and oscillation amplitude dynamics show that neuronal activity levels in cortical and subcortical structures exhibit infra-slow fluctuations (ISFs) on time scales from seconds to hundreds of seconds. Similar ISFs are also salient in blood-oxygenation-level-dependent (BOLD) signals as well as in psychophysical time series. The functional consequences of ISFs are not fully understood. Here, they were investigated, along with their dynamical implications, in large-scale simulations of cortical network activity. For this purpose, a biophysically detailed hierarchical attractor network model displaying bistability and operating in an oscillatory regime was used. ISFs were imposed as slow fluctuations in either the amplitude or the frequency of fast synaptic noise. We found that both mechanisms produced an ISF component in the synthetic local field potentials (LFPs) and modulated the power of 1–40 Hz oscillations. Crucially, in a simulated threshold-stimulus detection task (TSDT), these ISFs were strongly correlated with stimulus detection probabilities and latencies. The results thus show that several phenomena observed in many empirical studies emerge concurrently in the model dynamics, which yields mechanistic insight into how infra-slow excitability fluctuations in large-scale neuronal networks may modulate fast oscillations and perceptual processing. The model also makes several novel predictions that can be experimentally tested in future studies.
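One way to see how noise-amplitude fluctuations can modulate detection is a one-dimensional caricature (ours, far simpler than the paper's biophysically detailed network): a bistable unit dx = (x - x^3 + I(t)) dt + sigma dW rests in its "down" well at x = -1 and receives a brief subthreshold stimulus pulse I. Whether it escapes to the "up" well (a "detection") depends on the noise amplitude sigma, so a slow modulation of sigma translates into slow fluctuations of the detection rate. All parameter values below are illustrative:

```python
import math
import random

def detected(sigma, rng, stim=0.2, dt=0.01, steps=500, pulse=50):
    """Euler-Maruyama simulation of one trial: a subthreshold stimulus pulse
    tilts the double-well potential; return True if noise carries the state
    over the barrier at x = 0 within the trial."""
    x = -1.0
    for k in range(steps):
        drive = stim if k < pulse else 0.0
        x += (x - x**3 + drive) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if x > 0.0:
            return True   # crossed the barrier: stimulus "detected"
    return False

def detection_rate(sigma, trials=200, seed=0):
    """Fraction of trials detected at a given noise amplitude."""
    rng = random.Random(seed)
    return sum(detected(sigma, rng) for _ in range(trials)) / trials
```

With the stimulus held fixed, a higher noise amplitude yields a higher detection rate, which is the qualitative link between infra-slow excitability fluctuations and detection probability that the abstract describes.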
Gamma and beta bursts during working memory readout suggest roles in its volitional control
Working memory (WM) activity is not as stationary or sustained as previously thought. There are brief bursts of gamma (~50-120 Hz) and beta (~20-35 Hz) oscillations, the former linked to stimulus information in spiking. We examined these dynamics in relation to readout and control mechanisms of WM. Monkeys held sequences of two objects in WM to match to subsequent sequences. Changes in beta and gamma bursting suggested their distinct roles. In anticipation of having to use an object for the match decision, there was an increase in gamma and spiking information about that object and reduced beta bursting. This readout signal was only seen before relevant test objects, and was related to premotor activity. When the objects were no longer needed, beta increased and gamma decreased together with object spiking information. Deviations from these dynamics predicted behavioral errors. Thus, beta could regulate gamma and the information in WM.
National Institute of Mental Health (U.S.) (Grant R37MH087027); United States Office of Naval Research (Grant N00014-16-1-2832)