161 research outputs found
Back-Propagation of Physiological Action Potential Output in Dendrites of Slender-Tufted L5A Pyramidal Neurons
Pyramidal neurons of layer 5A are a major neocortical output type and clearly distinguished from layer 5B pyramidal neurons with respect to morphology, in vivo firing patterns, and connectivity; yet knowledge of their dendritic properties is scant. We used a combination of whole-cell recordings and Ca2+ imaging techniques in vitro to explore the specific dendritic signaling role of physiological action potential patterns recorded in vivo in layer 5A pyramidal neurons of the whisker-related ‘barrel cortex’. Our data provide evidence that the temporal structure of physiological action potential patterns is crucial for an effective invasion of the main apical dendrites up to the major branch point. Both the critical frequency enabling action potential trains to invade efficiently and the dendritic calcium profile changed during postnatal development. In contrast to the main apical dendrite, the more passive properties of the short basal and apical tuft dendrites prevented an efficient back-propagation. Various Ca2+ channel types contributed to the enhanced calcium signals during high-frequency firing activity, whereas A-type K+ and BKCa channels strongly suppressed them. Our data support models in which the interaction of synaptic input with action potential output is a function of the timing, rate and pattern of action potentials, and dendritic location.
On the reversed bias-variance tradeoff in deep ensembles
Deep ensembles aggregate predictions of diverse neural networks to improve generalisation and quantify uncertainty. Here, we investigate their behavior when increasing the ensemble members’ parameter size, a practice typically associated with better performance for single models. We show that under practical assumptions in the overparametrized regime far into the double descent curve, not only the ensemble test loss degrades, but common out-of-distribution detection and calibration metrics suffer as well. Reminiscent of deep double descent, we observe this phenomenon not only when increasing the single member’s capacity but also as we increase the training budget, suggesting deep ensembles can benefit from early stopping. This sheds light on the success and failure modes of deep ensembles and suggests that averaging finite-width models performs better than the neural tangent kernel limit for these metrics.
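As a rough illustration of what "aggregating predictions" can mean here, the sketch below averages per-member class probabilities and computes the entropy of the averaged prediction as a simple uncertainty score. The function names (`softmax`, `ensemble_predict`, `predictive_entropy`) are hypothetical, and probability averaging is an assumed aggregation rule; the abstract does not state which rule or metrics the authors used.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(member_logits):
    """Average the class probabilities of all ensemble members.

    member_logits: one list of logits per member (assumed layout).
    """
    probs = [softmax(logits) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_members for c in range(n_classes)]

def predictive_entropy(p):
    """Entropy of a probability vector; higher means more uncertain."""
    return -sum(q * math.log(q) for q in p if q > 0)

# Two members that disagree strongly yield a flatter averaged prediction.
avg = ensemble_predict([[4.0, 0.0], [0.0, 4.0]])
uncertainty = predictive_entropy(avg)
```

In this toy case the two members cancel out, so the averaged prediction is uniform and the entropy is at its maximum for two classes.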
Bio-inspired, task-free continual learning through activity regularization
The ability to sequentially learn multiple tasks without forgetting is a key skill of biological brains, whereas it represents a major challenge to the field of deep learning. To avoid catastrophic forgetting, various continual learning (CL) approaches have been devised. However, these usually require discrete task boundaries. This requirement seems biologically implausible and often limits the application of CL methods in the real world where tasks are not always well defined. Here, we take inspiration from neuroscience, where sparse, non-overlapping neuronal representations have been suggested to prevent catastrophic forgetting. As in the brain, we argue that these sparse representations should be chosen on the basis of feedforward (stimulus-specific) as well as top-down (context-specific) information. To implement such selective sparsity, we use a bio-plausible form of hierarchical credit assignment known as Deep Feedback Control (DFC) and combine it with a winner-take-all sparsity mechanism. In addition to sparsity, we introduce lateral recurrent connections within each layer to further protect previously learned representations. We evaluate the new sparse-recurrent version of DFC on the split-MNIST computer vision benchmark and show that only the combination of sparsity and intra-layer recurrent connections improves CL performance with respect to standard backpropagation. Our method achieves similar performance to well-known CL methods, such as Elastic Weight Consolidation and Synaptic Intelligence, without requiring information about task boundaries. Overall, we showcase the idea of adopting computational principles from the brain to derive new, task-free learning algorithms for CL.
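A winner-take-all sparsity mechanism can be pictured with the generic k-winners-take-all sketch below, which keeps the k largest activations in a layer and silences the rest. This is only an assumed, textbook-style variant: the function name `k_winner_take_all` and the parameter `k` are hypothetical, and the abstract does not say how the authors select winners or how top-down signals enter the selection.

```python
def k_winner_take_all(activations, k):
    """Keep the k largest activations and set all others to zero."""
    if k <= 0:
        return [0.0 for _ in activations]
    # Indices sorted by activation, largest first.
    ranked = sorted(range(len(activations)),
                    key=lambda i: activations[i],
                    reverse=True)
    winners = set(ranked[:k])
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]

# Only the two most active units stay on.
sparse = k_winner_take_all([0.1, 0.9, 0.5, 0.3], k=2)
```

If different inputs or contexts activate different winners, the resulting representations overlap little, which is the property the abstract links to reduced forgetting.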
Unsupervised Musical Object Discovery from Audio
Current object-centric learning models such as the popular SlotAttention
architecture allow for unsupervised visual scene decomposition. Our novel
MusicSlots method adapts SlotAttention to the audio domain, to achieve
unsupervised music decomposition. Since concepts of opacity and occlusion in
vision have no auditory analogues, the softmax normalization of alpha masks in
the decoders of visual object-centric models is not well-suited for decomposing
audio objects. MusicSlots overcomes this problem. We introduce a
spectrogram-based multi-object music dataset tailored to evaluate
object-centric learning on Western tonal music. MusicSlots achieves good
performance on unsupervised note discovery and outperforms several established
baselines on supervised note property prediction tasks.
Comment: Accepted to Machine Learning for Audio Workshop, NeurIPS 202
Continual Learning in Recurrent Neural Networks with Hypernetworks
The last decade has seen a surge of interest in continual learning (CL), and
a variety of methods have been developed to alleviate catastrophic forgetting.
However, most prior work has focused on tasks with static data, while CL on
sequential data has remained largely unexplored. Here we address this gap in
two ways. First, we evaluate the performance of established CL methods when
applied to recurrent neural networks (RNNs). We primarily focus on elastic
weight consolidation, which is limited by a stability-plasticity trade-off, and
explore the particularities of this trade-off when using sequential data. We
show that high working memory requirements, but not necessarily sequence
length, lead to an increased need for stability at the cost of decreased
performance on subsequent tasks. Second, to overcome this limitation we employ
a recent method based on hypernetworks and apply it to RNNs to address
catastrophic forgetting on sequential data. By generating the weights of a main
RNN in a task-dependent manner, our approach disentangles stability and
plasticity, and outperforms alternative methods in a range of experiments.
Overall, our work provides several key insights on the differences between CL
in feedforward networks and in RNNs, while offering a novel solution to
effectively tackle CL on sequential data.
Comment: 13 pages and 4 figures in the main text; 20 pages and 2 figures in
the supplementary material
Learning cortical hierarchies with temporal Hebbian updates
A key driver of mammalian intelligence is the ability to represent incoming sensory information across multiple abstraction levels. For example, in the visual ventral stream, incoming signals are first represented as low-level edge filters and then transformed into high-level object representations. Similar hierarchical structures routinely emerge in artificial neural networks (ANNs) trained for object recognition tasks, suggesting that similar structures may underlie biological neural networks. However, the classical ANN training algorithm, backpropagation, is considered biologically implausible, and thus alternative biologically plausible training methods have been developed such as Equilibrium Propagation, Deep Feedback Control, Supervised Predictive Coding, and Dendritic Error Backpropagation. Several of those models propose that local errors are calculated for each neuron by comparing apical and somatic activities. Nevertheless, from a neuroscience perspective, it is not clear how a neuron could compare compartmental signals. Here, we propose a solution to this problem: we let the apical feedback signal change the postsynaptic firing rate and combine this with a differential Hebbian update, a rate-based version of classical spike-timing-dependent plasticity (STDP). We prove that weight updates of this form minimize two alternative loss functions that we prove to be equivalent to the error-based losses used in machine learning: the inference latency and the amount of top-down feedback necessary. Moreover, we show that the use of differential Hebbian updates works similarly well in other feedback-based deep learning frameworks such as Predictive Coding or Equilibrium Propagation. Finally, our work removes a key requirement of biologically plausible models for deep learning and proposes a learning mechanism that would explain how temporal Hebbian learning rules can implement supervised hierarchical learning.
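A differential Hebbian update is commonly written as a weight change proportional to the presynaptic rate times the temporal change of the postsynaptic rate. The sketch below uses that generic form with a finite-difference derivative; the function name, the learning rate `lr`, and the time step `dt` are hypothetical, and the exact rule analysed in the paper may differ.

```python
def differential_hebbian_update(weights, pre_rates, post_prev, post_now,
                                lr=0.01, dt=1.0):
    """One generic differential Hebbian step (assumed form).

    delta_w_i = lr * pre_i * (post_now - post_prev) / dt
    """
    # Finite-difference estimate of the postsynaptic rate derivative.
    dpost_dt = (post_now - post_prev) / dt
    return [w + lr * x * dpost_dt for w, x in zip(weights, pre_rates)]

# Feedback raises the postsynaptic rate from 0.0 to 1.0, so active
# inputs are potentiated in proportion to their presynaptic rate.
new_weights = differential_hebbian_update(
    [0.0, 0.0], [1.0, 2.0], post_prev=0.0, post_now=1.0, lr=0.1)
```

Under this form the weights stop changing once feedback no longer shifts the postsynaptic rate, which matches the intuition that learning ends when no top-down correction is needed.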
Homomorphism AutoEncoder — Learning Group Structured Representations from Observed Transitions
How agents can learn internal models that veridically represent interactions with the real world is a largely open question. As machine learning is moving towards representations containing not just observational but also interventional knowledge, we study this problem using tools from representation learning and group theory. We propose methods enabling an agent acting upon the world to learn internal representations of sensory information that are consistent with actions that modify it. We use an autoencoder equipped with a group representation acting on its latent space, trained using an equivariance-derived loss in order to enforce a suitable homomorphism property on the group representation. In contrast to existing work, our approach does not require prior knowledge of the group and does not restrict the set of actions the agent can perform. We motivate our method theoretically, and show empirically that it can learn a group representation of the actions, thereby capturing the structure of the set of transformations applied to the environment. We further show that this allows agents to predict the effect of sequences of future actions with improved accuracy.
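The homomorphism property and an equivariance-style loss can be illustrated with a toy case in which actions are planar rotations and the latent space is two-dimensional. Everything below is an assumption made for illustration (the rotation group, the helper names, the squared-error loss); in the paper the group representation is learned rather than given.

```python
import math

def rotation(theta):
    """2x2 rotation matrix, used here as a stand-in representation rho(g)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, z):
    """Apply matrix m to latent vector z."""
    return [sum(m[i][k] * z[k] for k in range(len(z))) for i in range(len(m))]

def equivariance_loss(rho_g, z_t, z_next):
    """Squared error between rho(g) z_t and the latent after action g."""
    pred = apply(rho_g, z_t)
    return sum((p - q) ** 2 for p, q in zip(pred, z_next))

# Homomorphism: rho(g1) rho(g2) equals rho(g1 g2); for rotations the
# composed action has angle 0.3 + 0.4.
composed = matmul(rotation(0.3), rotation(0.4))
direct = rotation(0.7)

# A quarter turn maps the latent [1, 0] to [0, 1], so the loss is near 0.
loss = equivariance_loss(rotation(math.pi / 2), [1.0, 0.0], [0.0, 1.0])
```

Because composed actions correspond to matrix products, a representation satisfying this property can predict the latent state after a whole sequence of actions by multiplying the matrices.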