Back-Propagation of Physiological Action Potential Output in Dendrites of Slender-Tufted L5A Pyramidal Neurons
Pyramidal neurons of layer 5A are a major neocortical output type and are clearly distinguished from layer 5B pyramidal neurons with respect to morphology, in vivo firing patterns, and connectivity; yet knowledge of their dendritic properties is scant. We used a combination of whole-cell recordings and Ca2+ imaging techniques in vitro to explore the specific dendritic signaling role of physiological action potential patterns recorded in vivo in layer 5A pyramidal neurons of the whisker-related ‘barrel cortex’. Our data provide evidence that the temporal structure of physiological action potential patterns is crucial for an effective invasion of the main apical dendrites up to the major branch point. Both the critical frequency enabling action potential trains to invade efficiently and the dendritic calcium profile changed during postnatal development. In contrast to the main apical dendrite, the more passive properties of the short basal and apical tuft dendrites prevented efficient back-propagation. Various Ca2+ channel types contributed to the enhanced calcium signals during high-frequency firing activity, whereas A-type K+ and BKCa channels strongly suppressed it. Our data support models in which the interaction of synaptic input with action potential output is a function of the timing, rate and pattern of action potentials, and dendritic location.
On the reversed bias-variance tradeoff in deep ensembles
Deep ensembles aggregate predictions of diverse neural networks to improve generalisation and quantify uncertainty. Here, we investigate their behavior when increasing the ensemble members' parameter size, a practice typically associated with better performance for single models. We show that under practical assumptions in the overparametrized regime far into the double descent curve, not only does the ensemble test loss degrade, but common out-of-distribution detection and calibration metrics suffer as well. Reminiscent of deep double descent, we observe this phenomenon not only when increasing the single member's capacity but also as we increase the training budget, suggesting that deep ensembles can benefit from early stopping. This sheds light on the success and failure modes of deep ensembles and suggests that averaging finite-width models performs better than the neural tangent kernel limit for these metrics.
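As a concrete illustration of the quantities studied here, the sketch below averages member predictions the way deep ensembles typically do and measures calibration with expected calibration error (ECE). The MLP architecture, the width parameter, and the bin count are illustrative assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

def make_mlp(width: int, in_dim: int = 784, n_classes: int = 10) -> nn.Module:
    # One hidden layer; `width` stands in for member parameter size.
    return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(),
                         nn.Linear(width, n_classes))

@torch.no_grad()
def ensemble_probs(members, x):
    # Deep ensembles average the members' softmax outputs.
    return torch.stack([m(x).softmax(-1) for m in members]).mean(0)

@torch.no_grad()
def expected_calibration_error(probs, labels, n_bins: int = 15):
    # Standard ECE: bin by confidence, compare per-bin accuracy to mean confidence.
    conf, pred = probs.max(-1)
    acc = pred.eq(labels).float()
    edges = torch.linspace(0, 1, n_bins + 1)
    ece = torch.zeros(())
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.float().mean() * (acc[in_bin].mean() - conf[in_bin].mean()).abs()
    return ece
```

Sweeping `width` (or the number of training epochs) across ensembles built this way, while tracking test loss and ECE, would trace out the trends the abstract describes.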
Unsupervised Musical Object Discovery from Audio
Current object-centric learning models such as the popular SlotAttention
architecture allow for unsupervised visual scene decomposition. Our novel
MusicSlots method adapts SlotAttention to the audio domain, to achieve
unsupervised music decomposition. Since concepts of opacity and occlusion in
vision have no auditory analogues, the softmax normalization of alpha masks in
the decoders of visual object-centric models is not well-suited for decomposing
audio objects. MusicSlots overcomes this problem. We introduce a
spectrogram-based multi-object music dataset tailored to evaluate
object-centric learning on western tonal music. MusicSlots achieves good
performance on unsupervised note discovery and outperforms several established
baselines on supervised note property prediction tasks.
Comment: Accepted to Machine Learning for Audio Workshop, NeurIPS 202
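To make the compositing issue concrete, the sketch below contrasts the standard alpha-mask decoder of visual object-centric models with an audio-friendly alternative. The additive combination of non-negative per-slot spectrograms is an illustrative assumption, not necessarily MusicSlots' exact rule.

```python
import torch

def compose_visual(slot_rgb, slot_alpha_logits):
    # Visual object-centric decoders: a per-pixel softmax over slots models
    # opacity/occlusion, so effectively one slot "wins" each pixel.
    alpha = slot_alpha_logits.softmax(dim=1)      # (B, K, 1, H, W)
    return (alpha * slot_rgb).sum(dim=1)          # (B, K, 3, H, W) -> (B, 3, H, W)

def compose_audio(slot_specs):
    # Notes in a magnitude spectrogram overlap additively rather than occlude
    # one another, so a per-bin softmax over slots is a poor fit; summing
    # non-negative per-slot spectrograms is one simple alternative.
    return torch.relu(slot_specs).sum(dim=1)      # (B, K, F, T) -> (B, F, T)
```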
Continual Learning in Recurrent Neural Networks with Hypernetworks
The last decade has seen a surge of interest in continual learning (CL), and
a variety of methods have been developed to alleviate catastrophic forgetting.
However, most prior work has focused on tasks with static data, while CL on
sequential data has remained largely unexplored. Here we address this gap in
two ways. First, we evaluate the performance of established CL methods when
applied to recurrent neural networks (RNNs). We primarily focus on elastic
weight consolidation, which is limited by a stability-plasticity trade-off, and
explore the particularities of this trade-off when using sequential data. We
show that high working memory requirements, but not necessarily sequence
length, lead to an increased need for stability at the cost of decreased
performance on subsequent tasks. Second, to overcome this limitation we employ
a recent method based on hypernetworks and apply it to RNNs to address
catastrophic forgetting on sequential data. By generating the weights of a main
RNN in a task-dependent manner, our approach disentangles stability and
plasticity, and outperforms alternative methods in a range of experiments.
Overall, our work provides several key insights on the differences between CL
in feedforward networks and in RNNs, while offering a novel solution to
effectively tackle CL on sequential data.
Comment: 13 pages and 4 figures in the main text; 20 pages and 2 figures in the supplementary material
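A minimal sketch of the hypernetwork approach follows: a learned task embedding is mapped to the full weight set of a vanilla RNN, and a regularizer keeps the weights generated for earlier tasks close to stored snapshots. The sizes, single-chunk output head, and Elman recurrence are simplifications assumed for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperRNN(nn.Module):
    def __init__(self, n_tasks, emb_dim, in_dim, hid_dim):
        super().__init__()
        self.task_emb = nn.Embedding(n_tasks, emb_dim)
        n_out = hid_dim * (in_dim + hid_dim + 1)        # W_in, W_rec and bias
        self.hnet = nn.Sequential(nn.Linear(emb_dim, 128), nn.ReLU(),
                                  nn.Linear(128, n_out))
        self.in_dim, self.hid_dim = in_dim, hid_dim

    def weights(self, task_id):
        # Generate all main-RNN weights from the task embedding
        # (task_id is a scalar long tensor, e.g. torch.tensor(k)).
        out = self.hnet(self.task_emb(task_id))
        i, h = self.in_dim, self.hid_dim
        W_in = out[:h * i].view(h, i)
        W_rec = out[h * i:h * i + h * h].view(h, h)
        b = out[h * i + h * h:]
        return W_in, W_rec, b

    def forward(self, x, task_id):                      # x: (T, B, in_dim)
        W_in, W_rec, b = self.weights(task_id)
        h = x.new_zeros(x.shape[1], self.hid_dim)
        for x_t in x:                                   # simple Elman recurrence
            h = torch.tanh(x_t @ W_in.T + h @ W_rec.T + b)
        return h

def hnet_regularizer(model, old_task_ids, saved_weights):
    # saved_weights holds detached copies of the generated weights, recorded
    # when each earlier task finished; penalizing drift of the hypernetwork
    # *outputs* is what protects old tasks instead of freezing the main RNN.
    loss = 0.0
    for tid, (W_in0, W_rec0, b0) in zip(old_task_ids, saved_weights):
        W_in, W_rec, b = model.weights(tid)
        loss = loss + F.mse_loss(W_in, W_in0) + F.mse_loss(W_rec, W_rec0) \
                    + F.mse_loss(b, b0)
    return loss
```

Training on task k then minimizes the task loss plus a scaled `hnet_regularizer` over tasks 0..k-1, so stability and plasticity are handled by separate terms.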
Fast two-layer two-photon imaging of neuronal cell populations using an electrically tunable lens
Functional two-photon Ca2+-imaging is a versatile tool to study the dynamics of neuronal populations in brain slices and living animals. However, population imaging is typically restricted to a single two-dimensional image plane. By introducing an electrically tunable lens into the excitation path of a two-photon microscope we were able to realize fast axial focus shifts within 15 ms. The maximum axial scan range was 0.7 mm employing a 40x NA0.8 water immersion objective, ample for the typically required ranges of 0.2–0.3 mm. By combining the axial scanning method with 2D acousto-optic frame scanning and random-access scanning, we measured neuronal population activity of about 40 neurons across two imaging planes separated by 40 μm and achieved scan rates up to 20–30 Hz. The method presented is easily applicable and allows upgrading of existing two-photon microscopes for fast 3D scanning.
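A back-of-the-envelope timing budget shows how these numbers combine. The 15 ms settling time is taken from the text; the per-plane frame time is an assumed illustrative value.

```python
# Two-plane volume rate with an electrically tunable lens (ETL).
etl_settle_s = 0.015   # axial focus shift time, from the text
frame_time_s = 0.010   # per-plane acquisition time (assumed, illustrative)
n_planes = 2

volume_time_s = n_planes * (frame_time_s + etl_settle_s)
print(f"two-plane rate = {1 / volume_time_s:.0f} Hz")  # 20 Hz, consistent with 20-30 Hz
```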
Minimizing Control for Credit Assignment with Strong Feedback
The success of deep learning ignited interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using learning rules fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing competitive performance to backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.
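The control-minimization view can be illustrated on a toy linear model: an integral controller drives the output to the label, and the weight update shrinks the control signal needed on the next presentation. This is a heavily simplified single-layer caricature assumed for illustration, not DFC itself, which uses multilayer dynamics and learned feedback connections.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim = 5, 3
W = rng.normal(scale=0.1, size=(out_dim, in_dim))   # forward weights
Q = np.eye(out_dim)                                 # feedback pathway (identity, assumed)

def controlled_step(x, y_target, W, eta=0.1, k=0.5, n_iter=200):
    u = np.zeros(out_dim)                           # control signal
    for _ in range(n_iter):
        y = W @ x + Q @ u                           # output under strong feedback
        u += k * (y_target - y)                     # integral control toward the label
    W = W + eta * np.outer(Q @ u, x)                # local update: reduce needed feedback
    return W

x, y_target = rng.normal(size=in_dim), rng.normal(size=out_dim)
for _ in range(50):
    W = controlled_step(x, y_target, W)
print(np.linalg.norm(W @ x - y_target))             # feedforward output nears the target
```

Note that the weights never see an explicit gradient of an output loss; they only see the control signal, which vanishes exactly when the feedforward sweep already produces the label.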
Neural networks with late-phase weights
The largely successful method of training neural networks is to learn their
weights using some variant of stochastic gradient descent (SGD). Here, we show
that the solutions found by SGD can be further improved by ensembling a subset
of the weights in late stages of learning. At the end of learning, we obtain
back a single model by taking a spatial average in weight space. To avoid
incurring increased computational costs, we investigate a family of
low-dimensional late-phase weight models which interact multiplicatively with
the remaining parameters. Our results show that augmenting standard models with
late-phase weights improves generalization in established benchmarks such as
CIFAR-10/100, ImageNet and enwik8. These findings are complemented with a
theoretical analysis of a noisy quadratic problem which provides a simplified
picture of the late phases of neural network learning.
Comment: 25 pages, 6 figure
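A minimal sketch of the idea, under the assumption that the low-dimensional late-phase parameters are per-member multiplicative gains on a shared layer: in the late phase, each minibatch updates the shared weights plus one member's gains, so members diverge through SGD noise, and at the end the members are collapsed by averaging in weight space.

```python
import torch
import torch.nn as nn

class LatePhaseLinear(nn.Module):
    # Shared weight matrix modulated by K cheap per-member gain vectors
    # (an illustrative choice of late-phase parameterization).
    def __init__(self, in_dim, out_dim, k_members=4):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        self.gains = nn.Parameter(torch.ones(k_members, out_dim))

    def forward(self, x, member: int):
        return self.base(x) * self.gains[member]

    @torch.no_grad()
    def collapse(self):
        # End of training: average members in weight space into a single model.
        # With a shared base, averaging the gains equals averaging the
        # members' effective weight matrices.
        g = self.gains.mean(0)
        self.base.weight.mul_(g.unsqueeze(1))
        self.base.bias.mul_(g)
        self.gains.fill_(1.0)
```

During the late phase one would draw a random `member` per minibatch and backpropagate through `forward(x, member)`; after `collapse()` the layer behaves as an ordinary `nn.Linear`.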