
    Back-Propagation of Physiological Action Potential Output in Dendrites of Slender-Tufted L5A Pyramidal Neurons

    Pyramidal neurons of layer 5A are a major neocortical output type and clearly distinguished from layer 5B pyramidal neurons with respect to morphology, in vivo firing patterns, and connectivity; yet knowledge of their dendritic properties is scant. We used a combination of whole-cell recordings and Ca2+ imaging techniques in vitro to explore the specific dendritic signaling role of physiological action potential patterns recorded in vivo in layer 5A pyramidal neurons of the whisker-related ‘barrel cortex’. Our data provide evidence that the temporal structure of physiological action potential patterns is crucial for an effective invasion of the main apical dendrites up to the major branch point. Both the critical frequency enabling action potential trains to invade efficiently and the dendritic calcium profile changed during postnatal development. In contrast to the main apical dendrite, the more passive properties of the short basal and apical tuft dendrites prevented an efficient back-propagation. Various Ca2+ channel types contributed to the enhanced calcium signals during high-frequency firing activity, whereas A-type K+ and BKCa channels strongly suppressed it. Our data support models in which the interaction of synaptic input with action potential output is a function of the timing, rate and pattern of action potentials, and dendritic location.

    On the reversed bias-variance tradeoff in deep ensembles

    Deep ensembles aggregate predictions of diverse neural networks to improve generalisation and quantify uncertainty. Here, we investigate their behavior when increasing the ensemble members’ parameter size, a practice typically associated with better performance for single models. We show that under practical assumptions in the overparametrized regime far into the double descent curve, not only does the ensemble test loss degrade, but common out-of-distribution detection and calibration metrics suffer as well. Reminiscent of deep double descent, we observe this phenomenon not only when increasing the single member’s capacity but also as we increase the training budget, suggesting that deep ensembles can benefit from early stopping. This sheds light on the success and failure modes of deep ensembles and suggests that averaging finite-width models performs better than the neural tangent kernel limit for these metrics.
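    As a rough illustration of the setup (not the paper's code), the sketch below trains a small deep ensemble on synthetic data and averages member softmax outputs; the member width, ensemble size, and the crude confidence proxy are illustrative assumptions, chosen only to show where one would track ensemble test loss and calibration as member capacity grows.

```python
# Minimal deep-ensemble sketch on synthetic data (illustrative, hedged).
# `width` is the capacity knob varied in the abstract; the ensemble
# prediction is the average of member softmax outputs.
import torch, torch.nn as nn, torch.nn.functional as F

def make_member(width, in_dim=20, n_classes=3):
    return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(), nn.Linear(width, n_classes))

def train(model, x, y, steps=200, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model

torch.manual_seed(0)
x, y = torch.randn(512, 20), torch.randint(0, 3, (512,))
x_test, y_test = torch.randn(256, 20), torch.randint(0, 3, (256,))

members = [train(make_member(width=64), x, y) for _ in range(5)]
with torch.no_grad():
    probs = torch.stack([F.softmax(m(x_test), dim=-1) for m in members]).mean(0)

test_nll = F.nll_loss(probs.log(), y_test)     # ensemble test loss
mean_conf = probs.max(-1).values.mean()        # crude calibration proxy
print(float(test_nll), float(mean_conf))
```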

    Unsupervised Musical Object Discovery from Audio

    Current object-centric learning models such as the popular SlotAttention architecture allow for unsupervised visual scene decomposition. Our novel MusicSlots method adapts SlotAttention to the audio domain, to achieve unsupervised music decomposition. Since the concepts of opacity and occlusion in vision have no auditory analogues, the softmax normalization of alpha masks in the decoders of visual object-centric models is not well suited to decomposing audio objects. MusicSlots overcomes this problem. We introduce a spectrogram-based multi-object music dataset tailored to evaluate object-centric learning on Western tonal music. MusicSlots achieves good performance on unsupervised note discovery and outperforms several established baselines on supervised note property prediction tasks. Comment: Accepted to the Machine Learning for Audio Workshop, NeurIPS 2023.
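    For reference, a minimal SlotAttention module in the standard formulation of Locatello et al. is sketched below; it operates on any (batch, tokens, features) input, e.g. flattened spectrogram patches. This is an assumed baseline showing only the slot-competition step on the encoder side, not the MusicSlots variant or the decoder whose softmax-normalized alpha masks are discussed above.

```python
# Minimal SlotAttention sketch (standard formulation, not MusicSlots itself).
import torch, torch.nn as nn

class SlotAttention(nn.Module):
    def __init__(self, num_slots, dim, iters=3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_sigma = nn.Parameter(torch.ones(1, 1, dim))
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, inputs):                     # inputs: (B, N, dim)
        B, _, D = inputs.shape
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_sigma * torch.randn(B, self.num_slots, D)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # softmax over slots: slots compete for each input token
            attn = torch.softmax(torch.einsum('bnd,bkd->bnk', k, q) * self.scale, dim=-1)
            attn = attn / (attn.sum(dim=1, keepdim=True) + 1e-8)   # weighted mean over inputs
            updates = torch.einsum('bnk,bnd->bkd', attn, v)
            slots = self.gru(updates.reshape(-1, D), slots.reshape(-1, D)).view(B, self.num_slots, D)
        return slots

# Usage: 4 slots over 100 spectrogram patches of 32 features each.
slots = SlotAttention(num_slots=4, dim=32)(torch.randn(2, 100, 32))
print(slots.shape)                                 # torch.Size([2, 4, 32])
```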

    Continual Learning in Recurrent Neural Networks with Hypernetworks

    The last decade has seen a surge of interest in continual learning (CL), and a variety of methods have been developed to alleviate catastrophic forgetting. However, most prior work has focused on tasks with static data, while CL on sequential data has remained largely unexplored. Here we address this gap in two ways. First, we evaluate the performance of established CL methods when applied to recurrent neural networks (RNNs). We primarily focus on elastic weight consolidation, which is limited by a stability-plasticity trade-off, and explore the particularities of this trade-off when using sequential data. We show that high working memory requirements, but not necessarily sequence length, lead to an increased need for stability at the cost of decreased performance on subsequent tasks. Second, to overcome this limitation we employ a recent method based on hypernetworks and apply it to RNNs to address catastrophic forgetting on sequential data. By generating the weights of a main RNN in a task-dependent manner, our approach disentangles stability and plasticity, and outperforms alternative methods in a range of experiments. Overall, our work provides several key insights on the differences between CL in feedforward networks and in RNNs, while offering a novel solution to effectively tackle CL on sequential data. Comment: 13 pages and 4 figures in the main text; 20 pages and 2 figures in the supplementary material
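    To make the hypernetwork idea concrete, here is a minimal sketch (an assumed architecture, not the paper's implementation): a learned task embedding is mapped by a small MLP to the full weight vector of an Elman RNN, so the RNN weights are generated per task while only the shared hypernetwork and the per-task embeddings need to be stored.

```python
# Minimal task-conditioned hypernetwork sketch for an Elman RNN (illustrative).
import torch, torch.nn as nn

class HyperRNN(nn.Module):
    def __init__(self, n_tasks, emb_dim=8, in_dim=4, hid_dim=16, out_dim=2):
        super().__init__()
        self.in_dim, self.hid_dim, self.out_dim = in_dim, hid_dim, out_dim
        n_weights = hid_dim * (in_dim + hid_dim + 1) + out_dim * (hid_dim + 1)
        self.task_emb = nn.Embedding(n_tasks, emb_dim)
        self.hypernet = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, n_weights))

    def forward(self, x, task_id):                 # x: (time, batch, in_dim)
        w = self.hypernet(self.task_emb(task_id)).squeeze(0)
        i = 0
        def take(*shape):                          # slice the flat weight vector
            nonlocal i
            n = 1
            for s in shape:
                n *= s
            out = w[i:i + n].view(*shape)
            i += n
            return out
        W_in  = take(self.hid_dim, self.in_dim)
        W_rec = take(self.hid_dim, self.hid_dim)
        b_h   = take(self.hid_dim)
        W_out = take(self.out_dim, self.hid_dim)
        b_out = take(self.out_dim)
        h = x.new_zeros(x.shape[1], self.hid_dim)
        for x_t in x:                              # unroll the generated RNN
            h = torch.tanh(x_t @ W_in.T + h @ W_rec.T + b_h)
        return h @ W_out.T + b_out

model = HyperRNN(n_tasks=3)
logits = model(torch.randn(10, 5, 4), torch.tensor([0]))   # task 0
print(logits.shape)                                          # torch.Size([5, 2])
```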

    Fast two-layer two-photon imaging of neuronal cell populations using an electrically tunable lens

    Functional two-photon Ca2+-imaging is a versatile tool to study the dynamics of neuronal populations in brain slices and living animals. However, population imaging is typically restricted to a single two-dimensional image plane. By introducing an electrically tunable lens into the excitation path of a two-photon microscope we were able to realize fast axial focus shifts within 15 ms. The maximum axial scan range was 0.7 mm employing a 40x NA 0.8 water immersion objective, more than sufficient for the typically required ranges of 0.2–0.3 mm. By combining the axial scanning method with 2D acousto-optic frame scanning and random-access scanning, we measured neuronal population activity of about 40 neurons across two imaging planes separated by 40 μm and achieved scan rates up to 20–30 Hz. The method presented is easily applicable and allows upgrading of existing two-photon microscopes for fast 3D scanning.

    Minimizing Control for Credit Assignment with Strong Feedback

    The success of deep learning ignited interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using learning rules fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing competitive performance to backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.
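    Below is a heavily simplified illustration of the control-minimization view, reduced to a single linear layer with an idealized controller (the actual DFC method uses a dynamical controller acting on a deep network): the controller injects whatever feedback is needed to push the output to the label, and the weight update is proportional to that feedback, so learning progressively reduces the amount of control required.

```python
# Toy sketch of "learning as control minimization" for one linear layer.
# Not the paper's algorithm; the setup and learning rate are assumptions.
import numpy as np

rng = np.random.default_rng(0)
W_true = rng.normal(size=(2, 5))   # unknown target mapping defining the labels
W = np.zeros((2, 5))               # learned weights
lr = 0.1

for step in range(500):
    x = rng.normal(size=5)
    y_target = W_true @ x
    y = W @ x
    u = y_target - y               # feedback the controller must inject to reach the label
    # Weight update proportional to controller signal times presynaptic activity:
    # the feedback needed on future presentations shrinks as learning proceeds.
    W += lr * np.outer(u, x)

print("remaining feedback magnitude:", np.linalg.norm(u))
```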

    Neural networks with late-phase weights

    The largely successful method of training neural networks is to learn their weights using some variant of stochastic gradient descent (SGD). Here, we show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning. At the end of learning, we obtain back a single model by taking a spatial average in weight space. To avoid incurring increased computational costs, we investigate a family of low-dimensional late-phase weight models which interact multiplicatively with the remaining parameters. Our results show that augmenting standard models with late-phase weights improves generalization on established benchmarks such as CIFAR-10/100, ImageNet and enwik8. These findings are complemented with a theoretical analysis of a noisy quadratic problem which provides a simplified picture of the late phases of neural network learning. Comment: 25 pages, 6 figures
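    A minimal sketch of the idea under simplifying assumptions (a single linear layer with per-unit multiplicative gains standing in for the late-phase weights; the paper studies several such families): during the late phase each step updates one sampled gain vector, and at the end the gains are averaged in weight space to recover a single model.

```python
# Illustrative late-phase-weights sketch (assumed simplification, not the paper's code).
import torch, torch.nn as nn, torch.nn.functional as F

class LatePhaseLinear(nn.Module):
    def __init__(self, in_dim, out_dim, k=4):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)              # shared base weights
        self.gains = nn.Parameter(torch.ones(k, out_dim))   # K late-phase components

    def forward(self, x, member=None):
        # member=None: average the components in weight space (final single model)
        g = self.gains.mean(0) if member is None else self.gains[member]
        return g * self.base(x)                              # multiplicative interaction

torch.manual_seed(0)
layer = LatePhaseLinear(10, 3, k=4)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
x, y = torch.randn(64, 10), torch.randint(0, 3, (64,))

for step in range(100):                                      # "late phase" of training
    m = torch.randint(0, 4, (1,)).item()                     # sample one component per step
    opt.zero_grad()
    F.cross_entropy(layer(x, member=m), y).backward()
    opt.step()

with torch.no_grad():
    final_logits = layer(x)                                  # averaged gains -> single model
print(final_logits.shape)
```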