    Theory of representation learning in cortical neural networks

    Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same functional unit, the neuron, and to develop through the same learning mechanism, synaptic plasticity. This motivates the conjecture of a unifying theory of cortical representation learning across sensory modalities. In this thesis we present theories and computational models of learning and optimization in neural networks, postulating functional properties of synaptic plasticity that support the apparent universal learning capacity of cortical networks.

    In the past decades, a variety of theories and models have been proposed to describe receptive field formation in sensory areas. They include normative models such as sparse coding, and bottom-up models such as spike-timing-dependent plasticity. We bring these candidate explanations together by demonstrating that a single principle is sufficient to explain receptive field development. First, we show that many representative models of sensory development are in fact implementing variations of a common principle: nonlinear Hebbian learning. Second, we reveal that nonlinear Hebbian learning is sufficient for receptive field formation through sensory inputs. A surprising result is that our findings are independent of specific model details and allow for robust predictions of the learned receptive fields. Thus nonlinear Hebbian learning and natural input statistics can account for many aspects of receptive field formation across models and sensory modalities.

    The Hebbian learning theory substantiates that synaptic plasticity can be interpreted as an optimization procedure implementing stochastic gradient descent. In stochastic gradient descent inputs arrive sequentially, as in sensory streams. However, individual data samples carry very little information about the correct learning signal, and a fundamental problem is to determine how many samples are required for reliable synaptic changes. Through estimation theory, we develop a novel adaptive learning rate model that adapts the magnitude of synaptic changes to the statistics of the learning signal, enabling an optimal use of data samples. Our model has a simple implementation and demonstrates improved learning speed, making it a promising candidate for large artificial neural network applications. The model also predicts how the cortex may modulate synaptic plasticity for optimal learning.

    The optimal sample size for reliable learning allows us to estimate optimal learning times for a given model. We apply this theory to derive analytical bounds on the time needed to optimize synaptic connections. First, we show that this optimization problem has exponentially many saddle points, which lead to small gradients and slow learning. Second, we show that the number of input synapses to a neuron modulates the magnitude of the initial gradient, determining the duration of learning. Our final result reveals that the learning duration increases supra-linearly with the number of synapses, suggesting an effective limit on synaptic connections and receptive field sizes in developing neural networks.
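    As an illustration of the principle the abstract names, here is a minimal sketch of a nonlinear Hebbian update for a single neuron. The cubic nonlinearity, the norm constraint, and the synthetic heavy-tailed input are illustrative assumptions, not details taken from the thesis:

        import numpy as np

        rng = np.random.default_rng(0)

        def nonlinear_hebbian(X, f, eta=0.01, epochs=10):
            """One-neuron nonlinear Hebbian learning: dw ~ f(w.x) x."""
            w = rng.standard_normal(X.shape[1])
            w /= np.linalg.norm(w)
            for _ in range(epochs):
                for x in X[rng.permutation(len(X))]:
                    y = f(w @ x)            # nonlinear postsynaptic response
                    w += eta * y * x        # Hebbian update
                    w /= np.linalg.norm(w)  # norm constraint keeps w bounded
            return w

        # With heavy-tailed ("sparse") input statistics along one direction,
        # the weight vector converges onto that direction, mimicking
        # receptive-field formation from natural stimuli.
        X = rng.standard_normal((5000, 20))
        X[:, 3] = rng.laplace(size=5000)           # one sparse direction
        w = nonlinear_hebbian(X, f=lambda u: u**3)
        print(np.argmax(np.abs(w)))                # -> 3

    Swapping in different nonlinearities f is, roughly, the sense in which many classical receptive-field models can be read as variants of one rule.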
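    The adaptive learning rate idea can be sketched in the same spirit. The signal-to-noise form below is an assumption for illustration (the thesis derives its rule from estimation theory); it scales each parameter's step by how consistent its gradient estimates are:

        import numpy as np

        class SNRAdaptiveSGD:
            """SGD with per-parameter rates scaled by an estimated gradient
            signal-to-noise ratio: noisy coordinates take small steps,
            consistent ones take large steps."""
            def __init__(self, n, eta=0.1, decay=0.99):
                self.m = np.zeros(n)  # running mean of the gradient
                self.v = np.ones(n)   # running variance of the gradient
                self.eta, self.decay = eta, decay

            def step(self, w, grad):
                d = self.decay
                self.m = d * self.m + (1 - d) * grad
                self.v = d * self.v + (1 - d) * (grad - self.m) ** 2
                snr = self.m**2 / (self.m**2 + self.v + 1e-12)  # in [0, 1]
                return w - self.eta * snr * self.m

        # Example: noisy quadratic; steps shrink where gradient samples disagree.
        opt = SNRAdaptiveSGD(n=2)
        w = np.array([1.0, 1.0])
        rng = np.random.default_rng(1)
        for _ in range(200):
            grad = 2 * w + rng.normal(0, [0.1, 5.0], size=2)
            w = opt.step(w, grad)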

    The Shallow and the Deep: A biased introduction to neural networks and old school machine learning

    The Shallow and the Deep is a collection of lecture notes that offers an accessible introduction to neural networks and machine learning in general. However, it was clear from the beginning that these notes would not be able to cover this rapidly changing and growing field in its entirety. The focus lies on classical machine learning techniques, with a bias towards classification and regression. Other learning paradigms and many recent developments in, for instance, Deep Learning are not addressed, or are only briefly touched upon. Biehl argues that a solid knowledge of the foundations of the field is essential, especially for anyone who wants to explore the world of machine learning with an ambition that goes beyond the application of some software package to some data set. Therefore, The Shallow and the Deep places emphasis on fundamental concepts and theoretical background. This also involves delving into the history and pre-history of neural networks, where the foundations for most of the recent developments were laid. These notes aim to demystify machine learning and neural networks without losing the appreciation for their impressive power and versatility.

    Renormalization group theory, scaling laws and deep learning

    The question of the possibility of intelligent machines is fundamentally intertwined with the machines' ability to reason. Or not. The developments of recent years point in a completely different direction: what we need are simple, generic but scalable algorithms that can keep learning on their own. This thesis is an attempt to find theoretical explanations for the findings of recent years, in which empirical evidence has been presented for phase transitions in neural networks, for power-law behavior of various quantities, and even for algorithmic universality. All of these are beautifully explained in the context of statistical physics, quantum field theory and statistical field theory, but not necessarily in the context of deep learning, where no complete theoretical framework is available. Inspired by these developments, and, as it turns out, with the overly ambitious goal of providing a solid theoretical explanation of the empirically observed power laws in neural networks, we set out to substantiate the claim that renormalization group theory may be the sought-after theory of deep learning that explains the above, as well as what we call algorithmic universality.
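    The power laws the thesis targets can be made concrete with a toy fit. The numbers below are synthetic (the exponent merely echoes the order of magnitude reported in empirical scaling-law studies); the point is that a power law L = a * N**(-alpha) is a straight line in log-log coordinates:

        import numpy as np

        # Hypothetical loss vs. parameter count; a real study would sweep
        # trained models of increasing size.
        N = np.array([1e5, 1e6, 1e7, 1e8, 1e9])
        L = 3.0 * N ** -0.076 * np.exp(np.random.default_rng(0).normal(0, 0.01, 5))

        # Linear regression in log-log space recovers the exponent.
        slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
        print(f"alpha ~= {-slope:.3f}, a ~= {np.exp(intercept):.2f}")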

    Analog VLSI circuit design of spike-timing-dependent synaptic plasticity

    Thesis (M.Eng.) by Joshua Jen C. Monzon, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 61-63).

    Synaptic plasticity is the ability of a synaptic connection to change in strength and is believed to be the basis for learning and memory. Two principal descriptions of synaptic plasticity exist. The first is spike-timing-dependent plasticity (STDP), a timing-based protocol in which the efficacy of synaptic connections is modulated by the relative timing between presynaptic and postsynaptic stimuli. The second is the Bienenstock-Cooper-Munro (BCM) learning rule, a classical rate-based protocol which states that the rate of presynaptic stimulation modulates the synaptic strength. Several theoretical models have been developed to explain the two forms of plasticity, but none of them came close to identifying the biophysical mechanism of plasticity. Other studies focused instead on developing neuromorphic systems of synaptic plasticity. These systems used simple curve-fitting methods that were able to reproduce some types of STDP, but still failed to shed light on the biophysical basis of STDP. Furthermore, none of these neuromorphic systems were able to reproduce the various forms of STDP and relate them to the BCM rule. However, a recent discovery resulted in a new unified model that explains the general biophysical process governing synaptic plasticity using fundamental ideas about the biochemical reactions and kinetics within the synapse. This model considers all types of STDP and relates them to the BCM rule, giving us a fresh approach to construct a system that overcomes the challenges existing neuromorphic systems have faced.

    Here, we propose a novel analog very-large-scale-integration (aVLSI) circuit that accurately captures the whole picture of synaptic plasticity based on the results of this unified model. Our circuit was tested for all types of STDP, and in each of these tests our design reproduced the results predicted by the unified model. The system requires two inputs: a glutamate signal that carries information about the presynaptic stimuli, and a dendritic action potential signal that contains information about the postsynaptic stimuli. These two inputs give rise to changes in the excitatory postsynaptic current, which represents the modifiable synaptic efficacy output. Finally, we also present several techniques and alternative circuit designs that will further improve the performance of our neuromorphic system.
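    For readers unfamiliar with STDP, a minimal pair-based kernel illustrates the timing dependence described above. This is the textbook phenomenological form (exponential windows), not the unified biophysical model the thesis circuit implements; the amplitudes and time constants are illustrative:

        import numpy as np

        A_PLUS, A_MINUS = 0.01, 0.012     # potentiation / depression amplitudes
        TAU_PLUS, TAU_MINUS = 20.0, 20.0  # time constants in ms

        def stdp_dw(t_pre, t_post):
            """Weight change for one pre/post spike pair (times in ms)."""
            dt = t_post - t_pre
            if dt > 0:   # pre before post: potentiation (LTP)
                return A_PLUS * np.exp(-dt / TAU_PLUS)
            else:        # post before pre: depression (LTD)
                return -A_MINUS * np.exp(dt / TAU_MINUS)

        print(stdp_dw(0.0, 10.0))   # LTP: +0.0061
        print(stdp_dw(10.0, 0.0))   # LTD: -0.0073

    In a rate-based protocol such as BCM, by contrast, the weight change depends on pre- and postsynaptic firing rates rather than on individual spike-pair timings.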