12 research outputs found

    The effects of pair-wise and higher order correlations on the firing rate of a post-synaptic neuron

    Coincident firing of neurons projecting to a common target cell is likely to raise the probability of firing of this post-synaptic cell. Therefore synchronized firing constitutes a significant event for post-synaptic neurons and is likely to play a role in neuronal information processing. Physiological data on synchronized firing in cortical networks are primarily based on paired recordings and cross-correlation analysis. However, pair-wise correlations among all inputs onto a post-synaptic neuron do not uniquely determine the distribution of simultaneous post-synaptic events. We develop a framework to calculate the amount of synchronous firing that, based on maximum entropy, should exist in a homogeneous neural network in which the neurons have known pair-wise correlations and higher order structure is absent. According to the distribution of maximal entropy, synchronous events in which a large proportion of the neurons participates should exist, even in the case of weak pair-wise correlations. Network simulations also exhibit these highly synchronous events in the case of weak pair-wise correlations. If such a group of neurons provides input to a common post-synaptic target, these network bursts may enhance the impact of this input, especially in the case of a high post-synaptic threshold. Unfortunately, the proportion of neurons participating in synchronous bursts can be approximated by our method only under restricted conditions. When these conditions are not fulfilled, the spike trains have less than maximal entropy, which is indicative of the presence of higher order structure. In this situation, the degree of synchronicity cannot be derived from the pair-wise correlations.
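
    The maximum entropy idea in this abstract can be illustrated with a minimal sketch. It assumes a homogeneous population of n binary neurons whose joint distribution has only a first-order term and a pair-wise term, so the number of simultaneously active neurons k has weight C(n, k) exp(h k + j k (k - 1) / 2). The function name and the parameters `h` and `j` are hypothetical, are not fitted to any data, and are not taken from the paper.

```python
import math

def count_distribution(n, h, j):
    """P(k) for k = 0..n active neurons in a homogeneous pair-wise maximum entropy model.

    h: bias shared by all neurons (assumed, illustrative).
    j: coupling shared by all neuron pairs (assumed, illustrative).
    """
    # Log weight of each count k: log C(n, k) + h*k + j * (number of active pairs).
    log_w = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + h * k + j * k * (k - 1) / 2.0
        for k in range(n + 1)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    z = sum(w)
    return [v / z for v in w]

if __name__ == "__main__":
    independent = count_distribution(n=10, h=-2.0, j=0.0)
    coupled = count_distribution(n=10, h=-2.0, j=0.3)
    # Probability that at least half of the population fires together.
    print(sum(independent[5:]), sum(coupled[5:]))
```

    With `j = 0` the model reduces to independent neurons and a binomial count distribution; a positive `j` at the same `h` shifts probability mass toward large synchronous events, although it also changes the mean firing rate, so this is not a rate-matched comparison.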

    A biologically plausible learning rule for deep learning in the brain

    Researchers have proposed that deep learning, which is driving important progress in a wide range of high complexity tasks, might inspire new insights into learning in the brain. However, the methods used for deep learning by artificial neural networks are biologically unrealistic and would need to be replaced by biologically realistic counterparts. Previous biologically plausible reinforcement learning rules, like AGREL and AuGMEnT, showed promising results but focused on shallow networks with three layers. Will these learning rules also generalize to networks with more layers and can they handle tasks of higher complexity? Here, we demonstrate that these learning schemes indeed generalize to deep networks, if we include an attention network that propagates information about the selected action to lower network levels. The resulting learning rule, called Q-AGREL, is equivalent to a particular form of error-backpropagation that trains one output unit at any one time. To demonstrate the utility of the learning scheme for larger problems, we trained networks with two hidden layers on the MNIST dataset, a standard and interesting Machine Learning task. Our results demonstrate that the capability of Q-AGREL is comparable to that of error backpropagation, although learning is 1.5-2 times slower because the network has to learn by trial-and-error and updates the action value of only one output unit at a time. Our results provide new insights into how deep learning can be implemented in the brain.
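
    The phrase "trains one output unit at any one time" can be made concrete with a rough sketch. It assumes a network with a single ReLU hidden layer and linear action-value outputs, and it is not the authors' implementation: the function name, the weight matrices `W1` and `W2`, and the learning rate `lr` are hypothetical. The error is computed for the selected action only, and the feedback weights of that one output unit gate the hidden-layer update.

```python
import numpy as np

def selected_unit_update(W1, W2, x, action, reward, lr=0.001):
    """One trial-and-error update that trains only the selected output unit.

    W1: hidden-layer weights, shape (hidden, inputs) (assumed layout).
    W2: output weights, shape (actions, hidden) (assumed layout).
    Returns the updated weights and the reward prediction error.
    """
    pre = W1 @ x                      # hidden pre-activations
    h = np.maximum(pre, 0.0)          # ReLU hidden activity
    q = W2 @ h                        # action values, one per output unit
    delta = reward - q[action]        # error of the selected unit only

    # Feedback from the selected unit marks the hidden units responsible for it.
    feedback = W2[action] * (pre > 0)

    new_W1 = W1 + lr * delta * np.outer(feedback, x)
    new_W2 = W2.copy()
    new_W2[action] += lr * delta * h  # all other output rows stay untouched
    return new_W1, new_W2, delta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W1 = rng.uniform(0.1, 0.5, size=(8, 4))
    W2 = rng.uniform(-0.5, 0.5, size=(3, 8))
    x = rng.uniform(0.1, 1.0, size=4)
    W1, W2, delta = selected_unit_update(W1, W2, x, action=1, reward=1.0)
    print("prediction error for the selected action:", delta)
```

    These updates are the gradient-descent step on the squared error of the selected unit alone, which is the sense in which such a rule matches error backpropagation restricted to one output unit.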

    Continuous-time on-policy neural reinforcement learning of working memory tasks

    As living organisms, one of our primary characteristics is the ability to rapidly process and react to unknown and unexpected events. To this end, we are able to recognize an event or a sequence of events and learn to respond properly. Despite advances in machine learning, current cognitive robotic systems are not able to rapidly and efficiently respond in the real world: the challenge is to learn to recognize both what is important, and also when to act. Reinforcement Learning (RL) is typically used to solve complex tasks: to learn the how. To respond quickly - to learn when - the environment has to be sampled often enough. For “enough”, a programmer has to decide on the step-size as a time-representation, choosing between a fine-grained representation of time (many state-transitions; difficult to learn with RL) and a coarse temporal resolution (easier to learn with RL but lacking precise timing). Here, we derive a continuous-time version of on-policy SARSA-learning in a working-memory neural network model, AuGMEnT. Using a neural working memory network resolves the what problem; our when solution is built on the notion that in the real world, instantaneous actions of duration dt are actually impossible. We demonstrate how we can decouple action duration from the internal time-steps in the neural RL model using an action selection system. The resultant CT-AuGMEnT successfully learns to react to the events of a continuous-time task, without any pre-imposed specifications about the duration of the events or the delays between them.
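
    For reference, the discrete-time on-policy SARSA update that this abstract starts from can be sketched as follows. This is the standard tabular rule, not CT-AuGMEnT itself; the function name, the state and action labels, and the default values of `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """Standard tabular SARSA step (discrete time, on-policy).

    Q maps (state, action) pairs to action values. The update uses the action
    a_next that the agent actually takes in s_next, which is what makes the
    rule on-policy.
    """
    td_error = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error

if __name__ == "__main__":
    Q = defaultdict(float)  # unseen state-action pairs start at 0.0
    td = sarsa_update(Q, s="cue", a="wait", r=1.0, s_next="delay", a_next="wait")
    print(td, Q[("cue", "wait")])
```

    In this discrete form every call corresponds to one state transition, so a finer step-size means more transitions, and more updates, per unit of real time; that is the trade-off the abstract describes.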

    Biologically plausible multi-dimensional reinforcement learning in neural networks

    How does the brain learn to map multi-dimensional sensory inputs to multi-dimensional motor outputs when it can only observe single rewards for the coordinated outputs of the whole network of neurons that make up the brain? We introduce Multi-AGREL, a novel, biologically plausible multi-layer neural network model for multi-dimensional reinforcement learning. We demonstrate that Multi-AGREL can learn non-linear mappings from inputs to multi-dimensional outputs by using only scalar reward feedback. We further show that in Multi-AGREL, the changes in the connection weights follow the gradient that minimizes global prediction error, and that all information required for synaptic plasticity is locally present.

    Attention-Gated Brain Propagation: How the brain can implement reward-based error backpropagation

    Much recent work has focused on biologically plausible variants of supervised learning algorithms. However, there is no teacher in the motor cortex that instructs the motor neurons and learning in the brain depends on reward and punishment. We demonstrate a biologically plausible reinforcement learning scheme for deep networks with an arbitrary number of layers. The network chooses an action by selecting a unit in the output layer and uses feedback connections to assign credit to the units in successively lower layers that are responsible for this action. After the choice, the network receives reinforcement and there is no teacher correcting the errors. We show how the new learning scheme – Attention-Gated Brain Propagation (BrainProp) – is mathematically equivalent to error backpropagation, for one output unit at a time. We demonstrate successful learning of deep fully connected, convolutional and locally connected networks on classical and hard image-classification benchmarks: MNIST, CIFAR10, CIFAR100 and Tiny ImageNet. BrainProp achieves an accuracy that is equivalent to that of standard error-backpropagation, and better than state-of-the-art biologically inspired learning schemes. Additionally, the trial-and-error nature of learning is associated with limited additional training time, so that BrainProp is 1-3.5 times slower. Our results thereby provide new insights into how deep learning may be implemented in the brain.