
    Minimizing Control for Credit Assignment with Strong Feedback

    Full text link
    The success of deep learning ignited interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using learning rules fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing competitive performance to backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.
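
    To make the control-minimization idea concrete, here is a minimal sketch, assuming a single linear layer and a plain proportional controller; the names (W, k_p, u) and the drastic simplification are illustrative, not the paper's DFC formulation. The loss never appears in the update: the weights simply learn to make the controller's intervention unnecessary.

        import numpy as np

        rng = np.random.default_rng(0)
        W = rng.normal(scale=0.1, size=(2, 3))    # forward weights
        x = rng.normal(size=3)                    # input pattern
        y_target = np.array([1.0, -1.0])          # supervised output label

        lr, k_p = 0.05, 2.0                       # learning rate, controller gain
        for _ in range(200):
            v = W @ x                             # feedforward activity
            u = k_p * (y_target - v)              # strong feedback drives v to the label
            W += lr * np.outer(u, x)              # local update shrinks the control needed
        print(np.linalg.norm(y_target - W @ x))   # residual feedback is now small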

    Learning Unmanned Aerial Vehicle Control for Autonomous Target Following

    Full text link
    While deep reinforcement learning (RL) methods have achieved unprecedented successes in a range of challenging problems, their applicability has been mainly limited to simulation or game domains due to the high sample complexity of the trial-and-error learning process. However, real-world robotic applications often need a data-efficient learning process with safety-critical constraints. In this paper, we consider the challenging problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. To acquire a strategy that combines perception and control, we represent the policy by a convolutional neural network. We develop a hierarchical approach that combines a model-free policy gradient method with a conventional feedback proportional-integral-derivative (PID) controller to enable stable learning without catastrophic failure. The neural network is trained by a combination of supervised learning from raw images and reinforcement learning from games of self-play. We show that the proposed approach can learn a target following policy in a simulator efficiently and the learned behavior can be successfully transferred to the DJI quadrotor platform for real-world UAV control.
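
    A hedged sketch of the hybrid control structure described above: a textbook PID controller provides a stable fallback while a learned policy is blended in. The class name, gains, and the blending schedule are assumptions for illustration, not the authors' implementation.

        class PID:
            """Textbook proportional-integral-derivative controller."""
            def __init__(self, kp, ki, kd):
                self.kp, self.ki, self.kd = kp, ki, kd
                self.integral, self.prev_err = 0.0, 0.0

            def control(self, err, dt):
                self.integral += err * dt
                deriv = (err - self.prev_err) / dt
                self.prev_err = err
                return self.kp * err + self.ki * self.integral + self.kd * deriv

        def blended_action(policy_action, pid_action, alpha):
            # alpha can be annealed from 0 (pure PID, safe early training)
            # toward 1 (pure learned policy) as the policy improves.
            return alpha * policy_action + (1.0 - alpha) * pid_action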

    Forward Signal Propagation Learning

    Full text link
    We propose a new learning algorithm for propagating a learning signal and updating neural network parameters via a forward pass, as an alternative to backpropagation. In forward signal propagation learning (sigprop), there is only the forward path for learning and inference, so there are no additional structural or computational constraints on learning, such as feedback connectivity, weight transport, or a backward pass, which exist under backpropagation. Sigprop enables global supervised learning with only a forward path. This is ideal for parallel training of layers or modules. In biology, this explains how neurons without feedback connections can still receive a global learning signal. In hardware, this provides an approach for global supervised learning without backward connectivity. Sigprop by design has better compatibility with models of learning in the brain and in hardware than backpropagation and alternative approaches to relaxing learning constraints. We also demonstrate that sigprop is more efficient in time and memory than these alternatives. To further explain the behavior of sigprop, we provide evidence that sigprop provides useful learning signals when compared with backpropagation. To further support relevance to biological and hardware learning, we use sigprop to train continuous time neural networks with Hebbian updates and train spiking neural networks without surrogate functions.
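
    One way to picture a forward-only learning signal is to feed a label embedding through the same forward weights as the input and train each layer with a purely local mismatch loss. The sketch below is an assumption-laden caricature of that idea (single layer, tanh units, squared-error local loss), not sigprop's actual objective.

        import numpy as np

        rng = np.random.default_rng(0)
        W = rng.normal(scale=0.1, size=(16, 8))   # one layer's forward weights
        x = rng.normal(size=8)                    # input example
        t = rng.normal(size=8)                    # label embedding, same shape as x

        lr = 0.01
        for _ in range(100):
            hx, ht = np.tanh(W @ x), np.tanh(W @ t)     # both travel the forward path
            err = hx - ht                               # local mismatch, no backward pass
            W -= lr * np.outer(err * (1.0 - hx**2), x)  # pull x's code toward t's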

    Input-driven unsupervised learning in recurrent neural networks

    Get PDF
    Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is an attractor neural network with Hebbian learning (e.g. the Hopfield model). The model's simplicity and the locality of the synaptic update rules come at the cost of a limited storage capacity, compared with the capacity achieved with supervised learning algorithms, whose biological plausibility is questionable. Here, we present an on-line learning rule for a recurrent neural network that achieves near-optimal performance without an explicit supervisory error signal and using only locally accessible information, and which is therefore biologically plausible. The fully connected network consists of excitatory units with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the patterns to be memorized are presented on-line as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs ('local fields'). Synapses corresponding to active inputs are modified as a function of the position of the local field with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. An additional parameter of the model allows trading storage capacity for robustness, i.e. an increased size of the basins of attraction. We simulated a network of 1001 excitatory neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction: our results show that, for any given basin size, our network more than doubles the storage capacity, compared with a standard Hopfield network. Our learning rule is consistent with available experimental data documenting how plasticity depends on firing rate. It predicts that at high enough firing rates, no potentiation should occur.
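
    The three-threshold rule is concrete enough to transcribe almost directly. In this hedged sketch the function signature, the boolean mask of active inputs, and the step size gamma are illustrative assumptions; only the case structure follows the abstract.

        import numpy as np

        def update_synapses(w, active, local_field,
                            theta_low, theta_mid, theta_high, gamma=0.01):
            """w: synaptic weights; active: boolean mask of active inputs."""
            # Above the highest and below the lowest threshold: no plasticity.
            if theta_low < local_field < theta_high:
                # In between: potentiate if the local field exceeds the
                # intermediate threshold, depress otherwise.
                sign = 1.0 if local_field > theta_mid else -1.0
                w[active] += sign * gamma
            return w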

    Learning in large-scale spiking neural networks

    Get PDF
    Learning is central to the exploration of intelligence. Psychology and machine learning provide high-level explanations of how rational agents learn. Neuroscience provides low-level descriptions of how the brain changes as a result of learning. This thesis attempts to bridge the gap between these two levels of description by solving problems using machine learning ideas, implemented in biologically plausible spiking neural networks with experimentally supported learning rules. We present three novel neural models that contribute to the understanding of how the brain might solve the three main problems posed by machine learning: supervised learning, in which the rational agent has a fine-grained feedback signal; reinforcement learning, in which the agent gets sparse feedback; and unsupervised learning, in which the agent has no explicit environmental feedback. In supervised learning, we argue that previous models of supervised learning in spiking neural networks solve a problem that is less general than the supervised learning problem posed by machine learning. We use an existing learning rule to solve the general supervised learning problem with a spiking neural network. We show that the learning rule can be mapped onto the well-known backpropagation rule used in artificial neural networks. In reinforcement learning, we augment an existing model of the basal ganglia to implement a simple actor-critic model that has a direct mapping to brain areas. The model is used to recreate behavioural and neural results from an experimental study of rats performing a simple reinforcement learning task. In unsupervised learning, we show that the BCM rule, a common learning rule used in unsupervised learning with rate-based neurons, can be adapted to a spiking neural network. We recreate the effects of STDP, a learning rule with strict time dependencies, using BCM, which does not explicitly remember the times of previous spikes. The simulations suggest that BCM is a more general rule than STDP. Finally, we propose a novel learning rule that can be used in all three of these simulations. The existence of such a rule suggests that the three types of learning examined separately in machine learning may not be implemented with separate processes in the brain.
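
    For reference, the rate-based BCM rule mentioned above has a compact standard form: the weight change is Hebbian when the postsynaptic rate exceeds a sliding threshold and anti-Hebbian below it. A minimal sketch, with illustrative constants and time scales:

        import numpy as np

        def bcm_step(w, x, theta, lr=1e-3, tau_theta=100.0, dt=1.0):
            y = float(w @ x)                            # postsynaptic firing rate
            w += lr * y * (y - theta) * x               # potentiate if y > theta, else depress
            theta += (dt / tau_theta) * (y**2 - theta)  # sliding threshold tracks E[y^2]
            return w, theta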

    Activation Learning by Local Competitions

    Full text link
    Despite its great success, backpropagation has certain limitations that necessitate the investigation of new learning methods. In this study, we present a biologically plausible local learning rule that improves upon Hebb's well-known proposal and discovers unsupervised features by local competitions among neurons. This simple learning rule enables the creation of a forward learning paradigm called activation learning, in which the output activation (sum of the squared output) of the neural network estimates the likelihood of the input patterns, or "learn more, activate more" in simpler terms. For classification on a few small classical datasets, activation learning performs comparably to backpropagation using a fully connected network, and outperforms backpropagation when there are fewer training samples or unpredictable disturbances. Additionally, the same trained network can be used for a variety of tasks, including image generation and completion. Activation learning also achieves state-of-the-art performance on several real-world datasets for anomaly detection. This new learning paradigm, which has the potential to unify supervised, unsupervised, and semi-supervised learning and is reasonably more resistant to adversarial attacks, deserves in-depth investigation.
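
    The "learn more, activate more" principle suggests a simple inference rule: score each candidate label by the output activation (sum of squared outputs) and pick the largest. The sketch below assumes one plausible realization in which a one-hot label is appended to the input; net, the encoding, and the scoring loop are illustrative, not the paper's code.

        import numpy as np

        def activation(net, x, label_onehot):
            h = net(np.concatenate([x, label_onehot]))
            return float(np.sum(h**2))          # "activate more" = higher likelihood

        def classify(net, x, num_classes):
            scores = [activation(net, x, np.eye(num_classes)[c])
                      for c in range(num_classes)]
            return int(np.argmax(scores))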