    Effective and Efficient Computation with Multiple-timescale Spiking Recurrent Neural Networks

    The emergence of brain-inspired neuromorphic computing as a paradigm for edge AI is motivating the search for high-performance and efficient spiking neural networks to run on this hardware. However, compared to classical neural networks in deep learning, current spiking neural networks lack competitive performance in compelling areas. Here, for sequential and streaming tasks, we demonstrate how a novel type of adaptive spiking recurrent neural network (SRNN) achieves state-of-the-art performance among spiking neural networks and approaches or exceeds the performance of classical recurrent neural networks (RNNs) while exhibiting sparse activity. From this, we calculate an energy improvement of more than 100x for our SRNNs over classical RNNs on the harder tasks. To achieve this, we model standard and adaptive multiple-timescale spiking neurons as self-recurrent neural units, and leverage surrogate gradients and auto-differentiation in the PyTorch deep learning framework to efficiently implement backpropagation-through-time, including learning of the important spiking neuron parameters to adapt our spiking neurons to the tasks. (Comment: 11 pages, 5 figures.)
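    The abstract describes modeling spiking neurons as self-recurrent units trained with surrogate gradients in PyTorch. Below is a minimal sketch of what one time step of such an adaptive neuron could look like; the fast-sigmoid surrogate and the threshold-adaptation form are illustrative assumptions, not the authors' implementation:

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, fast-sigmoid surrogate in the backward."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Surrogate derivative 1 / (1 + |v|)^2 lets gradients pass the spike.
        return grad_output / (1.0 + v.abs()) ** 2

def adaptive_lif_step(x, v, b, tau_m, tau_adapt, theta=1.0, beta=1.8):
    """One step of an adaptive leaky integrate-and-fire neuron; tau_m and
    tau_adapt are learnable per-neuron time-constant tensors (hypothetical form)."""
    alpha = torch.exp(-1.0 / tau_m)      # membrane decay per step
    rho = torch.exp(-1.0 / tau_adapt)    # threshold-adaptation decay
    v = alpha * v + x                    # leaky integration of input current
    thr = theta + beta * b               # adaptive firing threshold
    s = SurrogateSpike.apply(v - thr)    # spike if membrane exceeds threshold
    b = rho * b + (1.0 - rho) * s        # adaptation variable tracks spiking
    v = v - s * thr                      # soft reset of the membrane
    return s, v, b
```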

    Accurate and efficient time-domain classification with adaptive spiking recurrent neural networks

    Inspired by detailed modelling of biological neurons, spiking neural networks (SNNs) are investigated as biologically plausible and high-performance models of neural computation. The sparse and binary communication between spiking neurons potentially enables powerful and energy-efficient neural networks. The performance of SNNs, however, has remained lacking compared with artificial neural networks. Here we demonstrate how an activity-regularizing surrogate gradient combined with recurrent networks of tunable and adaptive spiking neurons yields the state of the art for SNNs on challenging benchmarks in the time domain, such as speech and gesture recognition. This also exceeds the performance of standard classical recurrent neural networks and approaches that of the best modern artificial neural networks. As these SNNs exhibit sparse spiking, we show that they are theoretically one to three orders of magnitude more computationally efficient than recurrent neural networks with similar performance. Together, this positions SNNs as an attractive solution for AI hardware implementations.
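    The "activity-regularizing surrogate gradient" suggests a task loss augmented with a penalty on spiking activity. One plausible, hedged form is sketched below; the target rate and weighting are assumptions, not values from the paper:

```python
import torch

def activity_regularized_loss(task_loss, spikes, target_rate=0.02, weight=1.0):
    """Penalize deviation of mean firing rates from a sparse target rate.
    `spikes` has shape (time, batch, neurons); all constants are illustrative."""
    rate = spikes.mean(dim=0)                 # per-sample, per-neuron firing rate
    reg = ((rate - target_rate) ** 2).mean()  # squared deviation from the target
    return task_loss + weight * reg
```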

    Local minimization of prediction errors drives learning of invariant object representations in a generative network model of visual perception

    The ventral visual processing hierarchy of the cortex needs to fulfill at least two key functions: perceived objects must be mapped to high-level representations invariantly of the precise viewing conditions, and a generative model must be learned that allows, for instance, to fill in occluded information guided by visual experience. Here, we show how a multilayered predictive coding network can learn to recognize objects from the bottom up and to generate specific representations via a top-down pathway through a single learning rule: the local minimization of prediction errors. Trained on sequences of continuously transformed objects, neurons in the highest network area become tuned to object identity invariant of precise position, comparable to inferotemporal neurons in macaques. Drawing on this, the dynamic properties of invariant object representations reproduce experimentally observed hierarchies of timescales from low to high levels of the ventral processing stream. The predicted faster decorrelation of error-neuron activity compared to representation neurons is of relevance for the experimental search for neural correlates of prediction errors. Lastly, the generative capacity of the network is confirmed by reconstructing specific object images, robust to partial occlusion of the inputs. By learning invariance from temporal continuity within a generative model, the approach generalizes the predictive coding framework to dynamic inputs in a more biologically plausible way than self-supervised networks with non-local error-backpropagation. This was achieved simply by shifting the training paradigm to dynamic inputs, with little change in architecture and learning rule from static input-reconstructing Hebbian predictive coding networks.
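    Since the whole model trains by locally minimizing prediction errors, the core update between two adjacent hierarchy levels can be sketched compactly. The rate-based rule below assumes linear top-down predictions and is an illustrative form, not the paper's exact equations:

```python
import numpy as np

def predictive_coding_step(r_low, r_high, W, lr_r=0.1, lr_w=0.01):
    """One inference/learning step between two hierarchy levels: the higher
    level predicts the lower one through W, and both the representation and
    the weights are updated using only the local prediction error."""
    prediction = W @ r_high                  # top-down prediction of the lower level
    error = r_low - prediction               # local prediction error (error neurons)
    r_high = r_high + lr_r * (W.T @ error)   # inference: adjust the representation
    W = W + lr_w * np.outer(error, r_high)   # Hebbian-like local weight update
    return r_high, W, error
```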

    Markov chain generative adversarial neural networks for solving Bayesian inverse problems in physics applications

    In the context of solving inverse problems for physics applications within a Bayesian framework, we present a new approach, the Markov Chain Generative Adversarial Neural Network (MCGAN), to alleviate the computational costs associated with solving the Bayesian inference problem. GANs provide a very suitable framework to aid in the solution of Bayesian inference problems, as they are designed to generate samples from complicated high-dimensional distributions. By training a GAN to sample from a low-dimensional latent space and then embedding it in a Markov Chain Monte Carlo method, we can sample from the posterior highly efficiently, replacing both the high-dimensional prior and the expensive forward map. This comes at the cost of a potentially expensive offline stage in which training data must be simulated or gathered and the GAN has to be trained. We prove that the proposed methodology converges to the true posterior in the Wasserstein-1 distance and that sampling from the latent space is equivalent to sampling in the high-dimensional space in a weak sense. The method is showcased in two test cases where we perform both state and parameter estimation simultaneously, and it is compared with two conventional approaches, polynomial chaos expansion and ensemble Kalman filter, and a deep learning-based approach, deep Bayesian inversion. The method is shown to be more accurate than alternative approaches while also being computationally faster in multiple test cases, including the important engineering setting of detecting leaks in pipelines.
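    The key computational move is running Markov chain Monte Carlo in the GAN's low-dimensional latent space instead of the original high-dimensional space. A hedged sketch of that idea with random-walk Metropolis and a standard normal latent prior follows; the function names are placeholders:

```python
import numpy as np

def latent_space_metropolis(generator, log_likelihood, latent_dim,
                            n_steps=10_000, step_size=0.1):
    """Sample the posterior by proposing moves in the GAN latent space;
    `generator` maps z to a full state, `log_likelihood` scores it against data."""
    z = np.zeros(latent_dim)
    log_post = log_likelihood(generator(z)) - 0.5 * z @ z  # likelihood + N(0, I) prior
    samples = []
    for _ in range(n_steps):
        z_new = z + step_size * np.random.randn(latent_dim)
        log_post_new = log_likelihood(generator(z_new)) - 0.5 * z_new @ z_new
        if np.log(np.random.rand()) < log_post_new - log_post:  # Metropolis accept
            z, log_post = z_new, log_post_new
        samples.append(generator(z))  # push-forward samples approximate the posterior
    return np.asarray(samples)
```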

    A Probabilistic Digital Twin for Leak Localization in Water Distribution Networks Using Generative Deep Learning

    Localizing leakages in large water distribution systems is an important and ever-present problem. Due to the complexity originating from water pipeline networks, too few sensors, and noisy measurements, this is a highly challenging problem to solve. In this work, we present a methodology based on generative deep learning and Bayesian inference for leak localization with uncertainty quantification. A generative model, utilizing deep neural networks, serves as a probabilistic surrogate model that replaces the full equations, while at the same time also incorporating the uncertainty inherent in such models. By embedding this surrogate model into a Bayesian inference scheme, leaks are located by combining sensor observations with a model output approximating the true posterior distribution for possible leak locations. We show that our methodology produces fast, accurate, and trustworthy results, performing convincingly on three problems of increasing complexity. For a simple test case, the Hanoi network, the average topological distance (ATD) between the predicted and true leak location ranged from 0.3 to 3 with a varying number of sensors and level of measurement noise. For two more complex test cases, the ATD ranged from 0.75 to 4 and from 1.5 to 10, respectively. Furthermore, accuracies upwards of 83%, 72%, and 42% were achieved for the three test cases, respectively. The computation times ranged from 0.1 to 13 s, depending on the size of the neural network employed. This work serves as an example of a digital twin for a sophisticated application of advanced mathematical and deep learning techniques in the area of leak detection.
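    At its core, localization combines sensor observations with surrogate predictions via Bayes' rule. A simplified sketch over a discrete set of candidate leak locations with a Gaussian noise model is shown below; the surrogate interface and noise level are assumptions, and the paper's surrogate is itself probabilistic rather than deterministic:

```python
import numpy as np

def leak_location_posterior(surrogate, y_obs, n_locations, noise_std=0.01):
    """Posterior over candidate leak locations given sensor readings y_obs;
    `surrogate(k)` returns predicted sensor values for a leak at location k."""
    log_post = np.empty(n_locations)
    for k in range(n_locations):
        residual = y_obs - surrogate(k)
        log_post[k] = -0.5 * residual @ residual / noise_std**2  # Gaussian log-likelihood
    log_post -= log_post.max()   # stabilize the exponentials
    post = np.exp(log_post)      # uniform prior over candidate locations
    return post / post.sum()
```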

    Attention-gated brain propagation: How the brain can implement reward-based error backpropagation

    Much recent work has focused on biologically plausible variants of supervised learning algorithms. However, there is no teacher in the motor cortex that instructs the motor neurons, and learning in the brain depends on reward and punishment. We demonstrate a biologically plausible reinforcement learning scheme for deep networks with an arbitrary number of layers. The network chooses an action by selecting a unit in the output layer and uses feedback connections to assign credit to the units in successively lower layers that are responsible for this action. After the choice, the network receives reinforcement, and there is no teacher correcting the errors. We show how the new learning scheme, Attention-Gated Brain Propagation (BrainProp), is mathematically equivalent to error backpropagation, for one output unit at a time. We demonstrate successful learning of deep fully connected, convolutional, and locally connected networks on classical and hard image-classification benchmarks: MNIST, CIFAR10, CIFAR100, and Tiny ImageNet. BrainProp achieves an accuracy that is equivalent to that of standard error-backpropagation, and better than state-of-the-art biologically inspired learning schemes. Additionally, the trial-and-error nature of learning incurs only limited additional training time, making BrainProp a factor of 1 to 3.5 times slower. Our results thereby provide new insights into how deep learning may be implemented in the brain.
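    The described scheme selects a single output unit, receives a scalar reinforcement, and gates credit assignment through that unit's feedback weights. A heavily simplified two-layer sketch of this pattern follows; the activation function, selection rule, and error signal are assumptions, not the paper's exact algorithm:

```python
import numpy as np

def brainprop_like_trial(x, W1, W2, reward_for_action, lr=0.01):
    """One trial: choose an output unit stochastically, observe a reward,
    and update only along feedback from the chosen unit."""
    h = np.tanh(W1 @ x)                    # hidden layer activity
    q = W2 @ h                             # output unit activities
    p = np.exp(q - q.max()); p /= p.sum()  # softmax action selection
    a = np.random.choice(len(q), p=p)      # the chosen output unit
    delta = reward_for_action(a) - q[a]    # reward prediction error
    dh = delta * W2[a] * (1.0 - h ** 2)    # credit gated through unit a only
    W2[a] += lr * delta * h                # update the chosen unit's weights
    W1 += lr * np.outer(dh, x)             # local update at the hidden layer
    return a, delta
```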

    Mechanisms of human dynamic object recognition revealed by sequential deep neural networks

    Humans can quickly recognize objects in a dynamically changing world. This ability is showcased by the fact that observers succeed at recognizing objects in rapidly changing image sequences, at up to 13 ms/image. To date, the mechanisms that govern dynamic object recognition remain poorly understood. Here, we developed deep learning models for dynamic recognition and compared different computational mechanisms, contrasting feedforward and recurrent, single-image and sequential processing as well as different forms of adaptation. We found that only models that integrate images sequentially via lateral recurrence mirrored human performance (N = 36) and were predictive of trial-by-trial responses across image durations (13–80 ms/image). Importantly, models with sequential lateral-recurrent integration also captured how human performance changes as a function of image presentation durations, with models processing images for a few time steps capturing human object recognition at shorter presentation durations and models processing images for more time steps capturing human object recognition at longer presentation durations. Furthermore, augmenting such a recurrent model with adaptation markedly improved dynamic recognition performance and accelerated its representational dynamics, thereby predicting human trial-by-trial responses using fewer processing resources. Together, these findings provide new insights into the mechanisms rendering object recognition so fast and effective in a dynamic visual world.
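    The winning models integrate each image of the sequence through lateral recurrence, optionally with adaptation. A hedged sketch of that integration loop is given below; the encoder, lateral weights, and adaptation form are placeholders rather than the paper's architecture:

```python
import torch

def lateral_recurrent_integration(frames, encoder, W_lat, adapt_rate=0.1):
    """Process a rapid image sequence one frame at a time, combining the
    feedforward drive with lateral recurrence and a simple adaptation term."""
    h = a = None
    for x in frames:                               # e.g. one frame per 13-80 ms step
        drive = encoder(x)                         # feedforward features for this frame
        if h is None:
            h = torch.relu(drive)
            a = torch.zeros_like(h)
        else:
            h = torch.relu(drive + h @ W_lat - a)  # lateral integration minus adaptation
        a = a + adapt_rate * (h - a)               # adaptation tracks recent activity
    return h                                       # final state is read out for recognition
```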