35 research outputs found

    NengoFPGA: an FPGA Backend for the Nengo Neural Simulator

    Low-power, high-speed neural networks are critical for deployable embedded AI applications at the edge. We describe a Xilinx FPGA implementation of Neural Engineering Framework (NEF) networks with online learning that outperforms mobile Nvidia GPU implementations by an order of magnitude or more. Specifically, we provide an embedded Python-capable PYNQ FPGA implementation, supported by a Xilinx Vivado High-Level Synthesis (HLS) workflow, that allows sub-millisecond execution of adaptive neural networks with low-latency, direct I/O access to the physical world. The outcome of this work is NengoFPGA, a seamless and user-friendly extension to the neural compiler Python package Nengo. To reduce memory requirements and improve performance, we tune the precision of the intermediate variables in the code to achieve competitive absolute accuracy against slower and larger floating-point reference designs. The online learning component of the neural network exploits immediate feedback to adjust the network weights to best support a given arithmetic precision. Because the space of possible design configurations of such quantized networks is vast and is subject to a target accuracy constraint, we use the Hyperopt hyper-parameter tuning tool instead of manual search to find Pareto-optimal designs. In practice, we generate the optimized designs in under 500 short iterations of Vivado HLS C synthesis before running the complete Vivado place-and-route phase on that subset, a much longer process not conducive to rapid exploration. For neural network populations of 64–4096 neurons and 1–8 representational dimensions, our optimized FPGA implementation generated by Hyperopt achieves a speedup of 10–484× over a competing cuBLAS implementation on the Jetson TX1 GPU while using 2.4–9.5× less power. Our speedups result from HLS-specific reformulation (15× improvement), precision adaptation (3× improvement), and low-latency direct I/O access (1000× improvement).
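    To make the search concrete, the sketch below shows how such a constrained design-space exploration can be driven with Hyperopt's standard API. It is an illustration only, not NengoFPGA's tuning code: run_hls_synthesis is a hypothetical stand-in for a Vivado HLS C-synthesis wrapper, and the bit-width ranges and error budget are assumptions.

        from hyperopt import fmin, tpe, hp, STATUS_OK

        TARGET_ERROR = 0.05  # assumed absolute-accuracy budget, not the paper's value

        def run_hls_synthesis(weight_bits, accum_bits):
            # Hypothetical stand-in for a short Vivado HLS C-synthesis run;
            # toy cost/accuracy models keep the sketch runnable end to end.
            latency = weight_bits * accum_bits
            error = 1.0 / (weight_bits + accum_bits)
            return latency, error

        def objective(params):
            latency, error = run_hls_synthesis(
                weight_bits=int(params["weight_bits"]),
                accum_bits=int(params["accum_bits"]),
            )
            # Penalize candidate designs that violate the accuracy constraint.
            loss = latency if error <= TARGET_ERROR else latency + 1e6
            return {"loss": loss, "status": STATUS_OK}

        space = {
            "weight_bits": hp.quniform("weight_bits", 4, 18, 1),
            "accum_bits": hp.quniform("accum_bits", 8, 32, 1),
        }

        # Up to 500 short C-synthesis evaluations, as in the abstract, before
        # committing only the surviving designs to full place-and-route.
        best = fmin(objective, space, algo=tpe.suggest, max_evals=500)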

    Embodied neuromorphic intelligence

    The design of robots that interact autonomously with the environment and exhibit complex behaviours is an open challenge that can benefit from understanding what makes living beings fit to act in the world. Neuromorphic engineering studies neural computational principles to develop technologies that can provide a computing substrate for building compact and low-power processing systems. We discuss why endowing robots with neuromorphic technologies – from perception to motor control – represents a promising approach for the creation of robots that can seamlessly integrate into society. We present initial attempts in this direction, highlight open challenges, and propose actions required to overcome current limitations.

    Parallel computing for brain simulation

    Background: The human brain is the most complex system in the known universe and therefore one of the greatest mysteries. It provides human beings with extraordinary abilities; however, it is still not understood how and why most of these abilities are produced. Aims: For decades, researchers have been trying to make computers reproduce these abilities, focusing both on understanding the nervous system and on processing data more efficiently than before. Their aim is to make computers process information similarly to the brain. Important technological developments and vast multidisciplinary projects have enabled the first simulations with a number of neurons comparable to that of a human brain. Conclusion: This paper presents an up-to-date review of the main research projects that are trying to simulate and/or emulate the human brain. They employ different types of computational models using parallel computing: digital models, analog models, and hybrid models. This review includes the current applications of these works as well as future trends. It focuses on works that pursue progress in Neuroscience as well as others that seek new discoveries in Computer Science (neuromorphic hardware, machine learning techniques). Their most outstanding characteristics are summarized, and the latest advances and future plans are presented. In addition, this review points out the importance of considering not only neurons: computational models of the brain should also include glial cells, given the proven importance of astrocytes in information processing. Funding: Galicia, Consellería de Cultura, Educación e Ordenación Universitaria (GRC2014/049, R2014/039); Instituto de Salud Carlos III (PI13/0028).

    Dynamical Systems in Spiking Neuromorphic Hardware

    Dynamical systems are universal computers. They can perceive stimuli, remember, learn from feedback, plan sequences of actions, and coordinate complex behavioural responses. The Neural Engineering Framework (NEF) provides a general recipe to formulate models of such systems as coupled sets of nonlinear differential equations and compile them onto recurrently connected spiking neural networks – akin to a programming language for spiking models of computation. The Nengo software ecosystem supports the NEF and compiles such models onto neuromorphic hardware. In this thesis, we analyze the theory driving the success of the NEF and expose several core principles underpinning its correctness, scalability, completeness, robustness, and extensibility. We also derive novel theoretical extensions to the framework that enable it to far more effectively leverage a wide variety of dynamics in digital hardware and to exploit the device-level physics in analog hardware. At the same time, we propose a novel set of spiking algorithms that recruit an optimal nonlinear encoding of time, which we call the Delay Network (DN). Backpropagation across stacked layers of DNs dramatically outperforms stacked Long Short-Term Memory (LSTM) networks – a state-of-the-art deep recurrent architecture – in accuracy and training time on a continuous-time memory task and a chaotic time-series prediction benchmark. The basic component of this network is shown to function on state-of-the-art spiking neuromorphic hardware, including Braindrop and Loihi. This implementation approaches the energy efficiency of the human brain in the former case and the precision of conventional computation in the latter.
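    To illustrate the "programming language" view of the NEF, the following minimal Nengo sketch (our example, using Nengo's standard API rather than code from the thesis) compiles the one-dimensional integrator dx/dt = u onto a recurrently connected spiking ensemble:

        import nengo

        tau = 0.1  # synaptic time constant used by the NEF mapping

        with nengo.Network() as model:
            u = nengo.Node(lambda t: 1.0 if t < 0.5 else 0.0)  # step input
            x = nengo.Ensemble(n_neurons=200, dimensions=1)    # spiking state variable

            # NEF recipe for dx/dt = u with a first-order lowpass synapse:
            # feed back the identity and scale the input by tau.
            nengo.Connection(x, x, synapse=tau)
            nengo.Connection(u, x, transform=tau, synapse=tau)

            x_probe = nengo.Probe(x, synapse=0.01)

        with nengo.Simulator(model) as sim:
            sim.run(1.0)  # decoded x ramps to roughly 0.5, then holds it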

    Algorithm Hardware Codesign for High Performance Neuromorphic Computing

    Driven by the massive deployment of the Internet of Things (IoT), embedded systems, Cyber-Physical Systems (CPS), and similar platforms, there is an increasing demand to apply machine intelligence in power-limited scenarios. Though deep learning has achieved impressive performance on various realistic and practical tasks such as anomaly detection, pattern recognition, and machine vision, the ever-increasing computational complexity and model size of Deep Neural Networks (DNNs) make them challenging to deploy in such scenarios, where computation, memory, and energy resources are all limited. Early studies show that the energy efficiency of biological systems can be orders of magnitude higher than that of digital systems. Hence, taking inspiration from biological systems, neuromorphic computing and Spiking Neural Networks (SNNs) have drawn attention as alternative solutions for energy-efficient machine intelligence. Though believed promising, neuromorphic computing is rarely used for real-world applications. A major problem is that the performance of SNNs is limited compared with DNNs due to the lack of efficient training algorithms. In an SNN, a neuron's output is a spike, represented mathematically by a Dirac delta function. Because spikes are non-differentiable, gradient descent cannot be used directly to train an SNN, so algorithm-level innovation is needed. In addition, as neuromorphic computing is an emerging paradigm, hardware- and architecture-level innovation is also required to support new algorithms and to explore its potential. In this work, we present a comprehensive algorithm-hardware codesign for neuromorphic computing. On the algorithm side, we address the training difficulty: we first derive a flexible SNN model that retains critical neural dynamics, and then develop an algorithm to train SNNs to learn temporal patterns. We apply the proposed algorithm to multivariate time-series classification tasks to demonstrate its advantages. On the hardware side, we develop a systematic FPGA solution optimized for the proposed SNN model to enable high-performance inference. We also explore emerging devices and propose a memristor-based neuromorphic design, with neuron and synapse circuits that replicate important neural dynamics such as the filtering effect and adaptive thresholds.
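    The training difficulty described above is often sidestepped with a surrogate gradient: the Dirac-delta spike is kept in the forward pass, while a smooth function stands in for its derivative during backpropagation. The PyTorch sketch below illustrates this generic idea; it is our illustration of the technique, not the dissertation's specific algorithm:

        import torch

        class SpikeFn(torch.autograd.Function):
            """Heaviside spike forward, smooth surrogate gradient backward."""

            @staticmethod
            def forward(ctx, v):
                ctx.save_for_backward(v)
                return (v > 0).float()  # spike when membrane potential crosses threshold

            @staticmethod
            def backward(ctx, grad_output):
                (v,) = ctx.saved_tensors
                # Fast-sigmoid surrogate: a smooth, finite stand-in for the Dirac delta.
                return grad_output / (1.0 + 10.0 * v.abs()) ** 2

        spike = SpikeFn.apply  # drop-in activation for a differentiable SNN forward pass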

    The hippocampal formation from a machine learning perspective

    Nowadays, sensor devices are able to generate huge amounts of data in short periods of time. In many situations, the data collected by different sensors reflects a specific phenomenon but is presented in very different types and formats; in these cases, it is hard to determine how the distinct types of data are related to each other or whether they reflect a certain condition. In this context, it would be of great importance to develop a system capable of analysing such data in the smallest amount of time to produce valid information. The animal brain is a biological organ capable of doing something similar with the information obtained by the senses. Inside the brain there is a structure called the Hippocampus, situated in the Temporal Lobe. Its main function is to analyse the sensory data encoded by the Entorhinal Cortex in order to form new memories. Since the Hippocampus has evolved over thousands of years to perform these tasks, it is important to understand how it works and, if possible, to model it, i.e. to define a set of computer algorithms that approximates it. Ever since the removal of the Hippocampus from a patient suffering from seizures, it has been clear that without it, it is not possible to memorize places or events that happened at a specific time; the scientific community thus regards the Hippocampus as crucial for memory formation and spatial navigation. This functionality is achieved with the help of a set of cells called Grid Cells, located in the Entorhinal Cortex area, together with Place Cells, Head Direction Cells, and Boundary Vector Cells. The combined information analysed by those cells allows the unique identification of places or events. The main objective of the work developed in this Thesis consists in describing the main biological mechanisms of the Hippocampus and in defining computational models that can simulate all, or at least the most critical, functions of both the Hippocampus and the Entorhinal Cortex.
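    As a flavour of what such computational models look like, a widely used idealization from the literature describes a Grid Cell's firing rate as a sum of three plane waves 60 degrees apart, which produces a hexagonal firing map. The sketch below implements that textbook model as an illustration; it is not taken from the Thesis itself:

        import numpy as np

        def grid_cell_rate(x, y, spacing=0.5, orientation=0.0, phase=(0.0, 0.0)):
            """Idealized hexagonal grid-cell rate map: a sum of three cosines."""
            k = 4 * np.pi / (np.sqrt(3) * spacing)  # wave number for the grid spacing
            rate = sum(
                np.cos(k * ((x - phase[0]) * np.cos(orientation + i * np.pi / 3)
                            + (y - phase[1]) * np.sin(orientation + i * np.pi / 3)))
                for i in range(3)
            )
            return (rate + 1.5) / 4.5  # rescale from [-1.5, 3] into [0, 1]

        # Evaluate the rate map over a 1 m x 1 m arena.
        xs, ys = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
        rates = grid_cell_rate(xs, ys)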

    Reservoir based spiking models for univariate Time Series Classification

    A variety of advanced machine learning and deep learning algorithms achieve state-of-the-art performance on various temporal processing tasks. However, these methods are highly energy-inefficient: they run mainly on power-hungry CPUs and GPUs. Computing with spiking networks, on the other hand, has been shown to be energy-efficient on specialized neuromorphic hardware, e.g., Loihi, TrueNorth, and SpiNNaker. In this work, we present two architectures of spiking models, inspired by the theory of Reservoir Computing and Legendre Memory Units, for the Time Series Classification (TSC) task. Our first spiking architecture is closer to the general Reservoir Computing architecture, and we successfully deploy it on Loihi; the second differs from the first by including a non-linearity in the readout layer. Our second model (trained with the Surrogate Gradient Descent method) shows that non-linear decoding of the linearly extracted temporal features through spiking neurons not only achieves promising results but also offers low computational overhead by significantly reducing the number of neurons compared to popular Liquid State Machine (LSM) based models: more than a 40× reduction with respect to the recent spiking model we compare against. We experiment on five TSC datasets and achieve new state-of-the-art (SoTA) spiking results (as much as a 28.607% accuracy improvement on one of the datasets), thereby showing the potential of our models to address TSC tasks in a green, energy-efficient manner. In addition, we perform energy profiling and comparison on Loihi and a CPU to support our claims.
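    The linear temporal-feature extraction that both architectures build on is typically paired with a readout trained in closed form by ridge regression. The sketch below illustrates that generic Reservoir Computing readout (our illustration with assumed array shapes, not the paper's code):

        import numpy as np

        def train_linear_readout(states, labels, n_classes, ridge=1e-3):
            """Fit readout weights W so that states @ W approximates one-hot labels.

            states: (n_samples, n_features) reservoir features, one row per series
            labels: (n_samples,) integer class labels
            """
            targets = np.eye(n_classes)[labels]  # one-hot encode the class labels
            gram = states.T @ states + ridge * np.eye(states.shape[1])
            return np.linalg.solve(gram, states.T @ targets)

        def classify(states, W):
            return np.argmax(states @ W, axis=1)  # predicted class per series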