591 research outputs found

    Autonomous Probabilistic Coprocessing with Petaflips per Second

    Full text link
    In this paper we present a concrete design for a probabilistic (p-) computer based on a network of p-bits, robust classical entities fluctuating between -1 and +1, with probabilities that are controlled through an input constructed from the outputs of other p-bits. The architecture of this probabilistic computer is similar to a stochastic neural network with the p-bit playing the role of a binary stochastic neuron, but with one key difference: there is no sequencer used to enforce an ordering of p-bit updates, as is typically required. Instead, we explore \textit{sequencerless} designs where all p-bits are allowed to flip autonomously and demonstrate that such designs can allow ultrafast operation unconstrained by available clock speeds without compromising the solution's fidelity. Based on experimental results from a hardware benchmark of the autonomous design and benchmarked device models, we project that a nanomagnetic implementation can scale to achieve petaflips per second with millions of neurons. A key contribution of this paper is the focus on a hardware metric −- flips per second −- as a problem and substrate-independent figure-of-merit for an emerging class of hardware annealers known as Ising Machines. Much like the shrinking feature sizes of transistors that have continually driven Moore's Law, we believe that flips per second can be continually improved in later technology generations of a wide class of probabilistic, domain specific hardware.Comment: 13 pages, 8 figures, 1 tabl

    Brain-inspired nanophotonic spike computing:challenges and prospects

    Get PDF
    Nanophotonic spiking neural networks (SNNs) based on neuron-like excitable subwavelength (submicrometre) devices are of key importance for realizing brain-inspired, power-efficient artificial intelligence (AI) systems with high degree of parallelism and energy efficiency. Despite significant advances in neuromorphic photonics, compact and efficient nanophotonic elements for spiking signal emission and detection, as required for spike-based computation, remain largely unexplored. In this invited perspective, we outline the main challenges, early achievements, and opportunities toward a key-enabling photonic neuro-architecture using III-V/Si integrated spiking nodes based on nanoscale resonant tunnelling diodes (nanoRTDs) with folded negative differential resistance. We utilize nanoRTDs as nonlinear artificial neurons capable of spiking at high-speeds. We discuss the prospects for monolithic integration of nanoRTDs with nanoscale light-emitting diodes and nanolaser diodes, and nanophotodetectors to realize neuron emitter and receiver spiking nodes, respectively. Such layout would have a small footprint, fast operation, and low power consumption, all key requirements for efficient nano-optoelectronic spiking operation. We discuss how silicon photonics interconnects, integrated photorefractive interconnects, and 3D waveguide polymeric interconnections can be used for interconnecting the emitter-receiver spiking photonic neural nodes. Finally, using numerical simulations of artificial neuron models, we present spike-based spatio-temporal learning methods for applications in relevant AI-based functional tasks, such as image pattern recognition, edge detection, and SNNs for inference and learning. Future developments in neuromorphic spiking photonic nanocircuits, as outlined here, will significantly boost the processing and transmission capabilities of next-generation nanophotonic spike-based neuromorphic architectures for energy-efficient AI applications. This perspective paper is a result of the European Union funded research project ChipAI in the frame of the Horizon 2020 Future and Emerging Technologies Open programme.</p

    Neuromorphic electronic systems

    Get PDF
    Biological in formation-processing systems operate on completely different principles from those with which most engineers are familiar. For many problems, particularly those in which the input data are ill-conditioned and the computation can be specified in a relative manner, biological solutions are many orders of magnitude more effective than those we have been able to implement using digital methods. This advantage can be attributed principally to the use of elementary physical phenomena as computational primitives, and to the representation of information by the relative values of analog signals, rather than by the absolute values of digital signals. This approach requires adaptive techniques to mitigate the effects of component differences. This kind of adaptation leads naturally to systems that learn about their environment. Large-scale adaptive analog systems are more robust to component degradation and failure than are more conventional systems, and they use far less power. For this reason, adaptive analog technology can be expected to utilize the full potential of wafer-scale silicon fabrication

    Neural networks-on-chip for hybrid bio-electronic systems

    Get PDF
    PhD ThesisBy modelling the brains computation we can further our understanding of its function and develop novel treatments for neurological disorders. The brain is incredibly powerful and energy e cient, but its computation does not t well with the traditional computer architecture developed over the previous 70 years. Therefore, there is growing research focus in developing alternative computing technologies to enhance our neural modelling capability, with the expectation that the technology in itself will also bene t from increased awareness of neural computational paradigms. This thesis focuses upon developing a methodology to study the design of neural computing systems, with an emphasis on studying systems suitable for biomedical experiments. The methodology allows for the design to be optimized according to the application. For example, di erent case studies highlight how to reduce energy consumption, reduce silicon area, or to increase network throughput. High performance processing cores are presented for both Hodgkin-Huxley and Izhikevich neurons incorporating novel design features. Further, a complete energy/area model for a neural-network-on-chip is derived, which is used in two exemplar case-studies: a cortical neural circuit to benchmark typical system performance, illustrating how a 65,000 neuron network could be processed in real-time within a 100mW power budget; and a scalable highperformance processing platform for a cerebellar neural prosthesis. From these case-studies, the contribution of network granularity towards optimal neural-network-on-chip performance is explored

    Efficient hardware implementations of bio-inspired networks

    Get PDF
    The human brain, with its massive computational capability and power efficiency in small form factor, continues to inspire the ultimate goal of building machines that can perform tasks without being explicitly programmed. In an effort to mimic the natural information processing paradigms observed in the brain, several neural network generations have been proposed over the years. Among the neural networks inspired by biology, second-generation Artificial or Deep Neural Networks (ANNs/DNNs) use memoryless neuron models and have shown unprecedented success surpassing humans in a wide variety of tasks. Unlike ANNs, third-generation Spiking Neural Networks (SNNs) closely mimic biological neurons by operating on discrete and sparse events in time called spikes, which are obtained by the time integration of previous inputs. Implementation of data-intensive neural network models on computers based on the von Neumann architecture is mainly limited by the continuous data transfer between the physically separated memory and processing units. Hence, non-von Neumann architectural solutions are essential for processing these memory-intensive bio-inspired neural networks in an energy-efficient manner. Among the non-von Neumann architectures, implementations employing non-volatile memory (NVM) devices are most promising due to their compact size and low operating power. However, it is non-trivial to integrate these nanoscale devices on conventional computational substrates due to their non-idealities, such as limited dynamic range, finite bit resolution, programming variability, etc. This dissertation demonstrates the architectural and algorithmic optimizations of implementing bio-inspired neural networks using emerging nanoscale devices. The first half of the dissertation focuses on the hardware acceleration of DNN implementations. A 4-layer stochastic DNN in a crossbar architecture with memristive devices at the cross point is analyzed for accelerating DNN training. This network is then used as a baseline to explore the impact of experimental memristive device behavior on network performance. Programming variability is found to have a critical role in determining network performance compared to other non-ideal characteristics of the devices. In addition, noise-resilient inference engines are demonstrated using stochastic memristive DNNs with 100 bits for stochastic encoding during inference and 10 bits for the expensive training. The second half of the dissertation focuses on a novel probabilistic framework for SNNs using the Generalized Linear Model (GLM) neurons for capturing neuronal behavior. This work demonstrates that probabilistic SNNs have comparable perform-ance against equivalent ANNs on two popular benchmarks - handwritten-digit classification and human activity recognition. Considering the potential of SNNs in energy-efficient implementations, a hardware accelerator for inference is proposed, termed as Spintronic Accelerator for Probabilistic SNNs (SpinAPS). The learning algorithm is optimized for a hardware friendly implementation and uses first-to-spike decoding scheme for low latency inference. With binary spintronic synapses and digital CMOS logic neurons for computations, SpinAPS achieves a performance improvement of 4x in terms of GSOPS/W/mm2^2 when compared to a conventional SRAM-based design. Collectively, this work demonstrates the potential of emerging memory technologies in building energy-efficient hardware architectures for deep and spiking neural networks. The design strategies adopted in this work can be extended to other spike and non-spike based systems for building embedded solutions having power/energy constraints

    Mixed-precision deep learning based on computational memory

    Full text link
    Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory devices organized in crossbar arrays could store the synaptic weights in their conductance states and perform the expensive weighted summations in place in a non-von Neumann manner. However, updating the conductance states in a reliable manner during the weight update process is a fundamental challenge that limits the training accuracy of such an implementation. Here, we propose a mixed-precision architecture that combines a computational memory unit performing the weighted summations and imprecise conductance updates with a digital processing unit that accumulates the weight updates in high precision. A combined hardware/software training experiment of a multilayer perceptron based on the proposed architecture using a phase-change memory (PCM) array achieves 97.73% test accuracy on the task of classifying handwritten digits (based on the MNIST dataset), within 0.6% of the software baseline. The architecture is further evaluated using accurate behavioral models of PCM on a wide class of networks, namely convolutional neural networks, long-short-term-memory networks, and generative-adversarial networks. Accuracies comparable to those of floating-point implementations are achieved without being constrained by the non-idealities associated with the PCM devices. A system-level study demonstrates 173x improvement in energy efficiency of the architecture when used for training a multilayer perceptron compared with a dedicated fully digital 32-bit implementation

    Stochastic arrays and learning networks

    Get PDF
    This thesis presents a study of stochastic arrays and learning networks. These arrays will be shown to consist of simple elements utilising probabilistic coding techniques which may interact with a random and noisy environment to produce useful results. Such networks have generated considerable interest since it is possible to design large parallel self-organising arrays of these elements which are trained by example rather than explicit instruction. Once the learning process has been completed, they then have the potential ability to form generalisations, perform global optimisation of traditionally difficult problems such as routing and incorporate an associative memory capability which can enable such tasks as image recognition and reconstruction to be performed, even when given a partial or noisy view of the target. Since the method of operation of such elements is thought to emulate the basic properties of the neurons of the brain, these arrays have been termed neural 'networks. The research demonstrates the use of stochastic elements for digital signal processing by presenting a novel systolic array, utilising a simple, replicated cell structure, which is shown to perform the operations of Cyclic Correlation and the Discrete Fourier Transform on inherently random and noisy probabilistic single bit inputs. This work is then extended into the field of stochastic learning automata and to neural networks by examining the Associative Reward-Punish (A(_R-P)) pattern recognising learning automaton. The thesis concludes that all the networks described may potentially be generalised to simple variations of one standard probabilistic element utilising stochastic coding, whose properties resemble those of biological neurons. A novel study is presented which describes how a powerful deterministic algorithm, previously considered to be biologically unviable due to its nature, may be represented in this way. It is expected that combinations of these methods may lead to a series of useful hybrid techniques for training networks. The nature of the element generalisation is particularly important as it reveals the potential for encoding successful algorithms in cheap, simple hardware with single bit interconnections. No claim is made that the particular algorithms described are those actually utilised by the brain, only to demonstrate that those properties observed of biological neurons are capable of endowing collective computational ability and that actual biological algorithms may perhaps then become apparent when viewed in this light
    • …
    corecore