49 research outputs found

    Quantized Neural Networks and Neuromorphic Computing for Embedded Systems

    Get PDF
    Deep learning techniques have made great success in areas such as computer vision, speech recognition and natural language processing. Those breakthroughs made by deep learning techniques are changing every aspect of our lives. However, deep learning techniques have not realized their full potential in embedded systems such as mobiles, vehicles etc. because the high performance of deep learning techniques comes at the cost of high computation resource and energy consumption. Therefore, it is very challenging to deploy deep learning models in embedded systems because such systems have very limited computation resources and power constraints. Extensive research on deploying deep learning techniques in embedded systems has been conducted and considerable progress has been made. In this book chapter, we are going to introduce two approaches. The first approach is model compression, which is one of the very popular approaches proposed in recent years. Another approach is neuromorphic computing, which is a novel computing system that mimicks the human brain

    Opening the “Black Box” of Silicon Chip Design in Neuromorphic Computing

    Get PDF
    Neuromorphic computing, a bio-inspired computing architecture that transfers neuroscience to silicon chip, has potential to achieve the same level of computation and energy efficiency as mammalian brains. Meanwhile, three-dimensional (3D) integrated circuit (IC) design with non-volatile memory crossbar array uniquely unveils its intrinsic vector-matrix computation with parallel computing capability in neuromorphic computing designs. In this chapter, the state-of-the-art research trend on electronic circuit designs of neuromorphic computing will be introduced. Furthermore, a practical bio-inspired spiking neural network with delay-feedback topology will be discussed. In the endeavor to imitate how human beings process information, our fabricated spiking neural network chip has capability to process analog signal directly, resulting in high energy efficiency with small hardware implementation cost. Mimicking the neurological structure of mammalian brains, the potential of 3D-IC implementation technique with memristive synapses is investigated. Finally, applications on the chaotic time series prediction and the video frame recognition will be demonstrated

    Principles of Neuromorphic Photonics

    Full text link
    In an age overrun with information, the ability to process reams of data has become crucial. The demand for data will continue to grow as smart gadgets multiply and become increasingly integrated into our daily lives. Next-generation industries in artificial intelligence services and high-performance computing are so far supported by microelectronic platforms. These data-intensive enterprises rely on continual improvements in hardware. Their prospects are running up against a stark reality: conventional one-size-fits-all solutions offered by digital electronics can no longer satisfy this need, as Moore's law (exponential hardware scaling), interconnection density, and the von Neumann architecture reach their limits. With its superior speed and reconfigurability, analog photonics can provide some relief to these problems; however, complex applications of analog photonics have remained largely unexplored due to the absence of a robust photonic integration industry. Recently, the landscape for commercially-manufacturable photonic chips has been changing rapidly and now promises to achieve economies of scale previously enjoyed solely by microelectronics. The scientific community has set out to build bridges between the domains of photonic device physics and neural networks, giving rise to the field of \emph{neuromorphic photonics}. This article reviews the recent progress in integrated neuromorphic photonics. We provide an overview of neuromorphic computing, discuss the associated technology (microelectronic and photonic) platforms and compare their metric performance. We discuss photonic neural network approaches and challenges for integrated neuromorphic photonic processors while providing an in-depth description of photonic neurons and a candidate interconnection architecture. We conclude with a future outlook of neuro-inspired photonic processing.Comment: 28 pages, 19 figure

    Architectures and Design of VLSI Machine Learning Systems

    Get PDF
    Quintillions of bytes of data are generated every day in this era of big data. Machine learning techniques are utilized to perform predictive analysis on these data, to reveal hidden relationships and dependencies and perform predictions of outcomes and behaviors. The obtained predictive models are used to interpret the existing data and predict new data information. Nowadays, most machine learning algorithms are realized by software programs running on general-purpose processors, which usually takes a huge amount of CPU time and introduces unbelievably high energy consumption. In comparison, a dedicated hardware design is usually much more efficient than software programs running on general-purpose processors in terms of runtime and energy consumption. Therefore, the objective of this dissertation is to develop efficient hardware architectures for mainstream machine learning algorithms, to provide a promising solution to addressing the runtime and energy bottlenecks of machine learning applications. However, it is a really challenging task to map complex machine learning algorithms to efficient hardware architectures. In fact, many important design decisions need to be made during the hardware development for efficient tradeoffs. In this dissertation, a parallel digital VLSI architecture for combined SVM training and classification is proposed. For the first time, cascade SVM, a powerful training algorithm, is leveraged to significantly improve the scalability of hardware-based SVM training and develop an efficient parallel VLSI architecture. The parallel SVM processors provide a significant training time speedup and energy reduction compared with the software SVM algorithm running on a general-purpose CPU. Furthermore, a liquid state machine based neuromorphic learning processor with integrated training and recognition is proposed. A novel theoretical measure of computational power is proposed to facilitate fast design space exploration of the recurrent reservoir. Three low-power techniques are proposed to improve the energy efficiency. Meanwhile, a 2-layer spiking neural network with global inhibition is realized on Silicon. In addition, we also present architectural design exploration of a brain-inspired digital neuromorphic processor architecture with memristive synaptic crossbar array, and highlight several synaptic memory access styles. Various analog-to-digital converter schemes have been investigated to provide new insights into the tradeoff between the hardware cost and energy consumption

    An Event-Based Digital Time Difference Encoder Model Implementation for Neuromorphic Systems

    Get PDF
    Neuromorphic systems are a viable alternative to conventional systems for real-time tasks with constrained resources. Their low power consumption, compact hardware realization, and low-latency response characteristics are the key ingredients of such systems. Furthermore, the event-based signal processing approach can be exploited for reducing the computational load and avoiding data loss due to its inherently sparse representation of sensed data and adaptive sampling time. In event-based systems, the information is commonly coded by the number of spikes within a specific temporal window. However, the temporal information of event-based signals can be difficult to extract when using rate coding. In this work, we present a novel digital implementation of the model, called time difference encoder (TDE), for temporal encoding on event-based signals, which translates the time difference between two consecutive input events into a burst of output events. The number of output events along with the time between them encodes the temporal information. The proposed model has been implemented as a digital circuit with a configurable time constant, allowing it to be used in a wide range of sensing tasks that require the encoding of the time difference between events, such as optical flow-based obstacle avoidance, sound source localization, and gas source localization. This proposed bioinspired model offers an alternative to the Jeffress model for the interaural time difference estimation, which is validated in this work with a sound source lateralization proof-of-concept system. The model was simulated and implemented on a field-programmable gate array (FPGA), requiring 122 slice registers of hardware resources and less than 1 mW of power consumption.Ministerio de Economía y Competitividad TEC2016-77785-P (COFNET)Agencia Estatal de Investigación PID2019-105556GB-C33/AEI/10.13039/501100011033 (MINDROB

    Spatiotemporal Spike-Pattern Selectivity in Single Mixed-Signal Neurons with Balanced Synapses

    Full text link
    Realizing the potential of mixed-signal neuromorphic processors for ultra-low-power inference and learning requires efficient use of their inhomogeneous analog circuitry as well as sparse, time-based information encoding and processing. Here, we investigate spike-timing-based spatiotemporal receptive fields of output-neurons in the Spatiotemporal Correlator (STC) network, for which we used excitatory-inhibitory balanced disynaptic inputs instead of dedicated axonal or neuronal delays. We present hardware-in-the-loop experiments with a mixed-signal DYNAP-SE neuromorphic processor, in which five-dimensional receptive fields of hardware neurons were mapped by randomly sampling input spike-patterns from a uniform distribution. We find that, when the balanced disynaptic elements are randomly programmed, some of the neurons display distinct receptive fields. Furthermore, we demonstrate how a neuron was tuned to detect a particular spatiotemporal feature, to which it initially was non-selective, by activating a different subset of the inhomogeneous analog synaptic circuits. The energy dissipation of the balanced synaptic elements is one order of magnitude lower per lateral connection (0.65 nJ vs 9.3 nJ per spike) than former delay-based neuromorphic hardware implementations. Thus, we show how the inhomogeneous synaptic circuits could be utilized for resource-efficient implementation of STC network layers, in a way that enables synapse-address reprogramming as a discrete mechanism for feature tuning.Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessibl

    Neural Network Methods for Radiation Detectors and Imaging

    Full text link
    Recent advances in image data processing through machine learning and especially deep neural networks (DNNs) allow for new optimization and performance-enhancement schemes for radiation detectors and imaging hardware through data-endowed artificial intelligence. We give an overview of data generation at photon sources, deep learning-based methods for image processing tasks, and hardware solutions for deep learning acceleration. Most existing deep learning approaches are trained offline, typically using large amounts of computational resources. However, once trained, DNNs can achieve fast inference speeds and can be deployed to edge devices. A new trend is edge computing with less energy consumption (hundreds of watts or less) and real-time analysis potential. While popularly used for edge computing, electronic-based hardware accelerators ranging from general purpose processors such as central processing units (CPUs) to application-specific integrated circuits (ASICs) are constantly reaching performance limits in latency, energy consumption, and other physical constraints. These limits give rise to next-generation analog neuromorhpic hardware platforms, such as optical neural networks (ONNs), for high parallel, low latency, and low energy computing to boost deep learning acceleration
    corecore