9,976 research outputs found

    Nonphotolithographic nanoscale memory density prospects

    Get PDF
    Technologies are now emerging to construct molecular-scale electronic wires and switches using bottom-up self-assembly. This opens the possibility of constructing nanoscale circuits and memories where active devices are just a few nanometers square and wire pitches may be on the order of ten nanometers. The features can be defined at this scale without using photolithography. The available assembly techniques have relatively high defect rates compared to conventional lithographic integrated circuits and can only produce very regular structures. Nonetheless, with proper memory organization, it is reasonable to expect these technologies to provide memory densities in excess of 10/sup 11/ b/cm/sup 2/ with modest active power requirements under 0.6 W/Tb/s for random read operations

    MorphIC: A 65-nm 738k-Synapse/mm2^2 Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning

    Full text link
    Recent trends in the field of neural network accelerators investigate weight quantization as a means to increase the resource- and power-efficiency of hardware devices. As full on-chip weight storage is necessary to avoid the high energy cost of off-chip memory accesses, memory reduction requirements for weight storage pushed toward the use of binary weights, which were demonstrated to have a limited accuracy reduction on many applications when quantization-aware training techniques are used. In parallel, spiking neural network (SNN) architectures are explored to further reduce power when processing sparse event-based data streams, while on-chip spike-based online learning appears as a key feature for applications constrained in power and resources during the training phase. However, designing power- and area-efficient spiking neural networks still requires the development of specific techniques in order to leverage on-chip online learning on binary weights without compromising the synapse density. In this work, we demonstrate MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning rule and a hierarchical routing fabric for large-scale chip interconnection. The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF) neurons and more than two million plastic synapses for an active silicon area of 2.86mm2^2 in 65nm CMOS, achieving a high density of 738k synapses/mm2^2. MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy tradeoff on the MNIST classification task compared to previously-proposed SNNs, while having no penalty in the energy-accuracy tradeoff.Comment: This document is the paper as accepted for publication in the IEEE Transactions on Biomedical Circuits and Systems journal (2019), the fully-edited paper is available at https://ieeexplore.ieee.org/document/876400

    Graded, Dynamically Routable Information Processing with Synfire-Gated Synfire Chains

    Full text link
    Coherent neural spiking and local field potentials are believed to be signatures of the binding and transfer of information in the brain. Coherent activity has now been measured experimentally in many regions of mammalian cortex. Synfire chains are one of the main theoretical constructs that have been appealed to to describe coherent spiking phenomena. However, for some time, it has been known that synchronous activity in feedforward networks asymptotically either approaches an attractor with fixed waveform and amplitude, or fails to propagate. This has limited their ability to explain graded neuronal responses. Recently, we have shown that pulse-gated synfire chains are capable of propagating graded information coded in mean population current or firing rate amplitudes. In particular, we showed that it is possible to use one synfire chain to provide gating pulses and a second, pulse-gated synfire chain to propagate graded information. We called these circuits synfire-gated synfire chains (SGSCs). Here, we present SGSCs in which graded information can rapidly cascade through a neural circuit, and show a correspondence between this type of transfer and a mean-field model in which gating pulses overlap in time. We show that SGSCs are robust in the presence of variability in population size, pulse timing and synaptic strength. Finally, we demonstrate the computational capabilities of SGSC-based information coding by implementing a self-contained, spike-based, modular neural circuit that is triggered by, then reads in streaming input, processes the input, then makes a decision based on the processed information and shuts itself down

    All-optical pulse reshaping and retiming systems incorporating pulse shaping fiber Bragg grating

    No full text
    This paper demonstrates two optical pulse retiming and reshaping systems incorporating superstructured fiber Bragg gratings (SSFBGs) as pulse shaping elements. A rectangular switching window is implemented to avoid conversion of the timing jitter on the original data pulses into pulse amplitude noise at the output of a nonlinear optical switch. In a first configuration, the rectangular pulse generator is used at the (low power) data input to a nonlinear optical loop mirror (NOLM) to perform retiming of an incident noisy data signal using a clean local clock signal to control the switch. In a second configuration, the authors further amplify the data signal and use it to switch a (low power) clean local clock signal. The S-shaped nonlinear characteristic of the NOLM results in this instance in a reduction of both timing and amplitude jitter on the data signal. The underlying technologies required for the implementation of this technique are such that an upgrade of the scheme for the regeneration of ultrahigh bit rate signals at data rates in excess of 320 Gb/s should be achievable

    A sub-mW IoT-endnode for always-on visual monitoring and smart triggering

    Full text link
    This work presents a fully-programmable Internet of Things (IoT) visual sensing node that targets sub-mW power consumption in always-on monitoring scenarios. The system features a spatial-contrast 128x64128\mathrm{x}64 binary pixel imager with focal-plane processing. The sensor, when working at its lowest power mode (10μW10\mu W at 10 fps), provides as output the number of changed pixels. Based on this information, a dedicated camera interface, implemented on a low-power FPGA, wakes up an ultra-low-power parallel processing unit to extract context-aware visual information. We evaluate the smart sensor on three always-on visual triggering application scenarios. Triggering accuracy comparable to RGB image sensors is achieved at nominal lighting conditions, while consuming an average power between 193μW193\mu W and 277μW277\mu W, depending on context activity. The digital sub-system is extremely flexible, thanks to a fully-programmable digital signal processing engine, but still achieves 19x lower power consumption compared to MCU-based cameras with significantly lower on-board computing capabilities.Comment: 11 pages, 9 figures, submitteted to IEEE IoT Journa

    YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

    Get PDF
    Convolutional neural networks (CNNs) have revolutionized the world of computer vision over the last few years, pushing image classification beyond human accuracy. The computational effort of today's CNNs requires power-hungry parallel processors or GP-GPUs. Recent developments in CNN accelerators for system-on-chip integration have reduced energy consumption significantly. Unfortunately, even these highly optimized devices are above the power envelope imposed by mobile and deeply embedded applications and face hard limitations caused by CNN weight I/O and storage. This prevents the adoption of CNNs in future ultra-low power Internet of Things end-nodes for near-sensor analytics. Recent algorithmic and theoretical advancements enable competitive classification accuracy even when limiting CNNs to binary (+1/-1) weights during training. These new findings bring major optimization opportunities in the arithmetic core by removing the need for expensive multiplications, as well as reducing I/O bandwidth and storage. In this work, we present an accelerator optimized for binary-weight CNNs that achieves 1510 GOp/s at 1.2 V on a core area of only 1.33 MGE (Million Gate Equivalent) or 0.19 mm2^2 and with a power dissipation of 895 {\mu}W in UMC 65 nm technology at 0.6 V. Our accelerator significantly outperforms the state-of-the-art in terms of energy and area efficiency achieving 61.2 TOp/s/[email protected] V and 1135 GOp/s/[email protected] V, respectively

    Data-Width-Driven Power Gating of Integer Arithmetic Circuits

    Get PDF
    When performing narrow-width computations, power gating of unused arithmetic circuit portions can significantly reduce leakage power. We deploy coarse-grain power gating in 32-bit integer arithmetic circuits that frequently will operate on narrow-width data. Our contributions include a design framework that automatically implements coarse-grain power-gated arithmetic circuits considering a narrow-width input data mode, and an analysis of the impact of circuit architecture on the efficiency of this data-width-driven power gating scheme. As an example, with a performance penalty of 6.7%, coarse-grain power gating of a 45-nm 32-bit multiplier is demonstrated to yield an 11.6x static leakage energy reduction per 8x8-bit operation
    corecore