5,747 research outputs found

    MorphIC: A 65-nm 738k-Synapse/mm2^2 Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning

    Full text link
    Recent trends in the field of neural network accelerators investigate weight quantization as a means to increase the resource- and power-efficiency of hardware devices. As full on-chip weight storage is necessary to avoid the high energy cost of off-chip memory accesses, memory reduction requirements for weight storage pushed toward the use of binary weights, which were demonstrated to have a limited accuracy reduction on many applications when quantization-aware training techniques are used. In parallel, spiking neural network (SNN) architectures are explored to further reduce power when processing sparse event-based data streams, while on-chip spike-based online learning appears as a key feature for applications constrained in power and resources during the training phase. However, designing power- and area-efficient spiking neural networks still requires the development of specific techniques in order to leverage on-chip online learning on binary weights without compromising the synapse density. In this work, we demonstrate MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning rule and a hierarchical routing fabric for large-scale chip interconnection. The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF) neurons and more than two million plastic synapses for an active silicon area of 2.86mm2^2 in 65nm CMOS, achieving a high density of 738k synapses/mm2^2. MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy tradeoff on the MNIST classification task compared to previously-proposed SNNs, while having no penalty in the energy-accuracy tradeoff.Comment: This document is the paper as accepted for publication in the IEEE Transactions on Biomedical Circuits and Systems journal (2019), the fully-edited paper is available at https://ieeexplore.ieee.org/document/876400

    XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference

    Full text link
    Binary Neural Networks (BNNs) are promising to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy. In this paper, we introduce the XNOR Neural Engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM / standard cell memory. The XNE is able to fully compute convolutional and dense layers in autonomy or in cooperation with the core in the MCU to realize more complex behaviors. We show post-synthesis results in 65nm and 22nm technology for the XNE IP and post-layout results in 22nm for the full MCU indicating that this system can drop the energy cost per binary operation to 21.6fJ per operation at 0.4V, and at the same time is flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2mJ per frame at 8.9 fps.Comment: 11 pages, 8 figures, 2 tables, 3 listings. Accepted for presentation at CODES'18 and for publication in IEEE Transactions on Computer-Aided Design of Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issu

    Criticality Aware Soft Error Mitigation in the Configuration Memory of SRAM based FPGA

    Full text link
    Efficient low complexity error correcting code(ECC) is considered as an effective technique for mitigation of multi-bit upset (MBU) in the configuration memory(CM)of static random access memory (SRAM) based Field Programmable Gate Array (FPGA) devices. Traditional multi-bit ECCs have large overhead and complex decoding circuit to correct adjacent multibit error. In this work, we propose a simple multi-bit ECC which uses Secure Hash Algorithm for error detection and parity based two dimensional Erasure Product Code for error correction. Present error mitigation techniques perform error correction in the CM without considering the criticality or the execution period of the tasks allocated in different portion of CM. In most of the cases, error correction is not done in the right instant, which sometimes either suspends normal system operation or wastes hardware resources for less critical tasks. In this paper,we advocate for a dynamic priority-based hardware scheduling algorithm which chooses the tasks for error correction based on their area, execution period and criticality. The proposed method has been validated in terms of overhead due to redundant bits, error correction time and system reliabilityComment: 6 pages, 8 figures, conferenc

    SecuCode: Intrinsic PUF Entangled Secure Wireless Code Dissemination for Computational RFID Devices

    Full text link
    The simplicity of deployment and perpetual operation of energy harvesting devices provides a compelling proposition for a new class of edge devices for the Internet of Things. In particular, Computational Radio Frequency Identification (CRFID) devices are an emerging class of battery-free, computational, sensing enhanced devices that harvest all of their energy for operation. Despite wireless connectivity and powering, secure wireless firmware updates remains an open challenge for CRFID devices due to: intermittent powering, limited computational capabilities, and the absence of a supervisory operating system. We present, for the first time, a secure wireless code dissemination (SecuCode) mechanism for CRFIDs by entangling a device intrinsic hardware security primitive Static Random Access Memory Physical Unclonable Function (SRAM PUF) to a firmware update protocol. The design of SecuCode: i) overcomes the resource-constrained and intermittently powered nature of the CRFID devices; ii) is fully compatible with existing communication protocols employed by CRFID devices in particular, ISO-18000-6C protocol; and ii) is built upon a standard and industry compliant firmware compilation and update method realized by extending a recent framework for firmware updates provided by Texas Instruments. We build an end-to-end SecuCode implementation and conduct extensive experiments to demonstrate standards compliance, evaluate performance and security.Comment: Accepted to the IEEE Transactions on Dependable and Secure Computin

    Statistical analysis and comparison of 2T and 3T1D e-DRAM minimum energy operation

    Get PDF
    Bio-medical wearable devices restricted to their small-capacity embedded-battery require energy-efficiency of the highest order. However, minimum-energy point (MEP) at sub-threshold voltages is unattainable with SRAM memory, which fails to hold below 0.3V because of its vanishing noise margins. This paper examines the minimum-energy operation point of 2T and 3T1D e-DRAM gain cells at the 32-nm technology node with different design points: up-sizing transistors, using high- V th transistors, read/write wordline assists; as well as operating conditions (i.e., temperature). First, the e-DRAM cells are evaluated without considering any process variations. Then, a full-factorial statistical analysis of e-DRAM cells is performed in the presence of threshold voltage variations and the effect of upsizing on mean MEP is reported. Finally, it is shown that the product of the read and write lengths provides a knob to tradeoff energy-efficiency for reliable MEP energy operation.Peer ReviewedPostprint (author's final draft

    An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

    Full text link
    Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Paper
    corecore