
    Accurate deep neural network inference using computational phase-change memory

    In-memory computing is a promising non-von Neumann approach for building energy-efficient deep learning inference hardware. Crossbar arrays of resistive memory devices can encode the network weights and perform efficient analog matrix-vector multiplications without intermediate movement of data. However, due to device variability and noise, the network must be trained in a specific way so that transferring the digitally trained weights to the analog resistive memory devices does not cause a significant loss of accuracy. Here, we introduce a methodology for training ResNet-type convolutional neural networks that results in no appreciable accuracy loss when transferring weights to in-memory computing hardware based on phase-change memory (PCM). We also propose a compensation technique that exploits the batch normalization parameters to improve accuracy retention over time. We achieve a classification accuracy of 93.7% on the CIFAR-10 dataset and a top-1 accuracy of 71.6% on the ImageNet benchmark after mapping the trained weights to PCM. Our hardware results on CIFAR-10 with ResNet-32 demonstrate an accuracy above 93.5% retained over a one-day period, where each of the 361,722 synaptic weights of the network is programmed on just two PCM devices organized in a differential configuration. (Comment: this is a pre-print of an article accepted for publication in Nature Communications.)
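    The noise-robust training methodology is not spelled out in this abstract. As a minimal sketch of the general technique (injecting random weight perturbations during training so the learned weights tolerate analog device noise), the layer below perturbs its weights with Gaussian noise on every training forward pass; the class name and the 3% noise scale are illustrative assumptions, not the authors' published method.

        import torch
        import torch.nn as nn

        class NoisyLinear(nn.Linear):
            """Linear layer with Gaussian weight noise during training -- a
            sketch of noise-aware training for analog hardware; the 3% noise
            scale is an assumption, not the paper's value."""
            def __init__(self, in_features, out_features, noise_scale=0.03):
                super().__init__(in_features, out_features)
                self.noise_scale = noise_scale

            def forward(self, x):
                if self.training:
                    # Fresh noise every pass, scaled to the largest weight,
                    # emulating PCM conductance variability at transfer time.
                    sigma = self.noise_scale * self.weight.abs().max()
                    noisy_w = self.weight + torch.randn_like(self.weight) * sigma
                    return nn.functional.linear(x, noisy_w, self.bias)
                return super().forward(x)

    At inference time the layer behaves as a standard nn.Linear, reflecting the idea that the hardening happens entirely during training.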

    Adaptive extreme edge computing for wearable devices

    Wearable devices are a fast-growing technology with an impact on personal healthcare for both society and the economy. With sensors now widespread in pervasive and distributed networks, power consumption, processing speed, and system adaptation are vital in future smart wearable devices. Efforts to envision and forecast how computation can be brought to the edge in smart sensors have already begun, with the aspiration of providing adaptive extreme edge computing. Here, we provide a holistic view of hardware and theoretical solutions for smart wearable devices that can guide research in this pervasive computing era. We propose various solutions based on biologically plausible models for continual learning in neuromorphic computing technologies for wearable sensors. To ground this concept, we provide a systematic outline of prospective low-power, low-latency scenarios for wearable sensors on neuromorphic platforms. We then describe the most promising directions for neuromorphic processors exploiting complementary metal-oxide-semiconductor (CMOS) and emerging memory technologies (e.g., memristive devices). Furthermore, we evaluate the requirements for edge computing within wearable devices in terms of footprint, power consumption, latency, and data size. We additionally investigate the challenges beyond neuromorphic hardware, algorithms, and devices that could impede the advancement of adaptive edge computing in smart wearable devices.
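    The continual-learning models referenced above are not specified in the abstract. Purely as an illustration of the kind of biologically plausible primitive neuromorphic wearables run, the sketch below steps a leaky integrate-and-fire neuron with a local, online weight update; every parameter (decay, threshold, learning rate) is an assumed placeholder.

        import numpy as np

        def lif_step(v, spikes_in, w, decay=0.9, threshold=1.0, lr=0.01):
            """One step of a leaky integrate-and-fire neuron with a local
            Hebbian-style update -- an illustrative sketch, not a model from
            the paper. v: membrane potential; spikes_in: binary input vector;
            w: input weight vector."""
            v = decay * v + w @ spikes_in      # leaky integration of input
            fired = v >= threshold             # spike when threshold crossed
            if fired:
                v = 0.0                        # reset membrane after a spike
                w = w + lr * spikes_in         # local, online weight update
            return v, fired, w

    Because the update uses only locally available quantities, it fits the low-power, low-latency constraints the review emphasizes.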

    Hardware Considerations for Signal Processing Systems: A Step Toward the Unconventional

    As we progress into the future, signal processing algorithms are becoming more computationally intensive and power hungry, while the demand for mobile products and low-power devices keeps increasing. An integrated ASIC solution is one of the primary ways chip developers can improve performance and add functionality while keeping the power budget low. This work discusses ASIC hardware for both conventional and unconventional signal processing systems, and how integration, error resilience, emerging devices, and new algorithms can be leveraged by signal processing systems to further improve performance and enable new applications. Specifically, this work presents three case studies: 1) a conventional, highly parallel mixed-signal cross-correlator ASIC for a weather satellite performing real-time synthetic aperture imaging, 2) an unconventional native stochastic computing architecture enabled by memristors, and 3) two unconventional sparse neural network ASICs for feature extraction and object classification. As improvements from technology scaling alone slow down and the demand for energy-efficient mobile electronics increases, such optimization techniques at the device, circuit, and system levels will become more critical to advancing signal processing capabilities in the future.
    PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/116685/1/knagphil_1.pd
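    The stochastic computing architecture of the second case study is only named here. As a hedged sketch of the underlying idea (not the dissertation's memristor circuit), stochastic computing encodes a value in [0, 1] as the probability of a 1 in a random bit stream, which turns multiplication into a bitwise AND; the stream length below is an arbitrary assumption.

        import numpy as np

        rng = np.random.default_rng(0)

        def to_stream(p, n=4096):
            """Encode a probability p in [0, 1] as an n-bit random stream."""
            return rng.random(n) < p

        # Multiplying two values costs one AND gate per bit position:
        a, b = 0.6, 0.5
        product = np.mean(to_stream(a) & to_stream(b))
        print(product)  # ~0.30, approximating 0.6 * 0.5 with stochastic error

    The longer the stream, the lower the variance of the estimate, which is the classic accuracy-versus-latency trade-off of stochastic computing.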

    Overcoming Noise and Variations In Low-Precision Neural Networks

    This work explores the impact of various design and training choices on the resilience of a neural network subjected to noise and/or device variations. Simulations were performed under the expectation that the neural network would be implemented on analog hardware; in that context there will be random noise within the circuit as well as variations in device characteristics between fabricated devices. The results show how noise can be added during the training process to reduce the impact of post-training noise. Architectural choices for the neural network also directly impact the performance variation between devices. The simulated neural networks were more robust to noise with a minimal architecture with fewer layers; if more neurons are needed for better fitting, networks with more neurons in shallow layers and fewer in deeper layers closer to the output tend to perform better. The work also demonstrates that activation functions with lower slopes are better at suppressing noise in the neural network. It is also shown that the accuracy can be made more consistent by introducing sparsity into the neural network. To that end, different methods for generating sparse architectures for smaller neural networks are evaluated, and a new method is proposed that consistently outperforms the most common methods used in larger, deeper networks.
    Ph.D.
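    Neither the common sparsification methods nor the newly proposed one is detailed in this abstract. As an illustrative baseline only, global magnitude pruning (a standard way to introduce sparsity into a trained network) can be sketched as follows; the 50% sparsity level is an assumption.

        import numpy as np

        def magnitude_prune(weights, sparsity=0.5):
            """Zero out the smallest-magnitude weights -- a common baseline
            for generating sparse architectures, not the paper's new method."""
            flat = np.abs(weights).ravel()
            k = int(sparsity * flat.size)
            threshold = np.partition(flat, k)[k]   # k-th smallest magnitude
            mask = np.abs(weights) >= threshold    # keep larger weights only
            return weights * mask, mask

        w = np.random.randn(64, 64)
        w_sparse, mask = magnitude_prune(w)
        print(mask.mean())  # fraction of weights kept, about 0.5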

    Energy-Efficient Circuit Designs for Miniaturized Internet of Things and Wireless Neural Recording

    Internet of Things (IoT) devices have become omnipresent across domains including healthcare, smart buildings, agriculture, and environmental and industrial monitoring. Today, IoT devices are being miniaturized while, with the explosive growth of machine learning, also becoming more intelligent. They not only sense, collect, and communicate data, but also edge-compute and extract useful information within a small form factor. A main challenge for such miniaturized, intelligent IoT devices is operating continuously for a long lifetime on a limited battery capacity. Energy efficiency of circuits and systems is key to addressing this challenge. This dissertation presents two energy-efficient circuit designs: a 224 pW, 260 ppm/°C gate-leakage-based timer for wireless sensor nodes (WSNs) in the IoT, and an energy-efficient all-analog machine learning accelerator consuming 1.2 µJ/inference on the CIFAR-10 and SVHN datasets. Wireless neural interfaces are another area that demands miniaturized, energy-efficient circuits and systems for safe long-term monitoring of brain activity. Historically, implantable systems have used wires for data communication and power, increasing the risk of tissue damage. It has therefore been a long-standing goal to distribute sub-mm-scale, truly floating, wireless implants throughout the brain and to record single-neuron-level activity. This dissertation presents a 0.19×0.17 mm², 0.74 µW wireless neural recording IC with near-infrared (NIR) power and data telemetry, and a 0.19×0.28 mm², 0.57 µW light-tolerant wireless neural recording IC.
    PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169712/1/jongyup_1.pd
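    The power figures above invite a quick sanity check. The back-of-the-envelope estimate below uses an assumed 1 mAh, 3 V battery (not figures from the dissertation) to show why picowatt-to-microwatt budgets determine lifetime.

        # Lifetime estimate for sub-uW circuits; battery figures are assumed.
        battery_mAh = 1.0                   # hypothetical thin-film cell
        voltage = 3.0                       # assumed supply voltage
        energy_J = battery_mAh * 1e-3 * 3600 * voltage  # 10.8 J stored

        for name, power_W in [("224 pW timer", 224e-12),
                              ("0.57 uW recording IC", 0.57e-6)]:
            years = energy_J / power_W / (3600 * 24 * 365)
            print(f"{name}: ~{years:.1f} years on a {battery_mAh} mAh cell")

    In practice self-discharge, duty cycling, and the rest of the system dominate real lifetimes, but the contrast between the two power budgets is clear.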

    Data Acquisition Applications

    Data acquisition systems have numerous applications. This book has a total of 13 chapters and is divided into three sections: industrial applications, medical applications, and scientific experiments. The chapters are written by experts from around the world, and the targeted audience includes professionals who design or research data acquisition systems. Faculty members and graduate students could also benefit from the book.