12 research outputs found

    Brain-Inspired Computing: Neuromorphic System Designs and Applications

    Get PDF
    In nowadays big data environment, the conventional computing platform based on von Neumann architecture encounters the bottleneck of the increasing requirement of computation capability and efficiency. The “brain-inspired computing” Neuromorphic Computing has demonstrated great potential to revolutionize the technology world. It is considered as one of the most promising solutions by achieving tremendous computing and power efficiency on a single chip. The neuromorphic computing systems represent great promise for many scientific and intelligent applications. Many designs have been proposed and realized with traditional CMOS technology, however, the progress is slow. Recently, the rebirth of neuromorphic computing is inspired by the development of novel nanotechnology. In this thesis, I propose neuromorphic computing systems with the ReRAM (Memristor) crossbar array. It includes the work in three major parts: 1) Memristor devices modeling and related circuits design in resistive memory (ReRAM) technology by investigating their physical mechanism, statistical analysis, and intrinsic challenges. A weighted sensing scheme which assigns different weights to the cells on different bit lines was proposed. The area/power overhead of peripheral circuitry was effectively reduced while minimizing the amplitude of sneak paths. 2) Neuromorphic computing system designs by leveraging memristor devices and algorithm scaling in neural network and machine learning algorithms based on the similarity between memristive effect and biological synaptic behavior. First, a spiking neural network (SNN) with a rate coding model was developed in algorithm level and then mapped to hardware design for supervised learning. In addition, to further speed and accuracy improvement, another neuromorphic system adopting analog input signals with different voltage amplitude and a current sensing scheme was built. Moreover, the use of a single memristor crossbar for each neural net- work layer was explored. 3) The application-specific optimization for further reliability improvement of the developed neuromorphic systems. In this thesis, the impact of device failure on the memristor-based neuromorphic computing systems for cognitive applications was evaluated. Then, a retraining and a remapping design in algorithm level and hardware level were developed to rescue the large accuracy loss

    Spiking CMOS-NVM mixed-signal neuromorphic ConvNet with circuit- and training-optimized temporal subsampling

    Get PDF
    We increasingly rely on deep learning algorithms to process colossal amount of unstructured visual data. Commonly, these deep learning algorithms are deployed as software models on digital hardware, predominantly in data centers. Intrinsic high energy consumption of Cloud-based deployment of deep neural networks (DNNs) inspired researchers to look for alternatives, resulting in a high interest in Spiking Neural Networks (SNNs) and dedicated mixed-signal neuromorphic hardware. As a result, there is an emerging challenge to transfer DNN architecture functionality to energy-efficient spiking non-volatile memory (NVM)-based hardware with minimal loss in the accuracy of visual data processing. Convolutional Neural Network (CNN) is the staple choice of DNN for visual data processing. However, the lack of analog-friendly spiking implementations and alternatives for some core CNN functions, such as MaxPool, hinders the conversion of CNNs into the spike domain, thus hampering neuromorphic hardware development. To address this gap, in this work, we propose MaxPool with temporal multiplexing for Spiking CNNs (SCNNs), which is amenable for implementation in mixed-signal circuits. In this work, we leverage the temporal dynamics of internal membrane potential of Integrate & Fire neurons to enable MaxPool decision-making in the spiking domain. The proposed MaxPool models are implemented and tested within the SCNN architecture using a modified version of the aihwkit framework, a PyTorch-based toolkit for modeling and simulating hardware-based neural networks. The proposed spiking MaxPool scheme can decide even before the complete spatiotemporal input is applied, thus selectively trading off latency with accuracy. It is observed that by allocating just 10% of the spatiotemporal input window for a pooling decision, the proposed spiking MaxPool achieves up to 61.74% accuracy with a 2-bit weight resolution in the CIFAR10 dataset classification task after training with back propagation, with only about 1% performance drop compared to 62.78% accuracy of the 100% spatiotemporal window case with the 2-bit weight resolution to reflect foundry-integrated ReRAM limitations. In addition, we propose the realization of one of the proposed spiking MaxPool techniques in an NVM crossbar array along with periphery circuits designed in a 130nm CMOS technology. The energy-efficiency estimation results show competitive performance compared to recent neuromorphic chip designs

    Hybrid Memristor-CMOS Computer for Artificial Intelligence: from Devices to Systems

    Full text link
    Neuromorphic computing systems, which aim to mimic the function and structure of the human brain, is a promising approach to overcome the limitations of conventional computing systems such as the von-Neumann bottleneck. Recently, memristors and memristor crossbars have been extensively studied for neuromorphic system implementations due to the ability of memristor devices to emulate biological synapses, thus providing benefits such as co-located memory/logic operations and massive parallelism. A memristor is a two-terminal device whose resistance is modulated by the history of external stimulation. The principle of the resistance modulation, or resistance switching, for a typical oxide-based memristor, is based on oxygen vacancy migration in the oxide layer through ion drift and diffusion. When applied in computing systems, the memristor is often formed in a crossbar structure and used to perform vector-matrix multiplication operations. Since the values in the matrix can be stored as the device conductance values of the crossbar array, when an input vector is applied as voltage pulses with different pulse amplitudes or different pulse widths to the rows of the crossbar, the currents or charges collected at the columns of the crossbar correspond to the resulting VMM outputs, following Ohm’s law and Kirchhoff’s current law. This approach makes it possible to use physics to execute direct computing of this data-intensive task, both in-memory and in parallel in a single step. First of all, I will present a comprehensive physical model of the TaOx-based memristor device where the internal parameters including electric field, temperature, and VO concentration are self-consistently solved to accurately describe the device operation. Starting from the initial Forming process, the model quantitatively captures the dynamic RS behavior, and can reliably reproduce Set/Reset cycling in a self-consistent manner. Beyond clarifying the nature of the Forming and Set/Reset processes, a bulk-like doping effect was revealed by the model during Set and supported by experimental results. This phenomenon can lead to linear analog conductance modulation with a large dynamic range, which is very beneficial for low-power neuromorphic computing applications. Second, an integrated memristor/CMOS system consisting of a 54×108 passive memristor crossbar array directly fabricated on a CMOS chip is presented. The system includes all necessary analog/digital circuitry (including analog-digital converters and digital-analog converters), digital buses, and a programmable processor to control the digital and analog components to form a complete hardware system for neuromorphic computing applications. With the fully-integrated and reprogrammable chip, we experimentally demonstrated three popular models – a perceptron network, a sparse coding network, and a bilayer principal component analysis system with an unsupervised feature extraction layer and a supervised classification layer – all on the same chip. Beyond VMM operations, the internal dynamics of memristors allow the system to natively process temporal features in the input data. Specifically, a WOx-based memristor with short-term memory effect caused by spontaneous oxygen vacancy diffusion was utilized to implement a reservoir computing system to process temporal information. The spatial information of a digit image can be converted into streaming inputs fed into the memristor reservoir, leading to 100% accuracy for simple 4×5 digit recognition and 88.1% accuracy for the MNIST data set. The system was also employed for solving other nonlinear tasks such as emulating a second-order nonlinear system.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/155040/1/seulee_1.pd

    Reliability-aware memory design using advanced reconfiguration mechanisms

    Get PDF
    Fast and Complex Data Memory systems has become a necessity in modern computational units in today's integrated circuits. These memory systems are integrated in form of large embedded memory for data manipulation and storage. This goal has been achieved by the aggressive scaling of transistor dimensions to few nanometer (nm) sizes, though; such a progress comes with a drawback, making it critical to obtain high yields of the chips. Process variability, due to manufacturing imperfections, along with temporal aging, mainly induced by higher electric fields and temperature, are two of the more significant threats that can no longer be ignored in nano-scale embedded memory circuits, and can have high impact on their robustness. Static Random Access Memory (SRAM) is one of the most used embedded memories; generally implemented with the smallest device dimensions and therefore its robustness can be highly important in nanometer domain design paradigm. Their reliable operation needs to be considered and achieved both in cell and also in architectural SRAM array design. Recently, and with the approach to near/below 10nm design generations, novel non-FET devices such as Memristors are attracting high attention as a possible candidate to replace the conventional memory technologies. In spite of their favorable characteristics such as being low power and highly scalable, they also suffer with reliability challenges, such as process variability and endurance degradation, which needs to be mitigated at device and architectural level. This thesis work tackles such problem of reliability concerns in memories by utilizing advanced reconfiguration techniques. In both SRAM arrays and Memristive crossbar memories novel reconfiguration strategies are considered and analyzed, which can extend the memory lifetime. These techniques include monitoring circuits to check the reliability status of the memory units, and architectural implementations in order to reconfigure the memory system to a more reliable configuration before a fail happens.Actualmente, el diseño de sistemas de memoria en circuitos integrados busca continuamente que sean más rápidos y complejos, lo cual se ha vuelto de gran necesidad para las unidades de computación modernas. Estos sistemas de memoria están integrados en forma de memoria embebida para una mejor manipulación de los datos y de su almacenamiento. Dicho objetivo ha sido conseguido gracias al agresivo escalado de las dimensiones del transistor, el cual está llegando a las dimensiones nanométricas. Ahora bien, tal progreso ha conllevado el inconveniente de una menor fiabilidad, dado que ha sido altamente difícil obtener elevados rendimientos de los chips. La variabilidad de proceso - debido a las imperfecciones de fabricación - junto con la degradación de los dispositivos - principalmente inducido por el elevado campo eléctrico y altas temperaturas - son dos de las más relevantes amenazas que no pueden ni deben ser ignoradas por más tiempo en los circuitos embebidos de memoria, echo que puede tener un elevado impacto en su robusteza final. Static Random Access Memory (SRAM) es una de las celdas de memoria más utilizadas en la actualidad. Generalmente, estas celdas son implementadas con las menores dimensiones de dispositivos, lo que conlleva que el estudio de su robusteza es de gran relevancia en el actual paradigma de diseño en el rango nanométrico. La fiabilidad de sus operaciones necesita ser considerada y conseguida tanto a nivel de celda de memoria como en el diseño de arquitecturas complejas basadas en celdas de memoria SRAM. Actualmente, con el diseño de sistemas basados en dispositivos de 10nm, dispositivos nuevos no-FET tales como los memristores están atrayendo una elevada atención como posibles candidatos para reemplazar las actuales tecnologías de memorias convencionales. A pesar de sus características favorables, tales como el bajo consumo como la alta escabilidad, ellos también padecen de relevantes retos de fiabilidad, como son la variabilidad de proceso y la degradación de la resistencia, la cual necesita ser mitigada tanto a nivel de dispositivo como a nivel arquitectural. Con todo esto, esta tesis doctoral afronta tales problemas de fiabilidad en memorias mediante la utilización de técnicas de reconfiguración avanzada. La consideración de nuevas estrategias de reconfiguración han resultado ser validas tanto para las memorias basadas en celdas SRAM como en `memristive crossbar¿, donde se ha observado una mejora significativa del tiempo de vida en ambos casos. Estas técnicas incluyen circuitos de monitorización para comprobar la fiabilidad de las unidades de memoria, y la implementación arquitectural con el objetivo de reconfigurar los sistemas de memoria hacia una configuración mucho más fiables antes de que el fallo suced

    2022 roadmap on neuromorphic computing and engineering

    Full text link
    Modern computation based on von Neumann architecture is now a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with 1018^{18} calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges for each research area. We hope that this roadmap will be a useful resource by providing a concise yet comprehensive introduction to readers outside this field, for those who are just entering the field, as well as providing future perspectives for those who are well established in the neuromorphic computing community

    Heterogeneous Reconfigurable Fabrics for In-circuit Training and Evaluation of Neuromorphic Architectures

    Get PDF
    A heterogeneous device technology reconfigurable logic fabric is proposed which leverages the cooperating advantages of distinct magnetic random access memory (MRAM)-based look-up tables (LUTs) to realize sequential logic circuits, along with conventional SRAM-based LUTs to realize combinational logic paths. The resulting Hybrid Spin/Charge FPGA (HSC-FPGA) using magnetic tunnel junction (MTJ) devices within this topology demonstrates commensurate reductions in area and power consumption over fabrics having LUTs constructed with either individual technology alone. Herein, a hierarchical top-down design approach is used to develop the HSCFPGA starting from the configurable logic block (CLB) and slice structures down to LUT circuits and the corresponding device fabrication paradigms. This facilitates a novel architectural approach to reduce leakage energy, minimize communication occurrence and energy cost by eliminating unnecessary data transfer, and support auto-tuning for resilience. Furthermore, HSC-FPGA enables new advantages of technology co-design which trades off alternative mappings between emerging devices and transistors at runtime by allowing dynamic remapping to adaptively leverage the intrinsic computing features of each device technology. HSC-FPGA offers a platform for fine-grained Logic-In-Memory architectures and runtime adaptive hardware. An orthogonal dimension of fabric heterogeneity is also non-determinism enabled by either low-voltage CMOS or probabilistic emerging devices. It can be realized using probabilistic devices within a reconfigurable network to blend deterministic and probabilistic computational models. Herein, consider the probabilistic spin logic p-bit device as a fabric element comprising a crossbar-structured weighted array. The Programmability of the resistive network interconnecting p-bit devices can be achieved by modifying the resistive states of the array\u27s weighted connections. Thus, the programmable weighted array forms a CLB-scale macro co-processing element with bitstream programmability. This allows field programmability for a wide range of classification problems and recognition tasks to allow fluid mappings of probabilistic and deterministic computing approaches. In particular, a Deep Belief Network (DBN) is implemented in the field using recurrent layers of co-processing elements to form an n x m1 x m2 x ::: x mi weighted array as a configurable hardware circuit with an n-input layer followed by i ≥ 1 hidden layers. As neuromorphic architectures using post-CMOS devices increase in capability and network size, the utility and benefits of reconfigurable fabrics of neuromorphic modules can be anticipated to continue to accelerate

    연성 및 생재흡수성 전자소자용 비휘발성 메모리 소자와 집적센서 구현

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 화학생물공학부, 2015. 8. 김대형.Over years, major advances in healthcare have been made through research in the fields of nanomaterials and microelectronics technologies. However, the mechanical and geometrical constraints inherent in the standard forms of rigid electronics have imposed challanges of unique integration and therapeutic delivery in non-invasive and minimally invasive medical devices. Here, we describe two types of multifunctional electronic systems. The first type is wearable-on-the-skin systems that address the challenges via monolithic integration of nanomembranes fabricated by top-down approach, nanotubes and nanoparticles assembled by bottom-up strategies, and stretchable electronics on tissue-like polymeric substrate. The system consists of physiological sensors, non-volatile memory, logic gates, and drug-release actuators. Some quantitative analyses on the operation of each electronics, mechanics, heat-transfer, and drug-diffusion characteristic validated their system-level multi-functionalities. The second type is a bioresorbable electronic stent with drug-infused functionalized nanoparticles that takes flow sensing, temperature monitoring, data storage, wireless power/data transmission, inflammation suppression, localized drug delivery, and photothermal therapy. In vivo and ex vivo animal experiments as well as in vitro cell researches demonstrate its unrecognized potential for bioresorbable electronic implants coupled with bioinert therapeutic nanoparticles in the endovascular system. As demonstrations of these technologies, we herein highlight two representative examples of multifunctional systems in order of increasing degree of invasiveness: electronically enabled wearable patch and endovascular electronic stent that incorporate onboard physiological monitoring, data storage, and therapy under moist and mechanically rigorous conditions.Contents Abstract Chapter 1. Introduction 1.1 Organic flexible and wearable electronics.................................................. 1 1.2 Inorganic flexible and wearable electronics............................................... 14 1.3 Flexible non-volatile memory devices.......................................................... 25 1.4 Bioresorbable materials and devices........................................................... 34 References Chapter 2. Multifunctional wearable devices for diagnosis and therapy of movement disorders 2.1 Introduction ................................................................................. 45 2.2 Experimental Section ......................................................................... 49 2.3 Result and Discussion ........................................................................ 65 2.4 Conclusion ................................................................................... 95 References Chapter 3. Stretchable Carbon Nanotube Charge-Trap Floating-Gate Memory and Logic Devices for Wearable Electronics 3.1 Introduction ................................................................................ 101 3.2 Experimental Section ........................................................................ 104 3.3 Result and Discussion ....................................................................... 107 3.4 Conclusion .................................................................................. 138 References Chapter 4. Bioresorbable Electronic Stent Integrated with Therapeutic Nanoparticles for Endovascular Diseases 4.1 Introduction ................................................................................ 148 4.2 Experimental Section ........................................................................ 151 4.3 Result and Discussion ....................................................................... 173 4.4 Conclusion .................................................................................. 219 References 국문 초록 (Abstract in Korean) .................................................................. 230Docto
    corecore