18 research outputs found

    In-Memory Computing by Using Nano-ionic Memristive Devices

    Get PDF
    By reaching to the CMOS scaling limitation based on the Moore’s law and due to the increasing disparity between the processing units and memory performance, the quest is continued to find a suitable alternative to replace the conventional technology. The recently discovered two terminal element, memristor, is believed to be one of the most promising candidates for future very large scale integrated systems. This thesis is comprised of two main parts, (Part I) modeling the memristor devices, and (Part II) memristive computing. The first part is presented in one chapter and the second part of the thesis contains five chapters. The basics and fundamentals regarding the memristor functionality and memristive computing are presented in the introduction chapter. A brief detail of these two main parts is as follows: Part I: Modeling- This part presents an accurate model based on the charge transport mechanisms for nanoionic memristor devices. The main current mechanism in metal/insulator/metal (MIM) structures are assessed, a physic-based model is proposed and a SPICE model is presented and tested for four different fabricated devices. An accuracy comparison is done for various models for Ag/TiO2/ITO fabricated device. Also, the functionality of the model is tested for various input signals. Part II: Memristive computing- Memristive computing is about utilizing memristor to perform computational tasks. This part of the thesis is divided into neuromorphic, analog and digital computing schemes with memristor devices. – Neuromorphic computing- Two chapters of this thesis are about biologicalinspired memristive neural networks using STDP-based learning mechanism. The memristive implementation of two well-known spiking neuron models, Hudgkin-Huxley and Morris-Lecar, are assessed and utilized in the proposed memristive network. The synaptic connections are also memristor devices in this design. Unsupervised pattern classification tasks are done to ensure the right functionality of the system. – Analog computing- Memristor has analog memory property as it can be programmed to different memristance values. A novel memristive analog adder is designed by Continuous Valued Number System (CVNS) scheme and its circuit is comprised of addition and modulo blocks. The proposed analog adder design is explained and its functionality is tested for various numbers. It is shown that the CVNS scheme is compatible with memristive design and the environment resolution can be adjusted by the memristance ratio of the memristor devices. – Digital computing- Two chapters are dedicated for digital computing. In the first one, a development over IMPLY-based logic with memristor is provided to implement a 4:2 compressor circuit. In the second chapter, A novel resistive over a novel mirrored memristive crossbar platform. Different logic gates are designed with the proposed memristive logic method and the simulations are provided with Cadence to prove the functionality of the logic. The logic implementation over a mirrored memristive crossbars is also assessed

    Memristor-Based Digital Systems Design and Architectures

    Get PDF
    Memristor is considered as a suitable alternative solution to resolve the scaling limitation of CMOS technology. In recent years, the use of memristors in circuits design has rapidly increased and attracted researcher’s interest. Advances have been made to both size and complexity of memristor designs. The development of CMOS transistors shows major concerns, such as, increased leakage power, reduced reliability, and high fabrication cost. These factors have affected chip manufacturing process and functionality severely. Therefore, the demand for new devices is increasing. Memristor, is considered as one of the key element in memory and information processing design due to its small size, long-term data storage, low power, and CMOS compatibility. The main objective in this research is to design memristor-based arithmetic circuits and to overcome some of the Memristor based logic design issues. In this thesis, a fast, low area and low power hybrid CMOS memristor based digital circuit design were implemented. Small and large-scale memristor based digital circuits are implemented and provided a solutions for overcoming the memristor degradation and fan-out challenges. As an example, a 4- bit LFSR has been implemented by using MRL scheme with 64 CMOS devices and 64 memristors. The proposed design is more efficient in terms of the area when compared with CMOS- based LFSR circuits. The simulation results proves the functionality of the design. This approach presents acceptable speed in comparison with CMOS-based design and it is faster than IMPLY-based memrisitive LFSR. The propped LFSR has 841 ps de-lay. Furthermore, the proposed design has a significant power reduction of over 66% less than CMOS-based approach. This thesis proposes implementation of memristive 2-D median filter and extends previously published works on memristive Filter design to include this emerging technology characteristics in image processing. The proposed circuit was designed based on Pt/TaOx/Ta redox-based device and Memristor Ratioed Logic (MRL). The proposed filter is designed in Cadence and the memristive median approved tested circuit is translated to Verilog-XL as a behavioral model. Different 512 _ 512 pixels input images contain salt and pepper noise with various noise density ratios are applied to the proposed median filter and the design successfully has substantially removed the noise. The implementation results in comparison with the conventional filters, it gives better Peak Signal to Noise Ratio (PSNR) and Mean Absolute Error (MAE) for different images with different noise density ratios while it saves more area as compared to CMOS-based design. This dissertation proposes a comprehensive framework for design, mapping and synthesis of large-scale memristor-CMOS circuits. This framework provides a synthesis approach that can be applied to all memristor-based digital logic designs. In particular, it is a proposal for a characterization methodology of memristor-based logic cells to generate a standard cell library for large scale simulation. The proposed framework is implemented in the Cadence Virtuoso schematic-level environment and was veri_ed with Verilog-XL, MATLAB, and the Electronic Design Automation (EDA) Synopses compiler after being translated to the behavioral level. The proposed method can be applied to implement any digital logic design. The frame work is deployed for design of the memristor-based parallel 8-bit adder/subtractor and a 2-D memristive-based median filter

    LSTM neural network implementation using memristive crossbar circuits and its various topologies

    Get PDF
    Neural Network (NN) algorithms have existed for long time now. However, they started to reemerge only after computers had been invented, because computational resources are required to implement NN algorithms. In fact, computers themselves are not fast enough to train and run the NNs. It can take days to train some complex neural networks for certain applications. One of the complex NNs that became widely used is Long-Short Term Memory (LSTM) NN algorithm. As a broader approach to increase the computation speed and decrease power consumption of neural network algorithms, hardware realizations of the neural networks have emerged. Mainly FPGA and analog hardware are used for these purposes. On this occasion, it happens to be only FPGA implementations of LSTM exist. Using this lack, this thesis work mainly aims to show that LSTM neural network is realizable and functional in analog hardware. In fact, analog hardware using memristive crossbars can be a potential solution to the speed bottleneck experienced in software implementations of LSTM and other complex neural networks in general. This work mainly focuses on implementation of already trained LSTM neural networks in analog circuitry. Since training consists of both forward and backward pass computations through NNs, first, there should be focus on implementing the circuitry that can run forward passes. This forward running circuit further can be extended to a complete circuit which would include training circuitry. Additionally, there exists various LSTM topologies. Software analysis has been done to compare the performance of each LSTM architecture for time-series prediction and time-series classification applications. Each of the architectures can be implemented in analog circuitry without great difficulty using voltage-based LSTM circuit parts due its easiness to reconfigure. Fully functional implementation of the voltage-based memristive LSTM in SPICE circuit simulator is the main contribution of this thesis work. In comparison, current-based LSTM circuit parts may not be easily rearranged due to the difficulty of passing currents from one stage to the next without degradation in magnitude

    Fast and Accurate Sparse Coding of Visual Stimuli with a Simple, Ultra-Low-Energy Spiking Architecture

    Get PDF
    Memristive crossbars have become a popular means for realizing unsupervised and supervised learning techniques. Often, to preserve mathematical rigor, the crossbar itself is separated from the neuron capacitors. In this work, we sought to simplify the design, removing extraneous components to consume significantly lower power at a minimal cost of accuracy. This work provides derivations for the design of such a network, named the Simple Spiking Locally Competitive Algorithm, or SSLCA, as well as CMOS designs and results on the CIFAR and MNIST datasets. Compared to a non-spiking model which scored 33% on CIFAR-10 with a single-layer classifier, this hardware scored 32% accuracy. When used with a state-of-the-art deep learning classifier, the non-spiking model achieved 82% and our simplified, spiking model achieved 80%, while compressing the input data by 79%. Compared to a previously proposed spiking model, our proposed hardware consumed 99% less energy to do the same work at 21 times the throughput. Accuracy held out with online learning to a write variance of 3% and a read variance of 40%. The proposed architecture\u27s excellent accuracy and significantly lower energy usage demonstrate the utility of our innovations. This work provides a means for extremely low-energy sparse coding in mobile devices, such as cellular phones, or for very sparse coding as is needed by self-driving cars or robotics that must integrate data from multiple, high-resolution sensors

    Towards Data Reliable, Low-Power, and Repairable Resistive Random Access Memories

    Get PDF
    A series of breakthroughs in memristive devices have demonstrated the potential of memristor arrays to serve as next generation resistive random access memories (ReRAM), which are fast, low-power, ultra-dense, and non-volatile. However, memristors' unique device characteristics also make them prone to several sources of error. Owing to the stochastic filamentary nature of memristive devices, various recoverable errors can affect the data reliability of a ReRAM. Permanent device failures further limit the lifetime of a ReRAM. This dissertation developed low-power solutions for more reliable and longer-enduring ReRAM systems. In this thesis, we first look into a data reliability issue known as write disturbance. Writing into a memristor in a crossbar could disturb the stored values in other memristors that are on the same memory line as the target cell. Such disturbance is accumulative over time which may lead to complete data corruption. To address this problem, we propose the use of two regular memristors on each word to keep track of the disturbance accumulation and trigger a refresh to restore the weakened data, once it becomes necessary. We also investigate the considerable variation in the write-time characteristics of individual memristors. With such variation, conventional fixed-pulse write schemes not only waste significant energy, but also cannot guarantee reliable completion of the write operations. We address such variation by proposing an adaptive write scheme that adjusts the width of the write pulses for each memristor. Our scheme embeds an online monitor to detect the completion of a write operation and takes into account the parasitic effect of line-shared devices in access-transistor-free memristive arrays. We further investigate the use of this method to shorten the test time of memory march algorithms by eliminating the need of a verifying read right after a write, which is commonly employed in the test sequences of march algorithms.Finally, we propose a novel mechanism to extend the lifetime of a ReRAM by protecting it against hard errors through the exploitation of a unique feature of bipolar memristive devices. Our solution proposes an unorthodox use of complementary resistive switches (a particular implementation of memristive devices) to provide an ``in-place spare'' for each memory cell at negligible extra cost. The in-place spares are then utilized by a repair scheme to repair memristive devices that have failed at a stuck-at-ON state at a page-level granularity. Furthermore, we explore the use of in-place spares in lieu of other memory reliability and yield enhancement solutions, such as error correction codes (ECC) and spare rows. We demonstrate that with the in-place spares, we can yield the same lifetime as a baseline ReRAM with either significantly fewer spare rows or a lighter-weight ECC, both of which can save on energy consumption and area

    Design Considerations for Training Memristor Crossbars Used in Spiking Neural Networks

    Get PDF
    CMOS/Memristor integrated architectures have shown to be powerful for realizing energy-efficient learning machines. These architectures are recently demonstrated in reservoir computing networks, which have reduced training complexity and resource utilization. In reservoir computing, the training time is curtailed due to random weight initialization in the hidden layer, which will remain constant during training. The CMOS/memristor variability can be exploited to generate these random weights and reduce the area overhead. Recent studies have shown that the CMOS/memristor crossbars are ideal for on-device learning machines, including reservoir computing networks. An exemplary CMOS/memristor crossbar based on-device accelerator, Ziksa, was demonstrated on several of these learning networks. While the crossbars are generally area and energy efficient, the peripheral circuitry to control the read/write logic to the crossbars is extremely power hungry. This work focuses on improving the Ziksa accelerator peripheral circuitry for a spiking reservoir network. The optimized training circuitry for Ziksa includes transmission gates, a control unit, and a current amplifier and is demonstrated within a layer of spiking neurons for training and neuron behavior. All the analog circuits are validated using the Cadence 45 nm GPDK on a 2x4 and 1x4 crossbar. For a 32x32 crossbar, the area and power of the peripheral circuitry is ∼2,800 µm^2 and ∼3.685 mW respectively, demonstrating the overall efficacy of the proposed circuits

    Energy Efficient Neocortex-Inspired Systems with On-Device Learning

    Get PDF
    Shifting the compute workloads from cloud toward edge devices can significantly improve the overall latency for inference and learning. On the contrary this paradigm shift exacerbates the resource constraints on the edge devices. Neuromorphic computing architectures, inspired by the neural processes, are natural substrates for edge devices. They offer co-located memory, in-situ training, energy efficiency, high memory density, and compute capacity in a small form factor. Owing to these features, in the recent past, there has been a rapid proliferation of hybrid CMOS/Memristor neuromorphic computing systems. However, most of these systems offer limited plasticity, target either spatial or temporal input streams, and are not demonstrated on large scale heterogeneous tasks. There is a critical knowledge gap in designing scalable neuromorphic systems that can support hybrid plasticity for spatio-temporal input streams on edge devices. This research proposes Pyragrid, a low latency and energy efficient neuromorphic computing system for processing spatio-temporal information natively on the edge. Pyragrid is a full-scale custom hybrid CMOS/Memristor architecture with analog computational modules and an underlying digital communication scheme. Pyragrid is designed for hierarchical temporal memory, a biomimetic sequence memory algorithm inspired by the neocortex. It features a novel synthetic synapses representation that enables dynamic synaptic pathways with reduced memory usage and interconnects. The dynamic growth in the synaptic pathways is emulated in the memristor device physical behavior, while the synaptic modulation is enabled through a custom training scheme optimized for area and power. Pyragrid features data reuse, in-memory computing, and event-driven sparse local computing to reduce data movement by ~44x and maximize system throughput and power efficiency by ~3x and ~161x over custom CMOS digital design. The innate sparsity in Pyragrid results in overall robustness to noise and device failure, particularly when processing visual input and predicting time series sequences. Porting the proposed system on edge devices can enhance their computational capability, response time, and battery life

    Reliability-aware memory design using advanced reconfiguration mechanisms

    Get PDF
    Fast and Complex Data Memory systems has become a necessity in modern computational units in today's integrated circuits. These memory systems are integrated in form of large embedded memory for data manipulation and storage. This goal has been achieved by the aggressive scaling of transistor dimensions to few nanometer (nm) sizes, though; such a progress comes with a drawback, making it critical to obtain high yields of the chips. Process variability, due to manufacturing imperfections, along with temporal aging, mainly induced by higher electric fields and temperature, are two of the more significant threats that can no longer be ignored in nano-scale embedded memory circuits, and can have high impact on their robustness. Static Random Access Memory (SRAM) is one of the most used embedded memories; generally implemented with the smallest device dimensions and therefore its robustness can be highly important in nanometer domain design paradigm. Their reliable operation needs to be considered and achieved both in cell and also in architectural SRAM array design. Recently, and with the approach to near/below 10nm design generations, novel non-FET devices such as Memristors are attracting high attention as a possible candidate to replace the conventional memory technologies. In spite of their favorable characteristics such as being low power and highly scalable, they also suffer with reliability challenges, such as process variability and endurance degradation, which needs to be mitigated at device and architectural level. This thesis work tackles such problem of reliability concerns in memories by utilizing advanced reconfiguration techniques. In both SRAM arrays and Memristive crossbar memories novel reconfiguration strategies are considered and analyzed, which can extend the memory lifetime. These techniques include monitoring circuits to check the reliability status of the memory units, and architectural implementations in order to reconfigure the memory system to a more reliable configuration before a fail happens.Actualmente, el diseño de sistemas de memoria en circuitos integrados busca continuamente que sean más rápidos y complejos, lo cual se ha vuelto de gran necesidad para las unidades de computación modernas. Estos sistemas de memoria están integrados en forma de memoria embebida para una mejor manipulación de los datos y de su almacenamiento. Dicho objetivo ha sido conseguido gracias al agresivo escalado de las dimensiones del transistor, el cual está llegando a las dimensiones nanométricas. Ahora bien, tal progreso ha conllevado el inconveniente de una menor fiabilidad, dado que ha sido altamente difícil obtener elevados rendimientos de los chips. La variabilidad de proceso - debido a las imperfecciones de fabricación - junto con la degradación de los dispositivos - principalmente inducido por el elevado campo eléctrico y altas temperaturas - son dos de las más relevantes amenazas que no pueden ni deben ser ignoradas por más tiempo en los circuitos embebidos de memoria, echo que puede tener un elevado impacto en su robusteza final. Static Random Access Memory (SRAM) es una de las celdas de memoria más utilizadas en la actualidad. Generalmente, estas celdas son implementadas con las menores dimensiones de dispositivos, lo que conlleva que el estudio de su robusteza es de gran relevancia en el actual paradigma de diseño en el rango nanométrico. La fiabilidad de sus operaciones necesita ser considerada y conseguida tanto a nivel de celda de memoria como en el diseño de arquitecturas complejas basadas en celdas de memoria SRAM. Actualmente, con el diseño de sistemas basados en dispositivos de 10nm, dispositivos nuevos no-FET tales como los memristores están atrayendo una elevada atención como posibles candidatos para reemplazar las actuales tecnologías de memorias convencionales. A pesar de sus características favorables, tales como el bajo consumo como la alta escabilidad, ellos también padecen de relevantes retos de fiabilidad, como son la variabilidad de proceso y la degradación de la resistencia, la cual necesita ser mitigada tanto a nivel de dispositivo como a nivel arquitectural. Con todo esto, esta tesis doctoral afronta tales problemas de fiabilidad en memorias mediante la utilización de técnicas de reconfiguración avanzada. La consideración de nuevas estrategias de reconfiguración han resultado ser validas tanto para las memorias basadas en celdas SRAM como en `memristive crossbar¿, donde se ha observado una mejora significativa del tiempo de vida en ambos casos. Estas técnicas incluyen circuitos de monitorización para comprobar la fiabilidad de las unidades de memoria, y la implementación arquitectural con el objetivo de reconfigurar los sistemas de memoria hacia una configuración mucho más fiables antes de que el fallo suced
    corecore