17 research outputs found

    Design of Resistive Synaptic Devices and Array Architectures for Neuromorphic Computing

    abstract: Over the past few decades, silicon complementary metal-oxide-semiconductor (CMOS) technology has been aggressively scaled down to achieve higher performance, higher density and lower power consumption. As device dimensions approach their fundamental physical limits, there is an increasing demand for emerging devices whose operating principles differ from those of conventional CMOS. In recent years, much effort has been devoted to research on next-generation emerging non-volatile memory (eNVM) technologies, such as resistive random access memory (RRAM) and phase change memory (PCM), to replace conventional digital memories (e.g. SRAM) for implementing synapses in large-scale neuromorphic computing systems. Being compact and essentially “analog”, these eNVM devices in a crossbar array can compute vector-matrix multiplication in parallel, significantly speeding up machine/deep learning algorithms. However, non-ideal eNVM device and array properties may hamper the learning accuracy. To quantify their impact, the sparse coding algorithm was used as a starting point; strategies to remedy the accuracy loss were proposed, and the circuit-level design trade-offs were analyzed. At the architecture level, a parallel “pseudo-crossbar” array that prevents the write disturbance issue was presented. The peripheral circuits to support various parallel array architectures were also designed. One key component is the read circuit, which employs the principle of the integrate-and-fire neuron model to convert the analog column current to a digital output. However, this read circuit is not area-efficient, so it was proposed to replace it with a compact two-terminal oscillation neuron device that exhibits the metal-insulator-transition phenomenon. To facilitate design exploration, a circuit-level macro simulator, “NeuroSim”, was developed in C++ to estimate the area, latency, energy and leakage power of various neuromorphic architectures.
NeuroSim provides a wide variety of design options at the circuit/device level. NeuroSim can be used alone or as a supporting module to provide circuit-level performance estimation in neural network algorithms. A 2-layer multilayer perceptron (MLP) simulator integrated with NeuroSim was demonstrated to evaluate both the learning accuracy and circuit-level performance metrics for online learning and offline classification, as well as to study the impact of eNVM reliability issues such as data retention and write endurance on the learning performance.
Dissertation/Thesis: Doctoral Dissertation, Electrical Engineering 201
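The crossbar computation and the integrate-and-fire read-out described in the abstract above can be illustrated with a minimal Python sketch. It assumes an ideal array (no wire resistance, device variation or nonlinearity) and a simple discrete-time integrator; the function names `crossbar_vmm` and `integrate_and_fire_count` are hypothetical and are not taken from NeuroSim.

```python
def crossbar_vmm(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij.

    Assumption: ideal devices and wires, so each column current is a
    plain sum of Ohm's-law contributions from its rows.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("one input voltage per row is required")
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


def integrate_and_fire_count(current, threshold, steps):
    """Count threshold crossings while integrating a constant current.

    Assumption: a discrete-time integrator that subtracts the threshold
    on each spike, so the spike count grows with the input current.
    """
    charge = 0.0
    spikes = 0
    for _ in range(steps):
        charge += current
        if charge >= threshold:
            spikes += 1
            charge -= threshold
    return spikes
```

Under these assumptions, every column of the array produces its dot product in one read step, and the spike count acts as the digitized column output.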

    Memristor-Based Digital Systems Design and Architectures

    The memristor is considered a suitable alternative for overcoming the scaling limitations of CMOS technology. In recent years, the use of memristors in circuit design has rapidly increased and attracted researchers’ interest. Advances have been made in both the size and complexity of memristor designs. The development of CMOS transistors raises major concerns, such as increased leakage power, reduced reliability, and high fabrication cost. These factors have severely affected the chip manufacturing process and functionality. Therefore, the demand for new devices is increasing. The memristor is considered one of the key elements in memory and information processing design due to its small size, long-term data storage, low power, and CMOS compatibility. The main objective of this research is to design memristor-based arithmetic circuits and to overcome some of the memristor-based logic design issues. In this thesis, fast, low-area and low-power hybrid CMOS-memristor digital circuit designs were implemented. Small- and large-scale memristor-based digital circuits are implemented, and solutions are provided for overcoming the memristor degradation and fan-out challenges. As an example, a 4-bit LFSR has been implemented using the MRL scheme with 64 CMOS devices and 64 memristors. The proposed design is more area-efficient than CMOS-based LFSR circuits. The simulation results prove the functionality of the design. This approach achieves acceptable speed in comparison with the CMOS-based design and is faster than the IMPLY-based memristive LFSR. The proposed LFSR has a delay of 841 ps. Furthermore, the proposed design reduces power by over 66% relative to the CMOS-based approach. This thesis proposes an implementation of a memristive 2-D median filter and extends previously published work on memristive filter design to include the characteristics of this emerging technology in image processing.
The proposed circuit was designed based on a Pt/TaOx/Ta redox-based device and Memristor Ratioed Logic (MRL). The proposed filter is designed in Cadence, and the tested memristive median filter circuit is translated to Verilog-XL as a behavioral model. Different 512 × 512 pixel input images containing salt-and-pepper noise with various noise density ratios are applied to the proposed median filter, and the design substantially removes the noise. Compared with conventional filters, the implementation gives better Peak Signal to Noise Ratio (PSNR) and Mean Absolute Error (MAE) for different images with different noise density ratios, while saving area compared to the CMOS-based design. This dissertation proposes a comprehensive framework for the design, mapping and synthesis of large-scale memristor-CMOS circuits. This framework provides a synthesis approach that can be applied to all memristor-based digital logic designs. In particular, it proposes a characterization methodology for memristor-based logic cells to generate a standard cell library for large-scale simulation. The proposed framework is implemented in the Cadence Virtuoso schematic-level environment and was verified with Verilog-XL, MATLAB, and the Electronic Design Automation (EDA) Synopsys compiler after being translated to the behavioral level. The proposed method can be applied to implement any digital logic design. The framework is deployed for the design of a memristor-based parallel 8-bit adder/subtractor and a 2-D memristive median filter.
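As a behavioral illustration of the 4-bit LFSR mentioned in the abstract above, the following Python sketch steps a Fibonacci-style register. The abstract does not state the feedback polynomial, so the taps used here (x^4 + x^3 + 1, a common maximal-length choice) are an assumption, and the function names are hypothetical. The sketch models only the logic, not the MRL circuit.

```python
def lfsr4_step(state):
    """Advance a 4-bit Fibonacci LFSR by one clock.

    Assumption: feedback polynomial x^4 + x^3 + 1, i.e. the new bit is
    the XOR of the two most significant bits of the current state.
    """
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def lfsr4_period(seed):
    """Number of clocks until the register returns to its seed."""
    if seed == 0:
        raise ValueError("the all-zero state never changes")
    state = lfsr4_step(seed)
    period = 1
    while state != seed:
        state = lfsr4_step(state)
        period += 1
    return period
```

With these taps, any non-zero seed cycles through all 15 non-zero states before repeating.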

    Study of RRAM-Based Binarized Neural Networks Inference Accelerators Using an RRAM Physics-Based Compact Model

    In-memory computing hardware accelerators for binarized neural networks (BNNs) based on resistive RAM (RRAM) memory technologies represent a promising solution for enabling the execution of deep neural network algorithms on resource-constrained devices at the edge of the network. However, the intrinsic stochasticity and nonidealities of RRAM devices can easily lead to unreliable circuit operations if not appropriately considered during the design phase. In this chapter, analysis and design methodologies for logic-in-memory (LIM) and mixed-signal BNN inference accelerators, enabled by RRAM physics-based compact models, are discussed. As a use case example, the UNIMORE RRAM physics-based compact model, calibrated on an RRAM technology from the literature, is used to determine the performance vs. reliability trade-offs of different in-memory computing accelerators: i) a logic-in-memory accelerator based on the material implication logic, ii) a mixed-signal BNN accelerator, and iii) a hybrid accelerator enabling both computing paradigms on the same array. Finally, the performance of the three accelerators on a BNN inference task is compared and benchmarked with the state of the art.
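The material implication logic named in the abstract above can be sketched at the Boolean level. The Python below assumes the textbook definition of IMPLY (the result overwrites the second operand) and the standard three-step NAND built from one FALSE and two IMPLY operations. It ignores device stochasticity entirely, which is the very effect the chapter studies, and the function names are hypothetical.

```python
def imply(p, q):
    """Material implication: returns (NOT p) OR q.

    In a memristive implementation the result would overwrite the
    state of the device holding q; here it is simply returned.
    """
    return (not p) or q


def nand_via_imply(p, q):
    """NAND built from one FALSE and two IMPLY steps on a work bit s."""
    s = False          # FALSE: reset the work device
    s = imply(p, s)    # s = NOT p
    s = imply(q, s)    # s = (NOT q) OR (NOT p) = NAND(p, q)
    return s
```

Because NAND is functionally complete, any combinational function can in principle be composed from these two primitive operations.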

    Leveraging RRAM to Design Efficient Digital Circuits and Systems for Beyond Von Neumann in-Memory Computing

    Due to the physical separation of their processing elements and storage units, contemporary digital computers are confronted with the thorny memory-wall problem. In-memory computing has been considered a promising strategy for overcoming the von Neumann bottleneck and designing high-performance, energy-efficient computing systems. Moreover, in the post-Moore era, post-CMOS technologies have received intense interest for possible future digital logic applications beyond the CMOS scaling limits. Motivated by these perspectives from system level to device level, this thesis proposes two effective processing-in-memory schemes to construct non-von Neumann systems based on nonvolatile resistive random-access memory (RRAM). In the first scheme, we present functionally complete stateful logic gates based on a CMOS-compatible 2-transistor-2-RRAM (2T2R) structure. In this structure, the programmable logic functionality is determined by the amplitude of the operation voltages rather than by the circuit topology. A reconfigurable 3T2R chain with programmable interconnects is used to implement complex combinational logic circuits. The design has a highly regular and symmetric circuit structure, making it easy to design, integrate, and fabricate, while the operations are flexible yet clean. Easily integrated as 3-dimensional (3-D) stacked arrays, the two proposed memory architectures not only serve as regular 3-D memory arrays but also perform in-memory computing within the same layer and between the stacked layers. The second scheme leverages hybrid logic in the same hardware to design efficient digital circuits and systems with low computational complexity. A multiple-bit ripple-carry adder (RCA), a pipelined RCA, and a prefix tree adder are shown as example circuits, using the same regular chain structure, to validate the design efficiency.
The design principles, computational complexity, and performance are discussed and compared to CMOS technology and other state-of-the-art post-CMOS implementations. The overall evaluation shows superior performance in speed and area. The results of the study could be used to build a technology cell library that can potentially serve as input to a technology-mapping algorithm. The proposed hybrid-logic methodology shows promise for hardware acceleration and future beyond-von Neumann in-memory computing architectures.
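For readers unfamiliar with the ripple-carry adder used as an example circuit in the abstract above, a bit-level Python sketch follows. It models only the Boolean behavior of chained full adders, not the 2T2R/3T2R stateful-logic implementation. Bit lists are least-significant-bit first, and all function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(a_bits, b_bits):
    """Add two equal-width bit lists (LSB first), rippling the carry."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bit, carry = full_adder(a, b, carry)
        sum_bits.append(sum_bit)
    return sum_bits, carry


def to_bits(value, width):
    """Unsigned integer to an LSB-first bit list of the given width."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list back to an unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

The carry of each stage feeds the next, so the delay grows linearly with the word width; pipelined and prefix-tree adders are the usual ways to reduce that delay.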

    Modeling Emerging Semiconductor Devices for Circuit Simulation

    Circuit simulation is an indispensable part of modern IC design. The significant cost of fabrication has driven researchers to verify chip functionality through simulation before submitting the design for final fabrication. With the impending end of Moore’s Law, researchers all over the world are looking for new devices with enhanced functionality. A plethora of promising emerging devices has been proposed in recent years. In order to leverage the full potential of such devices, circuit designers need fast, reliable models for SPICE simulation to explore different applications. Most of these new devices have complex underlying physical mechanisms, rendering model development an extremely challenging task. For the models to be of practical use, they have to enable fast and accurate simulation, which rules out numerically solving a system of partial differential equations to arrive at a solution. In this chapter, we show how different modeling approaches can be used to simulate three emerging semiconductor devices, namely the silicon-on-insulator four-gate transistor (G4FET), the perimeter-gated single photon avalanche diode (PG-SPAD) and the insulator-metal transistor (IMT) device with volatile memristance. All the models have been verified against experimental/TCAD data and implemented in a commercial circuit simulator.

    Semiconductor Memory Applications in Radiation Environment, Hardware Security and Machine Learning System

    abstract: Semiconductor memory is a key component of computing systems. Beyond the conventional memory and data storage applications, in this dissertation, both mainstream and eNVM memory technologies are explored for radiation environments, hardware security systems and machine learning applications. In a radiation environment, e.g. aerospace, the memory devices face different energetic particles. The strike of these energetic particles can generate electron-hole pairs (directly or indirectly) as they pass through the semiconductor device, resulting in photo-induced current, and may change the memory state. First, the trend of radiation effects in the mainstream memory technologies with technology node scaling is reviewed. Then, single event effects in the oxide-based resistive switching random access memory (RRAM), one of the eNVM technologies, are investigated from the circuit level to the system level. The Physical Unclonable Function (PUF) has been widely investigated as a promising hardware security primitive, which employs the inherent randomness in a physical system (e.g. the intrinsic semiconductor manufacturing variability). In the dissertation, two RRAM-based PUF implementations are proposed for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs is evaluated with experiment and simulation. The impact of non-ideal circuit effects on the performance of the PUFs is also investigated, and optimization strategies are proposed to mitigate these effects. In addition, the resistance against modeling and machine learning attacks is analyzed. Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification and object localization and detection. Increasing efforts have been devoted to developing hardware accelerators.
In this dissertation, two types of compute-in-memory (CIM) based hardware accelerator designs with SRAM and eNVM technologies are proposed for two binary neural networks, i.e. hybrid BNN (HBNN) and XNOR-BNN, respectively, which are explored for hardware resource-limited platforms, e.g. edge devices. These designs feature high throughput, scalability, low latency and high energy efficiency. Finally, we have successfully taped out and validated the proposed designs with SRAM technology in TSMC 65 nm. Overall, this dissertation paves the way for new applications of memory technologies towards secure and energy-efficient artificial intelligence systems.
Dissertation/Thesis: Doctoral Dissertation, Electrical Engineering 201
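The XNOR-BNN named in the abstract above relies on the well-known identity that a dot product of ±1 vectors equals twice the number of matching positions minus the vector length. The Python sketch below shows that identity on bit-packed integers, with bit 1 encoding +1 and bit 0 encoding -1. It is an arithmetic illustration only, not the dissertation's CIM circuit, and the function name is hypothetical.

```python
def xnor_popcount_dot(activation_bits, weight_bits, length):
    """Dot product of two +/-1 vectors packed as integers.

    Encoding assumption: bit value 1 means +1 and bit value 0 means -1.
    XNOR marks the positions where both vectors agree, and each
    agreement contributes +1 while each disagreement contributes -1.
    """
    mask = (1 << length) - 1
    agreements = bin(~(activation_bits ^ weight_bits) & mask).count("1")
    return 2 * agreements - length
```

For example, `xnor_popcount_dot(0b1011, 0b1101, 4)` compares (+1, -1, +1, +1) with (+1, +1, -1, +1) and yields 0, since two positions agree and two disagree.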

    Low Power Memory/Memristor Devices and Systems

    This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques, from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles showing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within.

    Leveraging Signal Transfer Characteristics and Parasitics of Spintronic Circuits for Area and Energy-Optimized Hybrid Digital and Analog Arithmetic

    While Internet of Things (IoT) sensors offer numerous benefits in diverse applications, they are limited by stringent constraints in energy, processing area and memory. These constraints are especially challenging within applications such as Compressive Sensing (CS) and Machine Learning (ML) via Deep Neural Networks (DNNs), which require dot product computations on large data sets. A solution to these challenges has been offered by the development of crossbar array architectures, enabled by recent advances in spintronic devices such as Magnetic Tunnel Junctions (MTJs). Crossbar arrays offer a compact, low-energy and in-memory approach to dot product computation in the analog domain by leveraging intrinsic signal-transfer characteristics of the embedded MTJ devices. The first phase of this dissertation research seeks to build on these benefits by optimizing resource allocation within spintronic crossbar arrays. A hardware approach to non-uniform CS is developed, which dynamically configures sampling rates by deriving necessary control signals using circuit parasitics. Next, an alternate approach to non-uniform CS based on adaptive quantization is developed, which reduces circuit area in addition to energy consumption. Adaptive quantization is then applied to DNNs by developing an architecture allowing for layer-wise quantization based on relative robustness levels. The second phase of this research focuses on extension of the analog computation paradigm by development of an operational amplifier-based arithmetic unit for generalized scalar operations. This approach allows for 95% area reduction in scalar multiplications, compared to the state-of-the-art digital alternative. Moreover, analog computation of enhanced activation functions allows for significant improvement in DNN accuracy, which can be harnessed through triple modular redundancy to yield 81.2% reduction in power at the cost of only 4% accuracy loss, compared to a larger network. 
Together, these results substantiate promising approaches to several challenges facing the design of future IoT sensors within the targeted applications of CS and ML.
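The triple modular redundancy mentioned in the abstract above can be illustrated with a simple majority vote over three replica outputs. The Python sketch below assumes the replicas are three classifiers that each emit a class label. The tie-breaking rule (fall back to the first replica when all three disagree) is an assumption made for illustration, not something the abstract specifies, and the function name is hypothetical.

```python
from collections import Counter


def tmr_vote(label_a, label_b, label_c):
    """Majority vote over three replica predictions.

    Assumption: when all three replicas disagree there is no majority,
    so the first replica's label is returned as a fallback.
    """
    label, count = Counter([label_a, label_b, label_c]).most_common(1)[0]
    return label if count >= 2 else label_a
```

A single faulty or mistaken replica is outvoted by the other two, which is how three small networks can stand in for one larger one at the cost of some accuracy.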