180 research outputs found

    Computing with Spintronics: Circuits and architectures

    Get PDF
    This thesis makes the following contributions towards the design of computing platforms with spintronic devices. 1) It explores the use of spintronic memories in the design of a domain-specific processor for an emerging class of data-intensive applications, namely recognition, mining and synthesis (RMS). Two different spintronic memory technologies — Domain Wall Memory (DWM) and STT-MRAM — are utilized to realize the different levels in the memory hierarchy of the domain-specific processor, based on their respective access characteristics. Architectural tradeoffs created by the use of spintronic memories are analyzed. The proposed design achieves 1.5X-4X improvements in energy-delay product compared to a CMOS baseline. 2) It describes the first attempt to use DWM in the cache hierarchy of general-purpose processors. DWM promises unparalleled density by packing several bits of data into each bit-cell. TapeCache, the proposed DWM-based cache architecture, utilizes suitable circuit and architectural optimizations to address two key challenges (i) the high energy and latency requirement of write operations and (ii) the need for shift operations to access the data stored in each DWM bit-cell. At the circuit level, DWM bit-cells that are tailored to the distinct design requirements of different levels in the cache hierarchy are proposed. At the architecture level, TapeCache proposes suitable cache organization and management policies to alleviate the performance impact of shift operations required to access data stored in DWM bit-cells. TapeCache achieves more than 7X improvements in both cache area and energy with virtually identical performance compared to an SRAM-based cache hierarchy. 3) It investigates the design of the on-chip memory hierarchy of general-purpose graphics processing units (GPGPUs)—massively parallel processors that are optimized for data-intensive high-throughput workloads—using DWM. STAG, a high density, energy-efficient Spintronic- Tape Architecture for GPGPU cache hierarchies is described. STAG utilizes different DWM bit-cells to realize different memory arrays in the GPGPU cache hierarchy. To address the challenge of high access latencies due to shifts, STAG predicts upcoming cache accesses by leveraging unique characteristics of GPGPU architectures and workloads, and prefetches data that are both likely to be accessed and require large numbers of shift operations. STAG achieves 3.3X energy reduction and 12.1% performance improvement over CMOS SRAM under iso-area conditions. 4) While the potential of spintronic devices for memories is widely recognized, their utility in realizing logic is much less clear. The thesis presents Spintastic, a new paradigm that utilizes Stochastic Computing (SC) to realize spintronic logic. In SC, data is encoded in the form of pseudo-random bitstreams, such that the probability of a \u271\u27 in a bitstream corresponds to the numerical value that it represents. SC can enable compact, low-complexity logic implementations of various arithmetic functions. Spintastic establishes the synergy between stochastic computing and spin-based logic by demonstrating that they mutually alleviate each other\u27s limitations. On the one hand, various building blocks of SC, which incur significant overheads in CMOS implementations, can be efficiently realized by exploiting the physical characteristics of spin devices. On the other hand, the reduced logic complexity and low logic depth of SC circuits alleviates the shortcomings of spintronic logic. Based on this insight, the design of spin-based stochastic arithmetic circuits, bitstream generators, bitstream permuters and stochastic-to-binary converter circuits are presented. Spintastic achieves 7.1X energy reduction over CMOS implementations for a wide range of benchmarks from the image processing, signal processing, and RMS application domains. 5) In order to evaluate the proposed spintronic designs, the thesis describes various device-to-architecture modeling frameworks. Starting with devices models that are calibrated to measurements, the characteristics of spintronic devices are successively abstracted into circuit-level and architectural models, which are incorporated into suitable simulation frameworks. (Abstract shortened by UMI.

    In-memory computing with emerging memory devices: Status and outlook

    Get PDF
    Supporting data for "In-memory computing with emerging memory devices: status and outlook", submitted to APL Machine Learning

    Exploring Spin-transfer-torque devices and memristors for logic and memory applications

    Get PDF
    As scaling CMOS devices is approaching its physical limits, researchers have begun exploring newer devices and architectures to replace CMOS. Due to their non-volatility and high density, Spin Transfer Torque (STT) devices are among the most prominent candidates for logic and memory applications. In this research, we first considered a new logic style called All Spin Logic (ASL). Despite its advantages, ASL consumes a large amount of static power; thus, several optimizations can be performed to address this issue. We developed a systematic methodology to perform the optimizations to ensure stable operation of ASL. Second, we investigated reliable design of STT-MRAM bit-cells and addressed the conflicting read and write requirements, which results in overdesign of the bit-cells. Further, a Device/Circuit/Architecture co-design framework was developed to optimize the STT-MRAM devices by exploring the design space through jointly considering yield enhancement techniques at different levels of abstraction. Recent advancements in the development of memristive devices have opened new opportunities for hardware implementation of non-Boolean computing. To this end, the suitability of memristive devices for swarm intelligence algorithms has enabled researchers to solve a maze in hardware. In this research, we utilized swarm intelligence of memristive networks to perform image edge detection. First, we proposed a hardware-friendly algorithm for image edge detection based on ant colony. Next, we designed the image edge detection algorithm using memristive networks

    Energy efficient hybrid computing systems using spin devices

    Get PDF
    Emerging spin-devices like magnetic tunnel junctions (MTJ\u27s), spin-valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy-efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin-devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. Analog characteristic of spin current facilitate non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at ultra-low terminal voltage of ∼20mV, thereby resulting in small computation power. Moreover, since nano-magnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin based neurons can be integrated with CMOS and other emerging devices leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve `mixed-mode\u27 processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital as well as analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and currentmode on-chip global interconnects. Simulation results for these applications based on device-circuit co-simulation framework predict more than ∼100x improvement in computation energy as compared to state of the art CMOS design, for optimal spin-device parameters

    Error Characterization and Correction Techniques for Reliable STT-RAM Designs

    Get PDF
    The concerns on the continuous scaling of mainstream memory technologies have motivated tremendous investment to emerging memories. Being a promising candidate, spin-transfer torque random access memory (STT-RAM) offers nanosecond access time comparable to SRAM, high integration density close to DRAM, non-volatility as Flash memory, and good scalability. It is well positioned as the replacement of SRAM and DRAM for on-chip cache and main memory applications. However, reliability issue continues being one of the major challenges in STT-RAM memory designs due to the process variations and unique thermal fluctuations, i.e., the stochastic resistance switching property of magnetic devices. In this dissertation, I decoupled the reliability issues as following three-folds: First, the characterization of STT-RAM operation errors often require expensive Monte-Carlo runs with hybrid magnetic-CMOS simulation steps, making it impracticable for architects and system designs; Second, the state of the art does not have sufficiently understanding on the unique reliability issue of STT-RAM, and conventional error correction codes (ECCs) cannot efficiently handle such errors; Third, while the information density of STT-RAM can be boosted by multi-level cell (MLC) design, the more prominent reliability concerns and the complicated access mechanism greatly limit its applications in memory subsystems. Thus, I present a novel through solution set to both characterize and tackle the above reliability challenges in STT-RAM designs. In the first part of the dissertation, I introduce a new characterization method that can accurately and efficiently capture the multi-variable design metrics of STT-RAM cells; Second, a novel ECC scheme, namely, content-dependent ECC (CD-ECC), is developed to combat the characterized asymmetric errors of STT-RAM at 0->1 and 1->0 bit flipping's; Third, I present a circuit-architecture design, namely state-restricted multi-level cell (SR-MLC) STT-RAM design, which simultaneously achieves high information density, good storage reliability and fast write speed, making MLC STT-RAM accessible for system designers under current technology node. Finally, I conclude that efficient robust (or ECC) designs for STT-RAM require a deep holistic understanding on three different levels-device, circuit and architecture. Innovative ECC schemes and their architectural applications, still deserve serious research and investigation in the near future

    Integrated Circuits/Microchips

    Get PDF
    With the world marching inexorably towards the fourth industrial revolution (IR 4.0), one is now embracing lives with artificial intelligence (AI), the Internet of Things (IoTs), virtual reality (VR) and 5G technology. Wherever we are, whatever we are doing, there are electronic devices that we rely indispensably on. While some of these technologies, such as those fueled with smart, autonomous systems, are seemingly precocious; others have existed for quite a while. These devices range from simple home appliances, entertainment media to complex aeronautical instruments. Clearly, the daily lives of mankind today are interwoven seamlessly with electronics. Surprising as it may seem, the cornerstone that empowers these electronic devices is nothing more than a mere diminutive semiconductor cube block. More colloquially referred to as the Very-Large-Scale-Integration (VLSI) chip or an integrated circuit (IC) chip or simply a microchip, this semiconductor cube block, approximately the size of a grain of rice, is composed of millions to billions of transistors. The transistors are interconnected in such a way that allows electrical circuitries for certain applications to be realized. Some of these chips serve specific permanent applications and are known as Application Specific Integrated Circuits (ASICS); while, others are computing processors which could be programmed for diverse applications. The computer processor, together with its supporting hardware and user interfaces, is known as an embedded system.In this book, a variety of topics related to microchips are extensively illustrated. The topics encompass the physics of the microchip device, as well as its design methods and applications

    Reconfigurable three-terminal logic devices using phase-change materials

    Get PDF
    Conventional solid-state and mass storage memories (such as SRAM, DRAM and the hard disk drive HDD) are facing many technological challenges to meet the ever-increasing demand for fast, low power and cheap data storage solutions. This is compounded by the current conventional computer architectures (such as the von Neumann architecture) with separate processing and storage functionalities and hence data transfer bottlenecks and increased silicon footprint. Beyond the von Neumann computer architecture, the combination of arithmetic-logic processing and (collocally) storage circuits provide a new and promising alternative for computer systems that overcome the many limitations of current technology. However, there are many technical challenges that face the implementation of universal blocks of both logic and memory functions using conventional silicon technology (transistor-transistor logic - TTL, and complementary metal oxide semiconductors - CMOS). Phase-change materials, such as Ge2Sb2Te5 (GST), provide a potential complement or replacement to these technologies to provide both processing and, collocally, storage capability. Existing research in phase-change memory technologies focused on two-terminal non-volatile devices for different memory and logic applications due to their ability to achieve logic-resistive switching in nanosecond time scale, their scalability down to few nanometer-scale cells, and low power requirements. To perform logic functionality, current two-terminal phase-change logic devices need to be connected in series or parallel circuits, and require sequential inputs to perform the required logic function (such as NAND and NOR). In this research programme, three-terminal (3T) non-volatile phase-change memories are proposed and investigated as potential alternative logic cells with simultaneous inputs as reconfigurable, non-volatile logic devices. A vertical 3T logic device structure is proposed in this work based on existing phase-change based memory cell architecture and original concept work by Ovshinsky. A comprehensive, multi-physics finite-element model of the vertical 3T device was constructed in Comsol Multiphysics. This model solves Laplace's equation for the electric potential due to the application of voltage sources. The calculated electric potential and fields provide the Joule heating source in the device, which is used to compute the temperature distribution through solution of the heat diffusion equation, which is necessary to activate the thermally-driven phase transition process. The physically realistic and computationally efficient nucleation- growth model was numerically implemented to model the phase change and resistance change in the Ge2Sb2Te5 (GST) phase-change material in the device, which is combined with the finite- element model using the Matlab programming interface. The changes in electrical and thermal conductivities in the GST region are taken into account following the thermally activated phase transformations between the amorphous-crystalline states using effective medium theory. To determine the appropriate voltage and temperature conditions for the SET and RESET operations, and to optimise the materials and thicknesses of the thermal and heating layers in the device, comprehensive steady-state parametric simulations were carried out using the finite-element multi-physics model. Simulations of transient cycles of writing (SET) and erasing (RESET) processes using appropriate voltage pulses were then carried out on the designed vertical 3T device to study the phase transformations for practical reconfigurable logic operations. The simulations indicated excellent resistance contrast between the logic 1 and 0 states, and successfully demonstrated the feasibility of programming the logic functions of NAND and NOR gates using this 3T configuration

    High-Density Solid-State Memory Devices and Technologies

    Get PDF
    This Special Issue aims to examine high-density solid-state memory devices and technologies from various standpoints in an attempt to foster their continuous success in the future. Considering that broadening of the range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all the most relevant solid-state memory devices and technologies currently on stage. Even the subjects dealt with in this Special Issue are widespread, ranging from process and design issues/innovations to the experimental and theoretical analysis of the operation and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories to pursue new computing paradigms
    • …
    corecore