Search CORE

31 research outputs found

Towards Energy-Efficient and Reliable Computing: From Highly-Scaled CMOS Devices to Resistive Memories

Author: Salehi Mobarakeh Soheil
Publication venue: University of Central Florida
Publication date: 01/01/2016
Field of study

The continuous increase in transistor density based on Moore\u27s Law has led us to highly scaled Complementary Metal-Oxide Semiconductor (CMOS) technologies. These transistor-based process technologies offer improved density as well as a reduction in nominal supply voltage. An analysis regarding different aspects of 45nm and 15nm technologies, such as power consumption and cell area to compare these two technologies is proposed on an IEEE 754 Single Precision Floating-Point Unit implementation. Based on the results, using the 15nm technology offers 4-times less energy and 3-fold smaller footprint. New challenges also arise, such as relative proportion of leakage power in standby mode that can be addressed by post-CMOS technologies. Spin-Transfer Torque Random Access Memory (STT-MRAM) has been explored as a post-CMOS technology for embedded and data storage applications seeking non-volatility, near-zero standby energy, and high density. Towards attaining these objectives for practical implementations, various techniques to mitigate the specific reliability challenges associated with STT-MRAM elements are surveyed, classified, and assessed herein. Cost and suitability metrics assessed include the area of nanomagmetic and CMOS components per bit, access time and complexity, Sense Margin (SM), and energy or power consumption costs versus resiliency benefits. In an attempt to further improve the Process Variation (PV) immunity of the Sense Amplifiers (SAs), a new SA has been introduced called Adaptive Sense Amplifier (ASA). ASA can benefit from low Bit Error Rate (BER) and low Energy Delay Product (EDP) by combining the properties of two of the commonly used SAs, Pre-Charge Sense Amplifier (PCSA) and Separated Pre-Charge Sense Amplifier (SPCSA). ASA can operate in either PCSA or SPCSA mode based on the requirements of the circuit such as energy efficiency or reliability. Then, ASA is utilized to propose a novel approach to actually leverage the PV in Non-Volatile Memory (NVM) arrays using Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Architecting Memory Systems for Emerging Technologies

Author: Oh Byoungchan
Publication venue
Publication date: 01/01/2018
Field of study

The advance of traditional dynamic random access memory (DRAM) technology has slowed down, while the capacity and performance needs of memory system have continued to increase. This is a result of increasing data volume from emerging applications, such as machine learning and big data analytics. In addition to such demands, increasing energy consumption is becoming a major constraint on the capabilities of computer systems. As a result, emerging non-volatile memories, for example, Spin Torque Transfer Magnetic RAM (STT-MRAM), and new memory interfaces, for example, High Bandwidth Memory (HBM), have been developed as an alternative. Thus far, most previous studies have retained a DRAM-like memory architecture and management policy. This preserves compatibility but hides the true benefits of those new memory technologies. In this research, we proposed the co-design of memory architectures and their management policies for emerging technologies. First, we introduced a new memory architecture for an STT-MRAM main memory. In particular, we defined a new page mode operation for efficient activation and sensing. By fully exploiting the non-destructive nature of STT- MRAM, our design achieved higher performance, lower energy consumption, and a smaller area than the traditional designs. Second, we developed a cost-effective technique to improve load balancing for HBM memory channels. We showed that the proposed technique was capable of efficiently redistributing memory requests across multiple memory channels to improve the channel utilization, resulting in improved performance.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/145988/1/bcoh_1.pd

Deep Blue Documents at the University of Michigan

Energy-Aware Data Movement In Non-Volatile Memory Hierarchies

Author: Najafabadi Navid Khoshavi
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2017
Field of study

While technology scaling enables increased density for memory cells, the intrinsic high leakage power of conventional CMOS technology and the demand for reduced energy consumption inspires the use of emerging technology alternatives such as eDRAM and Non-Volatile Memory (NVM) including STT-MRAM, PCM, and RRAM. The utilization of emerging technology in Last Level Cache (LLC) designs which occupies a signifcant fraction of total die area in Chip Multi Processors (CMPs) introduces new dimensions of vulnerability, energy consumption, and performance delivery. To be specific, a part of this research focuses on eDRAM Bit Upset Vulnerability Factor (BUVF) to assess vulnerable portion of the eDRAM refresh cycle where the critical charge varies depending on the write voltage, storage and bit-line capacitance. This dissertation broaden the study on vulnerability assessment of LLC through investigating the impact of Process Variations (PV) on narrow resistive sensing margins in high-density NVM arrays, including on-chip cache and primary memory. Large-latency and power-hungry Sense Amplifers (SAs) have been adapted to combat PV in the past. Herein, a novel approach is proposed to leverage the PV in NVM arrays using Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time. On the other hand, this dissertation investigates a novel technique to prioritize the service to 1) Extensive Read Reused Accessed blocks of the LLC that are silently dropped from higher levels of cache, and 2) the portion of the working set that may exhibit distant re-reference interval in L2. In particular, we develop a lightweight Multi-level Access History Profiler to effciently identify ERRA blocks through aggregating the LLC block addresses tagged with identical Most Signifcant Bits into a single entry. Experimental results indicate that the proposed technique can reduce the L2 read miss ratio by 51.7% on average across PARSEC and SPEC2006 workloads. In addition, this dissertation will broaden and apply advancements in theories of subspace recovery to pioneer computationally-aware in-situ operand reconstruction via the novel Logic In Interconnect (LI2) scheme. LI2 will be developed, validated, and re?ned both theoretically and experimentally to realize a radically different approach to post-Moore\u27s Law computing by leveraging low-rank matrices features offering data reconstruction instead of fetching data from main memory to reduce energy/latency cost per data movement. We propose LI2 enhancement to attain high performance delivery in the post-Moore\u27s Law era through equipping the contemporary micro-architecture design with a customized memory controller which orchestrates the memory request for fetching low-rank matrices to customized Fine Grain Reconfigurable Accelerator (FGRA) for reconstruction while the other memory requests are serviced as before. The goal of LI2 is to conquer the high latency/energy required to traverse main memory arrays in the case of LLC miss, by using in-situ construction of the requested data dealing with low-rank matrices. Thus, LI2 exchanges a high volume of data transfers with a novel lightweight reconstruction method under specific conditions using a cross-layer hardware/algorithm approach

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Spin-Transfer-Torque (STT) Devices for On-chip Memory and Their Applications to Low-standby Power Systems

Author: Seo Yeongkyo
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2016
Field of study

With the scaling of CMOS technology, the proportion of the leakage power to total power consumption increases. Leakage may account for almost half of total power consumption in high performance processors. In order to reduce the leakage power, there is an increasing interest in using nonvolatile storage devices for memory applications. Among various promising nonvolatile memory elements, spin-transfer torque magnetic RAM (STT-MRAM) is identified as one of the most attractive alternatives to conventional SRAM. However, several design challenges of STT-MRAM such as shared read and write current paths, single-ended sensing, and high dynamic power are major challenges to be overcome to make it suitable for on-chip memories. To mitigate such problems, we propose a domain wall coupling based spin-transfer torque (DWCSTT) device for on-chip caches. Our proposed DWCSTT bit-cell decouples the read and the write current paths by the electrically-insulating magnetic coupling layer so that we can separately optimize read operation without having an impact on write-ability. In addition, the complementary polarizer structure in the read path of the DWCSTT device allows DWCSTT to enable self-referenced differential sensing. DWCSTT bit-cells improve the write power consumption due to the low electrical resistance of the write current path. Furthermore, we also present three different bit-cell level design techniques of Spin-Orbit Torque MRAM (SOT-MRAM) for alleviating some of the inefficiencies of conventional magnetic memories while maintaining the advantages of spin-orbit torque (SOT) based novel switching mechanism such as low write current requirement and decoupled read and write current path. Our proposed SOT-MRAM with supporting dual read/write ports (1R/1W) can address the issue of high-write latency of STT-MRAM by simultaneous 1R/1W accesses. Second, we propose a new type of SOT-MRAM which uses only one access transistor along with a Schottky diode in order to mitigate the area-overhead caused by two access transistors in conventional SOT-MRAM. Finally, a new design technique of SOT-MRAM is presented to improve the integration density by utilizing a shared bit-line structure

Purdue E-Pubs

STT-MRAM characterization and its test implications

Author: Radhakrishnan Govindakrishnan
Publication venue: 'University of Waterloo'
Publication date: 22/04/2020
Field of study

Spin torque transfer (STT)-magnetoresistive random-access memory (MRAM) has come a long way in research to meet the speed and power consumption requirements for future memory applications. The state-of-the-art STT-MRAM bit-cells employ magnetic tunnel junction (MTJ) with perpendicular magnetic anisotropy (PMA). The process repeatabil- ity and yield stability for wafer fabrication are some of the critical issues encountered in STT-MRAM mass production. Some of the yield improvement techniques to combat the e ect of process variations have been previously explored. However, little research has been done on defect oriented testing of STT-MRAM arrays. In this thesis, the author investi- gates the parameter deviation and non-idealities encountered during the development of a novel MTJ stack con guration. The characterization result provides motivation for the development of the design for testability (DFT) scheme that can help test and characterize STT-MRAM bit-cells and the CMOS peripheral circuitry e ciently. The primary factors for wafer yield degradation are the device parameter variation and its non-uniformity across the wafer due to the fabrication process non-idealities. There- fore, e ective in-process testing strategies for exploring and verifying the impact of the parameter variation on the wafer yield will be needed to achieve fabrication process opti- mization. While yield depends on the CMOS process variability, quality of the deposited MTJ lm, and other process non-idealities, test platform can enable parametric optimiza- tion and veri cation using the CMOS-based DFT circuits. In this work, we develop a DFT algorithm and implement a DFT circuit for parametric testing and prequali cation of the critical circuits in the CMOS wafer. The DFT circuit successfully replicates the electrical characteristics of MTJ devices and captures their spatial variation across the wafer with an error of less than 4%. We estimate the yield of the read sensing path by implement- ing the DFT circuit, which can replicate the resistance-area product variation up to 50% from its nominal value. The yield data from the read sensing path at di erent wafer loca- tions are analyzed, and a usable wafer radius has been estimated. Our DFT scheme can provide quantitative feedback based on in-die measurement, enabling fabrication process optimization through iterative estimation and veri cation of the calibrated parameters. Another concern that prevents mass production of STT-MRAM arrays is the defect formation in MTJ devices due to aging. Identifying manufacturing defects in the magnetic tunnel junction (MTJ) device is crucial for the yield and reliability of spin-torque-transfer (STT) magnetic random-access memory (MRAM) arrays. Several of the MTJ defects result in parametric deviations of the device that deteriorate over time. We extend our work on the DFT scheme by monitoring the electrical parameter deviations occurring due to the defect formation over time. A programmable DFT scheme was implemented for a sub-arrayin 65 nm CMOS technology to evaluate the feasibility of the test scheme. The scheme utilizes the read sense path to compare the bit-cell electrical parameters against known DFT cells characteristics. Built-in-self-test (BIST) methodology is utilized to trigger the onset of the fault once the device parameter crosses a threshold value. We demonstrate the operation and evaluate the accuracy of detection with the proposed scheme. The DFT scheme can be exploited for monitoring aging defects, modeling their behavior and optimization of the fabrication process. DFT scheme could potentially nd numerous applications for parametric characteriza- tion and fault monitoring of STT-MRAM bit-cell arrays during mass production. Some of the applications include a) Fabrication process feedback to improve wafer turnaround time, b) STT-MRAM bit-cell health monitoring, c) Decoupled characterization of the CMOS pe- ripheral circuitry such as read-sensing path and sense ampli er characterization within the STT-MRAM array. Additionally, the DFT scheme has potential applications for detec- tion of fault formation that could be utilized for deploying redundancy schemes, providing a graceful degradation in MTJ-based bit-cell array due to aging of the device, and also providing feedback to improve the fabrication process and yield learning

University of Waterloo's Institutional Repository

High-Density Solid-State Memory Devices and Technologies

Author
Publication venue: 'MDPI AG'
Publication date: 06/05/2022
Field of study

This Special Issue aims to examine high-density solid-state memory devices and technologies from various standpoints in an attempt to foster their continuous success in the future. Considering that broadening of the range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all the most relevant solid-state memory devices and technologies currently on stage. Even the subjects dealt with in this Special Issue are widespread, ranging from process and design issues/innovations to the experimental and theoretical analysis of the operation and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories to pursue new computing paradigms

Directory of Open Access Books (DOAB)

Domain walls in spin-valve nanotracks: characterisation and applications

Author: Sampaio Joao Miguel Ramos Melo
Sampaio Joao Miguel Ramos Melo
Publication venue: Physics, Imperial College London
Publication date: 01/11/2011
Field of study

Magnetic systems based on the manipulation of domain walls (DWs) in nanometre-scaled tracks have been shown to store data at high density, perform complex logic operations, and even mechanically manipulate magnetic beads. The magnetic nano-track has also been an indispensable model system to study fundamental magnetic and magneto-electronic phenomena, such as field induced DW propagation, spin-transfer torque, and other micromagnetic properties. Its value to fundamental research and the breath of potentially useful applications have made this class of systems the focus of wide research in the area of nanomagnetism and spintronics. This thesis focuses on DW manipulation and DW-based devices in spin-valve nanotracks. The spin-valve is a metallic multi-layered spintronic structure, wherein the electrical resistance varies greatly with the magnetisation of its layers. In comparison to monolayer tracks, the spin-valve track enables more sensitive and versatile measurements, as well as demonstrating electronic output of DW-based devices, an achievement of crucial interest to technological applications. However, these multi-layered tracks introduce new, potentially disruptive magnetic interactions, as well as fabrication challenges. In this thesis, the DW propagation in spin-valve nanotracks of different compositions was studied, and a system with DW propagation properties comparable to the state-of-the-art in monolayer tracks was demonstrated, down to an unprecedented lateral size of 33nm. Several DW logic devices of variable complexity were demonstrated and studied, namely a turn-counting DW spiral, a DW gate, multiple DW logic NOT gates, and a DW-DW interactor. It was found that, where the comparison was possible, the overall magnetic behaviour of these devices was analogous to that of monolayer structures, and the device performance, as defined by the range of field wherein they function desirably, was found to be comparable, albeit inferior, to that of their monolayer counterparts. The interaction between DWs in adjacent tracks was studied and new phenomena were observed and characterised, such as DW depinning induced by a static or travelling adjacent DW. The contribution of different physical mechanisms to electrical current induced depinning were quantified, and it was found that the Oersted field, typically negligible in monolayer tracks, was responsible for large variations in depinning field in SV tracks, and that the strength of spin-transfer effect was similar in magnitude to that reported in monolayer tracks. Finally, current induced ferromagnetic resonance was measured, and the domain uniform resonant mode was observed, in very good agreement to Kittel theory and simulations

Spiral - Imperial College Digital Repository

Cellular magnetic devices : using magnetostatically interacting elements for logic, memory, and communication

Author: Mourik van, R.A.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2015
Field of study

Repository TU/e

Pure OAI Repository