42 research outputs found

    A 130nm 1Mb Embedded Phase Change Memory with 500kb/s Single Channel Write Throughput

    Get PDF
    Abstract A 130 nm 1Mb embedded phase change memory (PCM) has been achieved, requiring only three additional masks for phase change storage element, featuring 500 kb/s single channel write throughput and > 10 8 endurance. The prepare process has been optimized to reduce the cost and power. An 80 nm heat electrode has been prepared with 130 nm process. The optimal Read/Write circuit module is designed to realize the load/store function for PCM. The critical operation parameter is Reset/70 ns/2.5 mA and Set/1500 ns/1 mA, which means that the signal channel write throughput arrives to 500 kb/s

    Multi-port Memory Design for Advanced Computer Architectures

    Get PDF
    In this thesis, we describe and evaluate novel memory designs for multi-port on-chip and off-chip use in advanced computer architectures. We focus on combining multi-porting and evaluating the performance over a range of design parameters. Multi-porting is essential for caches and shared-data systems, especially multi-core System-on-chips (SOC). It can significantly increase the memory access throughput. We evaluate FinFET voltage-mode multi-port SRAM cells using different metrics including leakage current, static noise margin and read/write performance. Simulation results show that single-ended multi-port FinFET SRAMs with isolated read ports offer improved read stability and flexibility over classical double-ended structures at the expense of write performance. By increasing the size of the access transistors, we show that the single-ended multi-port structures can achieve equivalent write performance to the classical double-ended multi-port structure for 9% area overhead. Moreover, compared with CMOS SRAM, FinFET SRAM has better stability and standby power. We also describe new methods for the design of FinFET current-mode multi-port SRAM cells. Current-mode SRAMs avoid the full-swing of the bitline, reducing dynamic power and access time. However, that comes at the cost of voltage drop, which compromises stability. The design proposed in this thesis utilizes the feature of Independent Gate (IG) mode FinFET, which can leverage threshold voltage by controlling the back gate voltage, to merge two transistors into one through high-Vt and low-Vt transistors. This design not only reduces the voltage drop, but it also reduces the area in multi-port current-mode SRAM design. For off-chip memory, we propose a novel two-port 1-read, 1-write (1R1W) phasechange memory (PCM) cell, which significantly reduces the probability of blocking at the bank levels. Different from the traditional PCM cell, the access transistors are at the top and connected to the bitline. We use Verilog-A to model the behavior of Ge2Sb2Te5 (GST: the storage component). We evaluate the performance of the two-port cell by transistor sizing and voltage pumping. Simulation results show that pMOS transistor is more practical than nMOS transistor as the access device when both area and power are considered. The estimated area overhead is 1.7�, compared to single-port PCM cell. In brief, the contribution we make in this thesis is that we propose and evaluate three different kinds of multi-port memories that are favorable for advanced computer architectures

    Optimization of the Phase Change Random Access Memory Employing Phase Change Materials

    Get PDF
    Phase-change random access memory (PCRAM) is a semiconductor device based on phase change material (PCM). The SET speed is the bottleneck of limiting the speed of PCRAM. Extract the electrical parameters of the SET operation of the PCRAM test chip and analyze the process of the SET operations. It is found that adding a high and narrow pulse before a single pulse (SP) benefits the SET resistance reduction and the SET speed improvement. A dual pulses SET (D-SET) method is proposed and optimized. The mechanism of D-SET is that the first pulse forms a large optimum temperature field cover over all regions of the PCM material. When the first pulse is converted to the second pulse, the optimum temperature field shrinks and causes the amorphous regions to rapidly crystallize from the edge to the center. On the 40 nm PCRAM test chip, the SET time of D-SET method is under 300 ns. Compared with the conventional SET method such as SP and staircase down pulses (SCD), the D-SET method is optimal for SET performance such as SET resistance distribution, SET speed, and the anti-drift ability

    Letter from the Special Issue Editor

    Get PDF
    Editorial work for DEBULL on a special issue on data management on Storage Class Memory (SCM) technologies

    An Adaptive ECC Scheme for Runtime Write Failure Suppression of STT-RAM Cache

    Get PDF
    Spin-transfer torque random access memory (STT-RAM) features many attractive charac- teristics, including near-zero standby power, nanosecond access time, small footprint, etc. These properties make STT-RAM perfectly suitable for the applications that are subject to limited power and area budgets, i.e., on-chip cache. Write reliability is one of the major challenges in design of STT-RAM caches. To ensure design quality, error correction code (ECC) scheme is usually adopted in STT-RAM caches. However, it incurs significant hard- ware overhead. In observance of the dynamic error correcting requirements, in this work, we propose an adaptive ECC scheme to suppress the runtime write failures of STT-RAM cache with minimized hardware cost, in which the cache is partitioned into regions protected by different ECCs. The error rate of a data is speculated on-the-fly and the data is allocated to a partition that provides the needed error correcting capability. Moreover, to accom- modate the time-varying error correcting requirements of runtime data, the thresholds that determine data’s destination cache partition will be adaptively adjusted. Our experimental results show that compared to conventional ECC schemes, our scheme can save up to 80.2% ECC bit overhead with slightly degraded write reliability of the STT-RAM cache. Moreover, the detailed analysis shows that through simultaneous optimization in cache access patterns and reducing STT cell programming workload, our method outperforms conventional ECC design in power and energy consumptions

    Improving Phase Change Memory (PCM) and Spin-Torque-Transfer Magnetic-RAM (STT-MRAM) as Next-Generation Memories: A Circuit Perspective

    Get PDF
    In the memory hierarchy of computer systems, the traditional semiconductor memories Static RAM (SRAM) and Dynamic RAM (DRAM) have already served for several decades as cache and main memory. With technology scaling, they face increasingly intractable challenges like power, density, reliability and scalability. As a result, they become less appealing in the multi/many-core era with ever increasing size and memory-intensity of working sets. Recently, there is an increasing interest in using emerging non-volatile memory technologies in replacement of SRAM and DRAM, due to their advantages like non-volatility, high device density, near-zero cell leakage and resilience to soft errors. Among several new memory technologies, Phase Change Memory (PCM) and Spin-Torque-Transfer Magnetic-RAM (STT-MRAM) are most promising candidates in building main memory and cache, respectively. However, both of them possess unique limitations that preventing them from being effectively adopted. In this dissertation, I present my circuit design work on tackling the limitations of PCM and STT-MRAM. At bit level, both PCM and STT-MRAM suffer from excessive write energy, and PCM has very limited write endurance. For PCM, I implement Differential Write to remove large number of unnecessary bit-writes that do not alter the stored data. It is then extended to STT-MRAM as Early Write Termination, with specific optimizations to eliminate the overhead of pre-write read. At array level, PCM enjoys high density but could not provide competitive throughput due to its long write latency and limited number of read/write circuits. I propose a Pseudo-Multi-Port Bank design to exploit intra-bank parallelism by recycling and reusing shared peripheral circuits between accesses in a time-multiplexed manner. On the other hand, although STT-MRAM features satisfactory throughput, its conventional array architecture is constrained on density and scalability by the pitch of the per-column bitline pair. I propose a Common-Source-Line Array architecture which uses a shared source-line along the row, essentially leaving only one bitline per column. For these techniques, I provide circuit level analyses as well as architecture/system level and/or process/device level discussions. In addition, relevant background and work are thoroughly surveyed and potential future research topics are discussed, offering insights and prospects of these next-generation memories

    Architectural Techniques for Multi-Level Cell Phase Change Memory Based Main Memory

    Get PDF
    Phase change memory (PCM) recently has emerged as a promising technology to meet the fast growing demand for large capacity main memory in modern computing systems. Multi-level cell (MLC) PCM storing multiple bits in a single cell offers high density with low per-byte fabrication cost. However, PCM suffers from long write latency, short cell endurance, limited write throughput and high peak power, which makes it challenging to be integrated in the memory hierarchy. To address the long write latency, I propose write truncation to reduce the number of write iterations with the assistance of an extra error correction code (ECC). I also propose form switch (FS) to reduce the storage overhead of the ECC. By storing highly compressible lines in single level cell (SLC) form, FS improves read latency as well. To attack the short cell endurance and large peak power, I propose elastic RESET (ER) to construct triple-level cell PCM. By reducing RESET energy, ER significantly reduces peak power and prolongs PCM lifetime. To improve the write concurrency, I propose fine-grained write power budgeting (FPB) observing a global power budget and regulates power across write iterations according to the step-down power demand of each iteration. A global charge pump is also integrated onto a DIMM to boost power for hot PCM chips while staying within the global power budget. To further reduce the peak power, I propose intra-write RESET scheduling distributing cell RESET initializations in the whole write operation duration, so that the on-chip charge pump size can also be reduced
    corecore