59 research outputs found

    CEPRAM: Compression for Endurance in PCM RAM

    Get PDF
    We deal with the endurance problem of Phase Change Memories (PCM) by proposing Compression for Endurance in PCM RAM (CEPRAM), a technique to elongate the lifespan of PCM-based main memory through compression. We introduce a total of three compression schemes based on already existent schemes, but targeting compression for PCM-based systems. We do a two-level evaluation. First, we quantify the performance of the compression, in terms of compressed size, bit-flips and how they are affected by errors. Next, we simulate these parameters in a statistical simulator to study how they affect the endurance of the system. Our simulation results reveal that our technique, which is built on top of Error Correcting Pointers (ECP) but using a high-performance cache-oriented compression algorithm modified to better suit our purpose, manages to further extend the lifetime of the memory system. In particular, it guarantees that at least half of the physical pages are in usable condition for 25% longer than ECP, which is slightly more than 5% more than a scheme that can correct 16 failures per block

    An Adaptive ECC Scheme for Runtime Write Failure Suppression of STT-RAM Cache

    Get PDF
    Spin-transfer torque random access memory (STT-RAM) features many attractive charac- teristics, including near-zero standby power, nanosecond access time, small footprint, etc. These properties make STT-RAM perfectly suitable for the applications that are subject to limited power and area budgets, i.e., on-chip cache. Write reliability is one of the major challenges in design of STT-RAM caches. To ensure design quality, error correction code (ECC) scheme is usually adopted in STT-RAM caches. However, it incurs significant hard- ware overhead. In observance of the dynamic error correcting requirements, in this work, we propose an adaptive ECC scheme to suppress the runtime write failures of STT-RAM cache with minimized hardware cost, in which the cache is partitioned into regions protected by different ECCs. The error rate of a data is speculated on-the-fly and the data is allocated to a partition that provides the needed error correcting capability. Moreover, to accom- modate the time-varying error correcting requirements of runtime data, the thresholds that determine data’s destination cache partition will be adaptively adjusted. Our experimental results show that compared to conventional ECC schemes, our scheme can save up to 80.2% ECC bit overhead with slightly degraded write reliability of the STT-RAM cache. Moreover, the detailed analysis shows that through simultaneous optimization in cache access patterns and reducing STT cell programming workload, our method outperforms conventional ECC design in power and energy consumptions

    Efficient scrub mechanisms for error-prone emerging memories

    Get PDF
    Journal ArticleMany memory cell technologies are being considered as possible replacements for DRAM and Flash technologies, both of which are nearing their scaling limits. While these new cells (PCM, STT-RAM, FeRAM, etc.) promise high density, better scaling, and non-volatility, they introduce new challenges. Solutions at the architecture level can help address some of these problems; e.g., prior re-search has proposed wear-leveling and hard error tolerance mechanisms to overcome the limited write endurance of PCM cells. In this paper, we focus on the soft error problem in PCM, a topic that has received little attention in the architecture community. Soft errors in DRAM memories are typically addressed by having SECDED support and a scrub mechanism. The scrub mechanism scans the memory looking for a single-bit error and corrects it be-fore the line experiences a second uncorrectable error. However, PCM (and other emerging memories) are prone to new sources of soft errors. In particular, multi-level cell (MLC) PCM devices will suffer from resistance drift, that increases the soft error rate and incurs high overheads for the scrub mechanism. This paper is the first to study the design of architectural scrub mechanisms, especially when tailored to the drift phenomenon in MLC PCM. Many of our solutions will also apply to other soft-error prone emerging memories. We first show that scrub overheads can be reduced with support for strong ECC codes and a lightweight error detection operation. We then design different scrub algorithms that can adaptively trade-off soft and hard errors. Using an approach that combines all proposed solutions, our scrub mechanism yields a 96.5% reduction in uncorrectable errors, a 24.4 Ă— decrease in scrub-related writes, and a 37.8% reduction in scrub energy, relative to a basic scrub algorithm used in modern DRAM systems

    Constructing Reliable Super Dense Phase Change Memory under Write Disturbance

    Get PDF
    Phase Change Memory (PCM) has better scalability and smaller cell size comparing to DRAM. However, further scaling PCM cell in deep sub-micron regime results in significant thermal based write disturbance. Naively allocating large inter-cell space increases cell size from ideal 4F^2 to 12F^2. While a recent work mitigates write disturbance along word-lines through disturbance resilient data encoding, which can shrink PCM cell size from 12F^2 to 8F^2, it is ineffective for write disturbance along bit-lines, which is more severe due to widely adopted uTrench structure in constructing PCM cell arrays. In this thesis, we propose SD-PCM, an architecture to achieve reliable write operations in Super Dense PCM. In particular, we focus on mitigating write disturbance along bit-lines such that we can construct super dense PCM chips with 4F^2 cell size, i.e., the minimal for diode-switch based PCM. Based on simple verification-n-correction (VnC), we propose LazyCorrection and PreRead to effectively reduce VnC overhead and minimize cascading verification during write. We further propose (n:m)-Alloc for achieving good tradeoff between VnC overhead minimization and memory capacity loss. Our experimental results show that, comparing to a write disturbance-free low density PCM, SD-PCM achieves 80% capacity improvement in cell arrays while incurring around 0-10% performance degradation when using different (n:m) allocators

    Balancing reliability, cost, and performance tradeoffs with FreeFault

    Full text link
    Abstract—Memory errors have been a major source of system failures and fault rates may rise even further as memory continues to scale. This increasing fault rate, especially when combined with advent of integrated on-package memories, may exceed the capabilities of traditional fault tolerance mecha-nisms or significantly increase their overhead. In this paper, we present FreeFault as a hardware-only, transparent, and nearly-free resilience mechanism that is implemented entirely within a processor and can tolerate the majority of DRAM faults. FreeFault repurposes portions of the last-level cache for storing retired memory regions and augments a hardware memory scrubber to monitor memory health and aid retirement decisions. Because it relies on existing structures (cache associativity) for retirement/remapping type repair, FreeFault has essentially no hardware overhead. Because it requires a very modest portion of the cache (as small as 8KB) to cover a large fraction of DRAM faults, FreeFault has almost no impact on performance. We explain how FreeFault adds an attractive layer in an overall resilience scheme of highly-reliable and highly-available systems by delaying, and even entirely avoiding, calling upon software to make tradeoff decisions between memory capacity, performance, and reliability. I

    Architectural Techniques for Multi-Level Cell Phase Change Memory Based Main Memory

    Get PDF
    Phase change memory (PCM) recently has emerged as a promising technology to meet the fast growing demand for large capacity main memory in modern computing systems. Multi-level cell (MLC) PCM storing multiple bits in a single cell offers high density with low per-byte fabrication cost. However, PCM suffers from long write latency, short cell endurance, limited write throughput and high peak power, which makes it challenging to be integrated in the memory hierarchy. To address the long write latency, I propose write truncation to reduce the number of write iterations with the assistance of an extra error correction code (ECC). I also propose form switch (FS) to reduce the storage overhead of the ECC. By storing highly compressible lines in single level cell (SLC) form, FS improves read latency as well. To attack the short cell endurance and large peak power, I propose elastic RESET (ER) to construct triple-level cell PCM. By reducing RESET energy, ER significantly reduces peak power and prolongs PCM lifetime. To improve the write concurrency, I propose fine-grained write power budgeting (FPB) observing a global power budget and regulates power across write iterations according to the step-down power demand of each iteration. A global charge pump is also integrated onto a DIMM to boost power for hot PCM chips while staying within the global power budget. To further reduce the peak power, I propose intra-write RESET scheduling distributing cell RESET initializations in the whole write operation duration, so that the on-chip charge pump size can also be reduced

    L2C2: Last-level compressed-contents non-volatile cache and a procedure to forecast performance and lifetime

    Get PDF
    Several emerging non-volatile (NV) memory technologies are rising as interesting alternatives to build the Last-Level Cache (LLC). Their advantages, compared to SRAM memory, are higher density and lower static power, but write operations wear out the bitcells to the point of eventually losing their storage capacity. In this context, this paper presents a novel LLC organization designed to extend the lifetime of the NV data array and a procedure to forecast in detail the capacity and performance of such an NV-LLC over its lifetime. From a methodological point of view, although different approaches are used in the literature to analyze the degradation of an NV-LLC, none of them allows to study in detail its temporal evolution. In this sense, this work proposes a forecasting procedure that combines detailed simulation and prediction, allowing an accurate analysis of the impact of different cache control policies and mechanisms (replacement, wear-leveling, compression, etc.) on the temporal evolution of the indices of interest, such as the effective capacity of the NV-LLC or the system IPC. We also introduce L2C2, a LLC design intended for implementation in NV memory technology that combines fault tolerance, compression, and internal write wear leveling for the first time. Compression is not used to store more blocks and increase the hit rate, but to reduce the write rate and increase the lifetime during which the cache supports near-peak performance. In addition, to support byte loss without performance drop, L2C2 inherently allows N redundant bytes to be added to each cache entry. Thus, L2C2+N, the endurance-scaled version of L2C2, allows balancing the cost of redundant capacity with the benefit of longer lifetime. For instance, as a use case, we have implemented the L2C2 cache with STT-RAM technology. It has affordable hardware overheads compared to that of a baseline NV-LLC without compression in terms of area, latency and energy consumption, and increases up to 6-37 times the time in which 50% of the effective capacity is degraded, depending on the variability in the manufacturing process. Compared to L2C2, L2C2+6 which adds 6 bytes of redundant capacity per entry, that means 9.1% of storage overhead, can increase up to 1.4-4.3 times the time in which the system gets its initial peak performance degraded

    The Design of A High Capacity and Energy Efficient Phase Change Main Memory

    Get PDF
    Higher energy-efficiency has become essential in servers for a variety of reasons that range from heavy power and thermal constraints, environmental issues and financial savings. With main memory responsible for at least 30% of the energy consumed by a server, a low power main memory is fundamental to achieving this energy efficiency DRAM has been the technology of choice for main memory for the last three decades primarily because it traditionally combined relatively low power, high performance, low cost and high density. However, with DRAM nearing its density limit, alternative low-power memory technologies, such as Phase-change memory (PCM), have become a feasible replacement. PCM limitations, such as limited endurance and low write performance, preclude simple drop-in replacement and require new architectures and algorithms to be developed. A PCM main memory architecture (PMMA) is introduced in this dissertation, utilizing both DRAM and PCM, to create an energy-efficient main memory that is able to replace a DRAM-only memory. PMMA utilizes a number of techniques and architectural changes to achieve a level of performance that is par with DRAM. PMMA achieves gains in energy-delay of up to 65%, with less than 5% of performance loss and extremely high energy gains. To address the other major shortcoming of PCM, namely limited endurance, a novel, low- overhead wear-leveling algorithm that builds on PMMA is proposed that increases the lifetime of PMMA to match the expected server lifetime so that both server and memory subsystems become obsolete at about the same time. We also study how to better use the excess capacity, traditionally available on PCM devices, to obtain the highest lifetime possible. We show that under specific endurance distributions, the naive choice does not achieve the highest lifetime. We devise rules that empower the designer to select algorithms and parameters to achieve higher lifetime or simplify the design knowing the impact on the lifetime. The techniques presented also apply to other storage class memories (SCM) memories that suffer from limited endurance
    • …
    corecore