20 research outputs found

    Bit-Write-Reducing and Error-Correcting Code Generation Methods for Non-Volatile Memories

    Waseda University diploma number: Shin 8124 (Waseda University)

    High-Performance Energy-Efficient and Reliable Design of Spin-Transfer Torque Magnetic Memory

    In this dissertation, new computing paradigms, architectures, and design philosophies are proposed and evaluated for adopting STT-MRAM technology as a highly reliable, energy-efficient, and fast memory. For this purpose, a novel cross-layer framework spanning from the cell level all the way up to the system and application level has been developed. In this framework, reliability issues are modeled accurately with appropriate fault models at different abstraction levels in order to analyze the overall failure rate of the entire memory and its Mean Time To Failure (MTTF), while also accounting for temperature and process-variation effects. Design-time, compile-time, and run-time solutions have been provided to address the challenges associated with STT-MRAM. The effectiveness of the proposed solutions is demonstrated in extensive experiments that show significant improvements over state-of-the-art solutions, i.e., lower-power, higher-performance, and more reliable STT-MRAM designs.
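
    As a rough illustration of how failure rates roll up into an array-level MTTF, the sketch below assumes independent, exponentially distributed cell failures and no error correction; the function name and the FIT values are hypothetical and far simpler than the fault models used in the dissertation.

# Minimal sketch: aggregate a per-cell failure rate into an array-level MTTF,
# assuming independent, exponentially distributed cell failures and no ECC.
# Illustrative only; the dissertation's cross-layer fault models (temperature,
# process variation, multiple abstraction levels) are far more detailed.

def array_mttf_hours(cell_fit: float, num_cells: int) -> float:
    """cell_fit is in FIT (failures per 10^9 device-hours); any cell failure fails the array."""
    array_fit = cell_fit * num_cells   # failure rates add for independent components
    return 1e9 / array_fit             # MTTF = 1 / lambda

# Example with hypothetical numbers: a 1 Gbit array at 0.0001 FIT per cell.
print(f"{array_mttf_hours(cell_fit=1e-4, num_cells=2**30):.1f} hours")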

    Coset Coding to Extend the Lifetime of Non-Volatile Memory

    Modern computing systems increasingly integrate both Phase Change Memory (PCM) and Flash memory technologies, yet the lifetime of these technologies is limited by the number of times their cells can be written. Due to this limited lifetime, PCM and Flash may wear out before other parts of the system. The objective of this dissertation is to increase the lifetime of memory locations composed of either PCM or Flash cells using coset coding. For PCM, we extend memory lifetime by using coset coding to reduce the number of bit-flips per write compared to un-coded writes. Because the Flash program/erase cycle degrades page lifetime, we extend the lifetime of Flash memory cells by using coset coding to re-program a page multiple times without erasing. We then show how coset coding can be integrated into Flash solid state drives. We ran simulations to evaluate the effectiveness of using coset coding to extend PCM and Flash lifetime. In our simulations of writes to PCM, coset coding increased PCM lifetime by up to 3x over writing un-coded data directly to the memory location. We also extended the lifetime of Flash by using coset coding to re-write pages without an intervening erase, and were able to re-write a single Flash page more times than when writing un-coded data or using prior coding work at the same area overhead. Finally, we found that using coset coding in a Flash SSD results in higher lifetime for a given area overhead compared to un-coded writes.
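
    To make the bit-flip-reduction idea concrete, here is a minimal sketch using the simplest possible coset, a data word and its complement selected by one flag bit (a Flip-N-Write-style scheme); the dissertation's coset codes are considerably richer, and all names below are illustrative.

# Minimal sketch of bit-flip reduction with the simplest possible coset:
# each data word may be stored either as-is or complemented (one flag bit),
# and we pick whichever representative differs from the currently stored
# word in fewer bit positions. Illustrative only; the dissertation's coset
# codes are more sophisticated than this.

WORD_BITS = 8

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def encode(data: int, stored: int) -> tuple[int, int]:
    """Return (codeword, flag): flag=1 means the complement is stored."""
    comp = data ^ ((1 << WORD_BITS) - 1)
    if hamming(comp, stored) < hamming(data, stored):
        return comp, 1
    return data, 0

def decode(codeword: int, flag: int) -> int:
    return codeword ^ ((1 << WORD_BITS) - 1) if flag else codeword

stored = 0b11110000
data = 0b00001011                      # 7 flips if written directly
codeword, flag = encode(data, stored)  # the complement needs only 1 flip
assert decode(codeword, flag) == data
print(f"flips: direct={hamming(data, stored)}, coded={hamming(codeword, stored)}")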

    Coding for Phase Change Memory Performance Optimization

    Over the past several decades, memory technologies have exploited the continual scaling of CMOS to drastically improve performance and cost. Unfortunately, charge-based memories become unreliable beyond 20 nm feature sizes. A promising alternative is Phase Change Memory (PCM), which leverages scalable resistive thermal mechanisms. To realize PCM's potential, a number of challenges, including limited wear endurance and costly writes, need to be addressed. This thesis introduces novel methodologies for encoding data on PCM that exploit asymmetries in read/write performance to minimize the memory's wear and energy consumption. First, we map the problem to a distance-based graph clustering problem and prove it is NP-hard. Next, we propose two different approaches: an optimal solution based on Integer Linear Programming, and an approximately optimal solution based on Dynamic Programming. Our methods target both single-level and multi-level cell PCM and provide further optimizations for stochastically distributed data. We devise a low-overhead hardware architecture for the encoder. Evaluations demonstrate significant performance gains from our framework.
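
    The sketch below illustrates the kind of asymmetry-aware choice such an encoder makes: given several candidate codewords that decode to the same data, pick the one whose transition cost from the currently stored word is lowest, with different costs for 0→1 and 1→0 transitions. The cost values and the candidate set are hypothetical, not the thesis's actual code construction or clustering.

# Minimal sketch of asymmetry-aware encoding: among candidate codewords that
# all decode to the same data (e.g., cluster representatives), choose the one
# with the lowest transition cost from the currently stored word, charging
# different costs for 0->1 and 1->0 transitions. Costs and candidates are
# illustrative, not the thesis's actual construction.

COST_SET = 1.0     # 0 -> 1 transition (hypothetical energy units)
COST_RESET = 2.5   # 1 -> 0 transition

def write_cost(old: int, new: int) -> float:
    set_bits = bin(~old & new).count("1")    # bits that go 0 -> 1
    reset_bits = bin(old & ~new).count("1")  # bits that go 1 -> 0
    return set_bits * COST_SET + reset_bits * COST_RESET

def cheapest(old: int, candidates: list[int]) -> int:
    return min(candidates, key=lambda c: write_cost(old, c))

old = 0b1010_1010
candidates = [0b1111_0000, 0b1010_1111, 0b0000_1111]
best = cheapest(old, candidates)
print(f"best={best:08b}, cost={write_cost(old, best)}")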

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. The book introduces the most prominent reliability concerns from today's point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, the focus of this book is to address the different reliability challenges across levels, starting from the physical level all the way up to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, process variation, temperature effects, and soft errors. It provides readers with the latest insights into novel cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can improve reliability through techniques that are proactively designed with respect to techniques at other layers; and explains run-time adaptation and concepts/means of self-organization in order to achieve error resiliency in complex, future many-core systems.

    Secure, Reliable, and Energy-efficient Phase Change Main Memory

    Recent trends in supercomputing, shared cloud computing, and “big data” applications are placing ever-greater demands on computing systems and their memories. Such applications can readily saturate memory bandwidth and often operate on a working set that exceeds the capacity of modern DRAM packages. Unfortunately, this demand is not matched by DRAM technology development. As Moore's Law slows and Dennard scaling stops, further density improvements in DRAM and the underlying semiconductor devices are difficult [1]. In anticipation of this limitation, researchers have pursued emerging memory technologies that promise higher density than conventional DRAM devices. One such technology, phase-change memory (PCM), is especially desirable due to its increased density relative to DRAM. However, this nascent memory has outstanding challenges to overcome before it is viable as a DRAM replacement. PCM devices have limited write endurance and can consume more energy than their DRAM counterparts, necessitating careful control of how, and how often, they are written. A second challenge is the non-volatile nature of PCM devices; many applications rely on the volatility of DRAM to protect security-critical applications and operating-system address space between accesses and power cycles. An obvious solution is to encrypt the memory, but the effective randomization of data is at odds with techniques that reduce writes to the underlying memory. This body of work presents three contributions that address all of these challenges simultaneously under the assumption that encryption is required. Using an encryption and encoding technique called CASTLE & TOWERs, PCM can be employed as main memory with up to a 30× improvement in device lifetime while opportunistically reducing dynamic energy. A second technique called MACE marries encoding and traditional error-correction schemes, providing up to a 2.6× improvement in device lifetime alongside a whole-lifetime energy evaluation framework to guide system design. Finally, an architecture called WINDU is presented which supports the application of encoding for an emerging encryption standard with an eye on energy efficiency. Together, these techniques advance the state of the art and offer a significant step toward the adoption of PCM as main memory.
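
    The following sketch only illustrates the tension the abstract describes between encryption and write reduction: with counter-mode-style encryption and a fresh counter per write, a one-bit plaintext change re-randomizes roughly half of the stored bits, defeating naive differential writes. The SHA-256-based toy keystream and all names are illustrative; this is not the CASTLE & TOWERs, MACE, or WINDU design.

# Minimal sketch of why encryption conflicts with bit-flip reduction: with a
# counter-mode-style cipher and a fresh counter on every write, re-encrypting
# a block after a one-bit plaintext change flips roughly half the stored bits.
# Toy keystream built from SHA-256; illustrative only.
import hashlib

def keystream(key: bytes, counter: int, n: int) -> bytes:
    return hashlib.sha256(key + counter.to_bytes(8, "little")).digest()[:n]

def encrypt(plaintext: bytes, key: bytes, counter: int) -> bytes:
    ks = keystream(key, counter, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))

def bit_flips(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

key = b"illustrative-key"
old_plain = bytes(32)                        # 32 zero bytes
new_plain = bytes([1]) + bytes(31)           # one-bit plaintext change
old_ct = encrypt(old_plain, key, counter=0)
new_ct = encrypt(new_plain, key, counter=1)  # fresh counter on every write

print("plaintext flips: ", bit_flips(old_plain, new_plain))   # 1
print("ciphertext flips:", bit_flips(old_ct, new_ct))         # ~128 of 256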

    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    This research addresses the design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied, and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 × 10^-6, three orders of magnitude better than non-fault-tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10^-6. The required hardware redundancy is approximately 15 times that of a non-fault-tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of the model development, an improved model is derived for NAND multiplexing. This model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and to produce more accurate results.
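
    As a rough illustration of the kind of yield modeling involved, the sketch below computes the probability that a block of devices is functional with no redundancy and with triple modular redundancy under a perfect voter, assuming independent failures; the dissertation's yield and cost models (including the NAND multiplexing model) are more detailed, and the block size here is hypothetical.

# Minimal sketch of a device-level yield model: probability that a block of
# n devices is fully functional, with no redundancy and with triple modular
# redundancy (TMR) under a perfect voter, assuming independent failures.
# Illustrative only; the dissertation's models are more detailed.

def yield_no_redundancy(p_fail: float, n_devices: int) -> float:
    return (1.0 - p_fail) ** n_devices

def yield_tmr(p_fail: float, n_devices: int) -> float:
    """Each block is replicated 3x; the block works if at least 2 replicas work."""
    q = yield_no_redundancy(p_fail, n_devices)   # probability one replica works
    return q**3 + 3 * q**2 * (1 - q)             # all 3, or exactly 2, replicas work

p = 3e-6          # device failure probability (value taken from the abstract)
n = 100_000       # devices per block (hypothetical)
print(f"no redundancy: {yield_no_redundancy(p, n):.3f}")
print(f"TMR:           {yield_tmr(p, n):.3f}")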

    Bit-Flip Aware Data Structures for Phase Change Memory

    Large, byte-addressable, low-cost, and fast non-volatile memories like Phase Change Memory are appearing in the marketplace. They have the capability to unify both memory and storage and allow us to rethink the present memory hierarchy. An important drawback of Phase Change Memory is its limited write endurance. In addition, Phase Change Memory shares with other Non-Volatile Random Access Memories an asymmetry in the energy costs of writes and reads. Making the best use of Non-Volatile Random Access Memories therefore means limiting the number of times a cell changes contents, called a bit-flip. While the future of main memory is still unknown, we should already start to create data structures for these memories in order to shape the coming era. This thesis investigates the creation of bit-flip aware data structures. It first considers general ways in which a data structure can save bit-flips through smart overwrites and by using the exclusive-or of pointers. It then shows how a simple content-dependent encoding can reduce bit-flips for web corpora, and how to build hash-based dictionary structures for Linear Hashing and Spiral Storage. Finally, the thesis presents Gray counters, close to bit-flip-optimal counters that even enable age-based wear leveling, with counters managed by the Non-Volatile Random Access Memories themselves instead of by the operating system.
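
    The Gray-counter idea can be illustrated directly: a counter stored in binary-reflected Gray code flips exactly one bit per increment, whereas a plain binary counter flips up to log2(N) bits. The sketch below shows only this core property; the thesis's counters additionally provide age-based wear leveling.

# Minimal sketch of the Gray-counter idea: storing a counter in binary-reflected
# Gray code means each increment flips exactly one stored bit, versus up to
# log2(N) flips for a plain binary counter. Illustrative only; the thesis's
# counters add age-based wear leveling on top of this.

def to_gray(n: int) -> int:
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def bit_flips(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

binary_flips = sum(bit_flips(i, i + 1) for i in range(15))
gray_flips = sum(bit_flips(to_gray(i), to_gray(i + 1)) for i in range(15))
print(f"flips for 15 increments: binary={binary_flips}, gray={gray_flips}")
assert all(from_gray(to_gray(i)) == i for i in range(16))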

    Co-designing reliability and performance for datacenter memory

    Memory is one of the key components that affect the reliability and performance of datacenter servers. Memory in today's servers is organized and shared in several ways to provide the most performant and efficient access to data: for example, cache hierarchies in multi-core chips to reduce access latency, non-uniform memory access (NUMA) in multi-socket servers to improve scalability, and disaggregation to increase memory capacity. In all of these organizations, hardware coherence protocols are used to maintain the consistency of the shared memory and to implicitly move data to the requesting cores. This thesis aims to provide fault tolerance against newer models of failure in the organization of memory in datacenter servers. While designing for improved reliability, the thesis explores solutions that can also enhance application performance; the solutions build on modern coherence protocols to achieve these properties. First, we observe that DRAM memory system failure rates have increased, demanding stronger forms of memory reliability. To combat this, the thesis proposes Dvé, a hardware-driven replication mechanism in which data blocks are replicated across two different memory controllers in a cache-coherent NUMA system. Data blocks are accompanied by a code with strong error detection capabilities so that when an error is detected, correction is performed using the replica. Dvé's organization offers two independent points of access to data, which enables (a) strong error correction that can recover from a range of faults affecting any of the components in the memory, and (b) higher performance by providing another, nearer point of memory access. Dvé's coherent replication keeps the replicas in sync for reliability and also provides coherent access to read replicas during fault-free operation for improved performance. Dvé can flexibly provide these benefits on demand at runtime. Next, we observe that the coherence protocol itself needs to be hardened against failures. Memory in datacenter servers is being disaggregated from the compute servers into dedicated memory servers, driven by standards like CXL. CXL specifies the coherence protocol semantics for compute servers to access and cache data from a shared region in the disaggregated memory. However, the CXL specification lacks the level of fault tolerance necessary to operate at an inter-server scale within the datacenter. Compute servers can fail or become unresponsive, and it is therefore important that the coherence protocol remain available in the presence of such failures. The thesis proposes Āpta, a CXL-based shared disaggregated memory system that keeps cached data consistent without compromising availability in the face of compute server failures. Āpta architects a high-performance, fault-tolerant, object-granular memory server that significantly improves performance for stateless function-as-a-service (FaaS) datacenter applications.
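
    A minimal sketch of the Dvé-style read path described above, assuming a simple detect-then-fall-back policy: each block is kept at two memory controllers together with a strong error-detecting code, and a read whose preferred copy fails detection is served from the replica. CRC-32 and the in-memory dictionaries stand in for real hardware, and all class and method names are illustrative.

# Minimal sketch of a replicated, error-detecting read path: each block is
# stored at two "memory controllers" with a checksum; a read that fails
# detection on the preferred copy is served from the replica. Illustrative
# only; not the actual Dvé hardware design.
import zlib

class ReplicatedMemory:
    def __init__(self):
        self.controllers = [{}, {}]            # block_id -> (data, checksum)

    def write(self, block_id: int, data: bytes) -> None:
        record = (data, zlib.crc32(data))
        for ctrl in self.controllers:          # coherent replication: both copies updated
            ctrl[block_id] = record

    def read(self, block_id: int, prefer: int = 0) -> bytes:
        order = [prefer, 1 - prefer]           # try the nearer controller first
        for idx in order:
            data, checksum = self.controllers[idx][block_id]
            if zlib.crc32(data) == checksum:   # error detection
                return data
        raise RuntimeError("both replicas corrupted")

mem = ReplicatedMemory()
mem.write(7, b"cache line contents")
# Inject a fault into controller 0; the read transparently falls back to the replica.
data, crc = mem.controllers[0][7]
mem.controllers[0][7] = (b"X" + data[1:], crc)
assert mem.read(7, prefer=0) == b"cache line contents"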

    Using Rollback Avoidance to Mitigate Failures in Next-Generation Extreme-Scale Systems

    High-performance computing (HPC) systems enable scientists to numerically model complex phenomena in many important physical systems. The next major milestone in the development of HPC systems is the construction of the first supercomputer capable of executing more than an exaflop, 10^18 floating-point operations per second. On systems of this scale, failures will occur much more frequently than on current systems; as a result, resilience is a key obstacle to building next-generation extreme-scale systems. Coordinated checkpointing is currently the most widely used mechanism for handling failures on HPC systems. Although coordinated checkpointing remains effective on current systems, increasing the scale of today's systems to build next-generation systems will increase the cost of fault tolerance, as more and more time is taken away from the application to protect against or recover from failure. Rollback avoidance techniques seek to mitigate the cost of checkpoint/restart by allowing an application to continue its execution, rather than rolling back to an earlier checkpoint, when failures occur. These techniques include failure prediction and preventive migration, replicated computation, fault-tolerant algorithms, and software-based memory fault correction. In this thesis, I examine how rollback avoidance techniques can be used to address failures on extreme-scale systems. Using a combination of analytic modeling and simulation, I evaluate the potential impact of rollback avoidance on these systems. I then present a novel rollback avoidance technique that exploits similarities in application memory. Finally, I examine the feasibility of using this technique to protect against memory faults in kernel memory.
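
    As an example of the kind of analytic modeling referred to above, the sketch below uses Young's classic first-order approximation of the optimal checkpoint interval, tau_opt ≈ sqrt(2 × checkpoint_cost × MTBF), and the resulting overhead fraction; this is a standard textbook model with hypothetical parameter values, not the specific models developed in the thesis.

# Minimal sketch of checkpoint/restart cost modeling: Young's first-order
# approximation of the optimal checkpoint interval and the fraction of time
# lost to checkpointing plus recomputation. Standard textbook model with
# hypothetical parameters; not the thesis's specific models.
import math

def optimal_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

def overhead_fraction(interval_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    checkpointing = checkpoint_cost_s / interval_s   # time spent writing checkpoints
    rework = (interval_s / 2.0) / mtbf_s             # expected recomputation per failure
    return checkpointing + rework

mtbf = 4 * 3600.0   # 4-hour system MTBF (hypothetical)
ckpt = 300.0        # 5-minute checkpoint cost (hypothetical)
tau = optimal_interval(ckpt, mtbf)
print(f"optimal interval: {tau / 60:.1f} min, overhead ~{overhead_fraction(tau, ckpt, mtbf):.1%}")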