
    Variation Analysis, Fault Modeling and Yield Improvement of Emerging Spintronic Memories


    Self-Healing Cellular Automata to Correct Soft Errors in Defective Embedded Program Memories

    Static Random Access Memory (SRAM) cells in ultra-low-power Integrated Circuits (ICs) based on nanoscale Complementary Metal Oxide Semiconductor (CMOS) devices are likely to be the most vulnerable to large-scale soft errors. Conventional error correction circuits may not be able to handle the distributed nature of such errors and are themselves susceptible to soft errors. This thesis presents a distributed error correction circuit, the Self-Healing Cellular Automata (SHCA), that can repair itself. A possible way to deploy the SHCA in a system of SRAM-based embedded program memories (ePM) for one type of chip multi-processor is also discussed. The SHCA is compared with conventional error correction approaches, and its strengths and limitations are analyzed.
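
    The abstract does not spell out the SHCA's local repair rule, but the distributed idea can be illustrated with a toy sketch: each cell stores a triplicated copy of its bit and periodically rewrites all three copies with their majority value, so error scrubbing happens locally rather than in a centralized ECC decoder. Everything in the snippet (the triplication, the scrub rule, the error-injection rate) is an illustrative assumption, not the circuit proposed in the thesis.

```python
# Toy sketch of distributed, self-contained error scrubbing.
# Assumptions: triplicated storage per bit and a local majority-vote rule;
# the actual SHCA design in the thesis may differ substantially.
import random

def scrub_step(cells):
    """One synchronous step: each cell rewrites all three of its copies
    with the majority value of its own copies (purely local repair)."""
    repaired = []
    for copies in cells:
        majority = 1 if sum(copies) >= 2 else 0
        repaired.append([majority, majority, majority])
    return repaired

def inject_soft_errors(cells, p):
    """Flip each stored copy independently with probability p."""
    return [[bit ^ (random.random() < p) for bit in copies] for copies in cells]

if __name__ == "__main__":
    data = [random.randint(0, 1) for _ in range(64)]
    cells = [[b, b, b] for b in data]          # triplicated storage
    cells = inject_soft_errors(cells, 0.05)    # distributed soft errors
    cells = scrub_step(cells)                  # distributed self-repair
    recovered = [c[0] for c in cells]
    print("bits corrupted after scrub:",
          sum(a != b for a, b in zip(data, recovered)))
```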

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, most of which have emerged within the last five years. It presents the most prominent reliability concerns from today's point of view and briefly recapitulates the progress made by the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book addresses reliability challenges across levels, from the physical level all the way up to the system level (cross-layer approaches). It aims to demonstrate how new hardware/software co-design solutions can effectively mitigate reliability degradation such as transistor aging, process variation, temperature effects, and soft errors. The book provides readers with the latest insights into novel cross-layer methods and models for the dependability of embedded systems; describes cross-layer approaches that improve reliability through techniques proactively designed with respect to techniques at other layers; and explains run-time adaptation and concepts of self-organization for achieving error resiliency in complex, future many-core systems.

    Investigation into yield and reliability enhancement of TSV-based three-dimensional integration circuits

    Three-dimensional integrated circuits (3D ICs) have been acknowledged as a promising technology to overcome the interconnect delay bottleneck brought about by continuous CMOS scaling. Recent research shows that through-silicon vias (TSVs), which act as vertical links between layers, pose yield and reliability challenges for 3D design. This thesis presents three original contributions.

    The first contribution presents a grouping-based technique to improve the yield of 3D ICs under manufacturing TSV defects, where regular and redundant TSVs are partitioned into groups. In each group, signals can select good TSVs through rerouting multiplexers, avoiding defective TSVs. The grouping ratio (regular to redundant TSVs in one group) has an impact on yield and hardware overhead. Mathematical probabilistic models are presented for yield analysis under independent and clustered defect distributions. Simulation results using MATLAB show that, for a given number of TSVs and TSV failure rate, careful selection of the grouping ratio achieves 100% yield at minimal hardware cost (number of multiplexers and redundant TSVs) compared with a design that does not exploit TSV grouping.

    The second contribution presents an efficient online fault tolerance technique based on redundant TSVs to detect TSV manufacturing defects and address thermally induced reliability issues. The proposed technique accounts for both fault detection and recovery in the presence of three TSV defects: voids, delamination between TSV and landing pad, and TSV short-to-substrate. Simulations using HSPICE and ModelSim are carried out to validate fault detection and recovery. Results show that regular and redundant TSVs can be divided into groups to minimise area overhead without affecting the fault tolerance capability of the technique. Synthesis results using a 130-nm design library show that 100% repair capability can be achieved with low area overhead (4% in the best case).

    The last contribution proposes a technique that jointly considers temperature mitigation and fault tolerance without introducing additional redundant TSVs. This is achieved by reusing spare TSVs that are frequently deployed to improve yield and reliability in 3D ICs. The proposed technique consists of two steps: TSV determination, which achieves an optimal partition of regular and spare TSVs into groups, and TSV placement, which targets temperature mitigation while optimizing total wirelength and routing difference. Simulation results show that, using the proposed technique, 100% repair capability is achieved across all five benchmarks with an average temperature reduction of 75.2°C (34.1%) (best case 99.8°C (58.5%)), while increasing wirelength by only a small amount.
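
    As a rough illustration of the first contribution's yield analysis, the sketch below evaluates a simple independent-defect model in which a group of m regular and r redundant TSVs is repairable whenever no more than r of its m + r TSVs are defective. The group sizes, failure rate, and the any-spare-repairs-any-defect assumption are illustrative; the thesis additionally models clustered defect distributions, which are omitted here.

```python
# Hedged sketch of an independent-defect yield model for grouping-based
# TSV redundancy. All parameters below are illustrative assumptions.
from math import comb

def group_yield(m, r, p_fail):
    """Probability that a group of m regular + r redundant TSVs is
    repairable, i.e. at most r of its m + r TSVs are defective."""
    n = m + r
    return sum(comb(n, k) * p_fail**k * (1 - p_fail)**(n - k)
               for k in range(r + 1))

def chip_yield(n_regular, m, r, p_fail):
    """Yield of a chip whose n_regular TSVs are split into groups of m,
    each group carrying r redundant TSVs."""
    n_groups = -(-n_regular // m)          # ceiling division
    return group_yield(m, r, p_fail) ** n_groups

if __name__ == "__main__":
    # Compare grouping ratios for 1,000 regular TSVs at a 0.1% failure rate.
    for m, r in [(4, 1), (8, 1), (16, 2)]:
        y = chip_yield(1000, m, r, 0.001)
        print(f"{m}:{r} grouping -> yield {y:.4%}, "
              f"redundant TSVs used: {-(-1000 // m) * r}")
```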

    Wafer-scale integration of semiconductor memory.

    This work is directed towards a study of full-slice, or "wafer-scale integrated", semiconductor memory. Previous approaches to full-slice technology are studied and critically compared. It is shown that a fault-tolerant, fixed-interconnection approach offers many advantages; such a technique forms the basis of the experimental work. The disadvantages of the conventional technology are reviewed to illustrate the potential improvements in cost, packing density and reliability obtainable with wafer-scale integration (W.S.I). Iterative chip arrays are modelled by a pseudorandom fault distribution; algorithms to control the linking of adjacent good chips into linear chains are proposed and investigated by computer simulation. It is demonstrated that long chains may be produced at practicable yield levels. The on-chip control circuitry and the external control electronics required to implement one particular algorithm are described in relation to a TTL simulation of an array of 4 × 4 integrated circuit chips. A layout of the on-chip control logic is shown to require (in 4Φ dynamic MOS circuitry) an area equivalent to ~250 shift register stages, a reasonable overhead on large memories. Structures are proposed to extend the fixed-interconnection, fault-tolerant concept to parallel/serial organised memory, covering RAM, ROM and Associative Memory applications requiring up to ~2M bits of storage. Potential problem areas in implementing W.S.I are discussed and it is concluded that current technology is capable of manufacturing such devices. A detailed cost comparison of the conventional and W.S.I approaches to large serial memories illustrates the potential savings available with wafer-scale integration. The problem of gaining industrial acceptance for W.S.I is discussed in relation to known and anticipated views of new technology. The thesis concludes with suggestions for further work in the general field of wafer-scale integration.
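
    A toy version of the chain-linking simulation might look like the following: a pseudorandom fault map is generated for a grid of chips, and a greedy walk links adjacent good chips into a chain using a fixed neighbour-preference order. The grid size, yield figure, and the particular (very naive) linking rule are assumptions for illustration only, not the algorithms evaluated in the thesis.

```python
# Hedged sketch: chaining adjacent good chips on a simulated wafer.
# The fault map, yield, and greedy linking rule are illustrative only.
import random

def make_wafer(rows, cols, chip_yield):
    """True = good chip, False = faulty chip (pseudorandom fault map)."""
    return [[random.random() < chip_yield for _ in range(cols)]
            for _ in range(rows)]

def grow_chain(wafer, start=(0, 0)):
    """Greedily extend a chain through adjacent good, unused chips."""
    rows, cols = len(wafer), len(wafer[0])
    used, chain = set(), []
    r, c = start
    while 0 <= r < rows and 0 <= c < cols and wafer[r][c] and (r, c) not in used:
        used.add((r, c))
        chain.append((r, c))
        # Try neighbours in a fixed preference order (right, down, left, up).
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and wafer[nr][nc] and (nr, nc) not in used):
                r, c = nr, nc
                break
        else:
            break
    return chain

if __name__ == "__main__":
    wafer = make_wafer(16, 16, chip_yield=0.7)
    print("chain length from corner:", len(grow_chain(wafer)))
```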

    Infrastructures and Algorithms for Testable and Dependable Systems-on-a-Chip

    Every new semiconductor technology node provides further miniaturization and higher performance, increasing the number of advanced functions that electronic products can offer. Silicon area is now so cheap that industries can integrate into a single chip, usually referred to as a System-on-Chip (SoC), all the components and functions that historically were placed on a hardware board. Although such advanced functionality can benefit users, the manufacturing process is becoming finer and denser, making chips more susceptible to defects. Today's very deep-submicron semiconductor technologies (0.13 micron and below) have reached susceptibility levels that put conventional semiconductor manufacturing at an impasse. Being able to rapidly develop, manufacture, test, diagnose and verify such complex new chips and products is crucial for the continued success of our economy at large. This trend is expected to continue at least for the next ten years, making possible the design and production of 100-million-transistor chips. To speed up the research, the National Technology Roadmap for Semiconductors identified in 1997 a number of major hurdles to be overcome, some of which are related to test and dependability. Test is one of the most critical tasks in the semiconductor production process, where Integrated Circuits (ICs) are tested several times, from wafer probing to the end of production test. Test is not only necessary to assure fault-free devices; it also plays a key role in analyzing defects in the manufacturing process. This last point is highly relevant since increasing time-to-market pressure on semiconductor fabrication often forces foundries to start volume production on a given semiconductor technology node before reaching the defect densities, and hence yield levels, traditionally obtained at that stage. The feedback derived from test is the only way to analyze and isolate many of the defects in today's processes and to increase process yield. With the increasing need for high-quality electronic products, at each new physical assembly level, such as board and system assembly, test is used for debugging, diagnosing and repairing the sub-assemblies in their new environment. Similarly, increasing reliability, availability and serviceability requirements lead the users of high-end products to perform periodic tests in the field throughout the full life cycle. To allow advancements in each of the above scaling trends, fundamental changes are expected to emerge in the different disciplines involved in realizing Integrated Circuits (ICs), such as IC design, packaging and silicon process. These changes have a direct impact on test methods, tools and equipment. Conventional test equipment and methodologies will be inadequate to assure high quality levels. On-chip specialized blocks dedicated to test, usually referred to as Infrastructure IP (Intellectual Property), need to be developed and included in new complex designs to assure that new chips can be adequately tested, diagnosed, measured, debugged and, in some cases, even repaired. In this thesis, some of the scaling trends in designing new complex SoCs are analyzed one at a time, observing their implications on test and identifying the key hurdles and challenges to be addressed. The remainder of the thesis presents possible solutions. It is not sufficient to address just one of the challenges; all must be met at the same time to fulfill the market requirements.

    Towards Data Reliable, Low-Power, and Repairable Resistive Random Access Memories

    A series of breakthroughs in memristive devices has demonstrated the potential of memristor arrays to serve as next-generation resistive random access memories (ReRAM), which are fast, low-power, ultra-dense, and non-volatile. However, memristors' unique device characteristics also make them prone to several sources of error. Owing to the stochastic filamentary nature of memristive devices, various recoverable errors can affect the data reliability of a ReRAM, and permanent device failures further limit its lifetime. This dissertation develops low-power solutions for more reliable and longer-enduring ReRAM systems. We first look into a data reliability issue known as write disturbance: writing into a memristor in a crossbar can disturb the values stored in other memristors on the same memory line as the target cell. Such disturbance accumulates over time and may lead to complete data corruption. To address this problem, we propose using two regular memristors per word to keep track of the disturbance accumulation and trigger a refresh that restores the weakened data once it becomes necessary. We also investigate the considerable variation in the write-time characteristics of individual memristors. With such variation, conventional fixed-pulse write schemes not only waste significant energy but also cannot guarantee reliable completion of write operations. We address this variation by proposing an adaptive write scheme that adjusts the width of the write pulses for each memristor. Our scheme embeds an online monitor to detect the completion of a write operation and takes into account the parasitic effect of line-shared devices in access-transistor-free memristive arrays. We further use this method to shorten the test time of memory march algorithms by eliminating the verifying read right after a write that is commonly employed in their test sequences. Finally, we propose a novel mechanism to extend the lifetime of a ReRAM by protecting it against hard errors through a unique feature of bipolar memristive devices. Our solution makes an unorthodox use of complementary resistive switches (a particular implementation of memristive devices) to provide an "in-place spare" for each memory cell at negligible extra cost. The in-place spares are then utilized by a repair scheme to repair, at page-level granularity, memristive devices that have failed in a stuck-at-ON state. Furthermore, we explore the use of in-place spares in lieu of other memory reliability and yield enhancement solutions, such as error correction codes (ECC) and spare rows. We demonstrate that with the in-place spares we can achieve the same lifetime as a baseline ReRAM with either significantly fewer spare rows or a lighter-weight ECC, both of which save energy and area.
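
    The adaptive write scheme can be caricatured as a write-and-verify loop: apply short SET pulses one at a time and stop as soon as the cell is observed to reach its target state, so slow cells get wide effective pulses and fast cells get narrow ones. The device model, pulse width, and thresholds below are assumptions for illustration; the dissertation's scheme relies on an online circuit monitor rather than explicit read-back and also accounts for line-shared parasitics, which this sketch ignores.

```python
# Hedged sketch of an adaptive (write-and-verify) SET operation.
# Device model, resistances, and pulse width are illustrative assumptions.
import random

R_ON_TARGET = 1e3      # target low-resistance state (ohms), assumed
R_OFF = 1e5            # initial high-resistance state (ohms), assumed
PULSE_WIDTH_NS = 10    # width of each incremental write pulse, assumed

class Memristor:
    """Toy bipolar device: each SET pulse lowers resistance by a
    per-device random factor, modelling write-time variation."""
    def __init__(self):
        self.resistance = R_OFF
        self.speed = random.uniform(0.55, 0.9)   # per-cell variation

    def apply_set_pulse(self):
        self.resistance = max(R_ON_TARGET * 0.5, self.resistance * self.speed)

def adaptive_set(cell, max_pulses=100):
    """Apply pulses until the monitor sees the cell reach its target;
    the pulse count is the cell's effective write-pulse width."""
    for pulses in range(1, max_pulses + 1):
        cell.apply_set_pulse()
        if cell.resistance <= R_ON_TARGET:       # completion detected
            return pulses
    return max_pulses

if __name__ == "__main__":
    counts = [adaptive_set(Memristor()) for _ in range(1000)]
    avg = sum(counts) / len(counts)
    print(f"pulses needed: min={min(counts)}, max={max(counts)}, avg={avg:.1f}")
    # A fixed-pulse scheme must budget max(counts) * PULSE_WIDTH_NS on every
    # write; the adaptive scheme spends only avg * PULSE_WIDTH_NS on average.
    print(f"avg effective pulse width ~ {avg * PULSE_WIDTH_NS:.0f} ns, "
          f"fixed-pulse budget {max(counts) * PULSE_WIDTH_NS} ns")
```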