61 research outputs found

    Cross-Layer Approaches for an Aging-Aware Design of Nanoscale Microprocessors

    Get PDF
    Thanks to aggressive scaling of transistor dimensions, computers have revolutionized our life. However, the increasing unreliability of devices fabricated in nanoscale technologies emerged as a major threat for the future success of computers. In particular, accelerated transistor aging is of great importance, as it reduces the lifetime of digital systems. This thesis addresses this challenge by proposing new methods to model, analyze and mitigate aging at microarchitecture-level and above

    A Holistic Solution for Reliability of 3D Parallel Systems

    Full text link
    As device scaling slows down, emerging technologies such as 3D integration and carbon nanotube field-effect transistors are among the most promising solutions to increase device density and performance. These emerging technologies offer shorter interconnects, higher performance, and lower power. However, higher levels of operating temperatures and current densities project significantly higher failure rates. Moreover, due to the infancy of the manufacturing process, high variation, and defect densities, chip designers are not encouraged to consider these emerging technologies as a stand-alone replacement for Silicon-based transistors. The goal of this dissertation is to introduce new architectural and circuit techniques that can work around high-fault rates in the emerging 3D technologies, improving performance and reliability comparable to Silicon. We propose a new holistic approach to the reliability problem that addresses the necessary aspects of an effective solution such as detection, diagnosis, repair, and prevention synergically for a practical solution. By leveraging 3D fabric layouts, it proposes the underlying architecture to efficiently repair the system in the presence of faults. This thesis presents a fault detection scheme by re-executing instructions on idle identical units that distinguishes between transient and permanent faults while localizing it to the granularity of a pipeline stage. Furthermore, with the use of a dynamic and adaptive reconfiguration policy based on activity factors and temperature variation, we propose a framework that delivers a significant improvement in lifetime management to prevent faults due to aging. Finally, a design framework that can be used for large-scale chip production while mitigating yield and variation failures to bring up Carbon Nano Tube-based technology is presented. The proposed framework is capable of efficiently supporting high-variation technologies by providing protection against manufacturing defects at different granularities: module and pipeline-stage levels.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/168118/1/javadb_1.pd

    Cross-Layer Resiliency Modeling and Optimization: A Device to Circuit Approach

    Get PDF
    The never ending demand for higher performance and lower power consumption pushes the VLSI industry to further scale the technology down. However, further downscaling of technology at nano-scale leads to major challenges. Reduced reliability is one of them, arising from multiple sources e.g. runtime variations, process variation, and transient errors. The objective of this thesis is to tackle unreliability with a cross layer approach from device up to circuit level

    Electrical overstress and electrostatic discharge failure in silicon MOS devices

    Get PDF
    This thesis presents an experimental and theoretical investigation of electrical failure in MOS structures, with a particular emphasis on short-pulse and ESD failure. It begins with an extensive survey of MOS technology, its failure mechanisms and protection schemes. A program of experimental research on MOS breakdown is then reported, the results of which are used to develop a model of breakdown across a wide spectrum of time scales. This model, in which bulk-oxide electron trapping/emission plays a major role, prohibits the direct use of causal theory over short time-scales, invalidating earlier theories on the subject. The work is extended to ESD stress of both polarities. Negative polarity ESD breakdownis found to be primarily oxide-voltage activated, with no significant dependence on temperature of luminosity. Positive polarity breakdown depends on the rate of surface inversion, dictated by the Si avalanche threshold and/or the generation speed of light-induced carriers. An analytical model, based upon the above theory is developed to predict ESD breakdown over a wide range of conditions. The thesis ends with an experimental and theoretical investigation of the effects of ESD breakdown on device and circuit performance. Breakdown sites are modelled as resistive paths in the oxide, and their distorting effects upon transistor performance are studied. The degradation of a damaged transistor under working stress is observed, giving a deeper insight into the latent hazards of ESD damage

    Aging-Aware Design Methods for Reliable Analog Integrated Circuits using Operating Point-Dependent Degradation

    Get PDF
    The focus of this thesis is on the development and implementation of aging-aware design methods, which are suitable to satisfy current needs of analog circuit design. Based on the well known \gm/\ID sizing methodology, an innovative tool-assisted aging-aware design approach is proposed, which is able to estimate shifts in circuit characteristics using mostly hand calculation schemes. The developed concept of an operating point-dependent degradation leads to the definition of an aging-aware sensitivity, which is compared to currently available degradation simulation flows and proves to be efficient in the estimation of circuit degradation. Using the aging-aware sensitivity, several analog circuits are investigated and optimized towards higher reliability. Finally, results are presented for numerous target specifications

    NEGATIVE BIAS TEMPERATURE INSTABILITY STUDIES FOR ANALOG SOC CIRCUITS

    Get PDF
    Negative Bias Temperature Instability (NBTI) is one of the recent reliability issues in sub threshold CMOS circuits. NBTI effect on analog circuits, which require matched device pairs and mismatches, will cause circuit failure. This work is to assess the NBTI effect considering the voltage and the temperature variations. It also provides a working knowledge of NBTI awareness to the circuit design community for reliable design of the SOC analog circuit. There have been numerous studies to date on the NBTI effect to analog circuits. However, other researchers did not study the implication of NBTI stress on analog circuits utilizing bandgap reference circuit. The reliability performance of all matched pair circuits, particularly the bandgap reference, is at the mercy of aging differential. Reliability simulation is mandatory to obtain realistic risk evaluation for circuit design reliability qualification. It is applicable to all circuit aging problems covering both analog and digital. Failure rate varies as a function of voltage and temperature. It is shown that PMOS is the reliabilitysusceptible device and NBTI is the most vital failure mechanism for analog circuit in sub-micrometer CMOS technology. This study provides a complete reliability simulation analysis of the on-die Thermal Sensor and the Digital Analog Converter (DAC) circuits and analyzes the effect of NBTI using reliability simulation tool. In order to check out the robustness of the NBTI-induced SOC circuit design, a bum-in experiment was conducted on the DAC circuits. The NBTI degradation observed in the reliability simulation analysis has given a clue that under a severe stress condition, a massive voltage threshold mismatch of beyond the 2mV limit was recorded. Bum-in experimental result on DAC proves the reliability sensitivity of NBTI to the DAC circuitry

    Cmos Rf Cituits Sic] Variability And Reliability Resilient Design, Modeling, And Simulation

    Get PDF
    The work presents a novel voltage biasing design that helps the CMOS RF circuits resilient to variability and reliability. The biasing scheme provides resilience through the threshold voltage (VT) adjustment, and at the mean time it does not degrade the PA performance. Analytical equations are established for sensitivity of the resilient biasing under various scenarios. Power Amplifier (PA) and Low Noise Amplifier (LNA) are investigated case by case through modeling and experiment. PTM 65nm technology is adopted in modeling the transistors within these RF blocks. A traditional class-AB PA with resilient design is compared the same PA without such design in PTM 65nm technology. Analytical equations are established for sensitivity of the resilient biasing under various scenarios. A traditional class-AB PA with resilient design is compared the same PA without such design in PTM 65nm technology. The results show that the biasing design helps improve the robustness of the PA in terms of linear gain, P1dB, Psat, and power added efficiency (PAE). Except for post-fabrication calibration capability, the design reduces the majority performance sensitivity of PA by 50% when subjected to threshold voltage (VT) shift and 25% to electron mobility (ÎĽn) degradation. The impact of degradation mismatches is also investigated. It is observed that the accelerated aging of MOS transistor in the biasing circuit will further reduce the sensitivity of PA. In the study of LNA, a 24 GHz narrow band cascade LNA with adaptive biasing scheme under various aging rate is compared to LNA without such biasing scheme. The modeling and simulation results show that the adaptive substrate biasing reduces the sensitivity of noise figure and minimum noise figure subject to process variation and iii device aging such as threshold voltage shift and electron mobility degradation. Simulation of different aging rate also shows that the sensitivity of LNA is further reduced with the accelerated aging of the biasing circuit. Thus, for majority RF transceiver circuits, the adaptive body biasing scheme provides overall performance resilience to the device reliability induced degradation. Also the tuning ability designed in RF PA and LNA provides the circuit post-process calibration capability

    Reliable Software for Unreliable Hardware - A Cross-Layer Approach

    Get PDF
    A novel cross-layer reliability analysis, modeling, and optimization approach is proposed in this thesis that leverages multiple layers in the system design abstraction (i.e. hardware, compiler, system software, and application program) to exploit the available reliability enhancing potential at each system layer and to exchange this information across multiple system layers

    Design of complex integrated systems based on networks-on-chip: Trading off performance, power and reliability

    Get PDF
    The steady advancement of microelectronics is associated with an escalating number of challenges for design engineers due to both the tiny dimensions and the enormous complexity of integrated systems. Against this background, this work deals with Network-On-Chip (NOC) as the emerging design paradigm to cope with diverse issues of nanotechnology. The detailed investigations within the chapters focus on the communication-centric aspects of multi-core-systems, whereas performance, power consumption as well as reliability are considered likewise as the essential design criteria

    Reliability-aware and energy-efficient system level design for networks-on-chip

    Get PDF
    2015 Spring.Includes bibliographical references.With CMOS technology aggressively scaling into the ultra-deep sub-micron (UDSM) regime and application complexity growing rapidly in recent years, processors today are being driven to integrate multiple cores on a chip. Such chip multiprocessor (CMP) architectures offer unprecedented levels of computing performance for highly parallel emerging applications in the era of digital convergence. However, a major challenge facing the designers of these emerging multicore architectures is the increased likelihood of failure due to the rise in transient, permanent, and intermittent faults caused by a variety of factors that are becoming more and more prevalent with technology scaling. On-chip interconnect architectures are particularly susceptible to faults that can corrupt transmitted data or prevent it from reaching its destination. Reliability concerns in UDSM nodes have in part contributed to the shift from traditional bus-based communication fabrics to network-on-chip (NoC) architectures that provide better scalability, performance, and utilization than buses. In this thesis, to overcome potential faults in NoCs, my research began by exploring fault-tolerant routing algorithms. Under the constraint of deadlock freedom, we make use of the inherent redundancy in NoCs due to multiple paths between packet sources and sinks and propose different fault-tolerant routing schemes to achieve much better fault tolerance capabilities than possible with traditional routing schemes. The proposed schemes also use replication opportunistically to optimize the balance between energy overhead and arrival rate. As 3D integrated circuit (3D-IC) technology with wafer-to-wafer bonding has been recently proposed as a promising candidate for future CMPs, we also propose a fault-tolerant routing scheme for 3D NoCs which outperforms the existing popular routing schemes in terms of energy consumption, performance and reliability. To quantify reliability and provide different levels of intelligent protection, for the first time, we propose the network vulnerability factor (NVF) metric to characterize the vulnerability of NoC components to faults. NVF determines the probabilities that faults in NoC components manifest as errors in the final program output of the CMP system. With NVF aware partial protection for NoC components, almost 50% energy cost can be saved compared to the traditional approach of comprehensively protecting all NoC components. Lastly, we focus on the problem of fault-tolerant NoC design, that involves many NP-hard sub-problems such as core mapping, fault-tolerant routing, and fault-tolerant router configuration. We propose a novel design-time (RESYN) and a hybrid design and runtime (HEFT) synthesis framework to trade-off energy consumption and reliability in the NoC fabric at the system level for CMPs. Together, our research in fault-tolerant NoC routing, reliability modeling, and reliability aware NoC synthesis substantially enhances NoC reliability and energy-efficiency beyond what is possible with traditional approaches and state-of-the-art strategies from prior work
    • …
    corecore