3,273 research outputs found

    On-chip NBTI and Gate-Oxide-Degradation Sensing and Dynamic Management in VLSI Circuits.

    Full text link
    The VLSI industry has achieved advancement in technology by continuous process scaling which has resulted in large scale integration. However, scaling also poses new reliability challenges. Currently the industry ensures the reliability of chips by limiting the supply voltage and temperature, but these constraints limit the benefits that are obtained from new process nodes. This method of managing reliability during design time is called Static Reliability Management (SRM). While SRM ensures that all the chips meet the reliability specifications, it introduces extreme pessimism in the chips as it margins for worst process, voltage, temperature and circuit state (PVTS), which will not be required for the majority of chips. To reduce the pessimism of SRM, the system needs to be made aware of its reliability by employing degradation sensors or degradation detection techniques. Using the degradation measurements, the system can estimate its lifetime and can adjust its operating points (supply voltage and temperature limits) dynamically and trade excess reliability slack with performance. This method of reliability management is called Dynamic Reliability Management (DRM). In this work we investigate different methods of DRM. We focus on two critical degradation mechanisms: Negative Bias Temperature Instability (NBTI) and Gate-oxide degradation. We propose NBTI and Gate-oxide degradation sensors with low area and power overhead, which allows them to be deployed in large numbers on the chip enabling collection of degradation statistics. The sensors were designed in 130nm and 45nm process nodes and tested on two test-chips. We then used the sensors to perform DRM in a silicon test for the first time. We demonstrate that DRM eliminates excess reliability slack which allows for a boost in supply voltage and performance. We then propose in situ Bias Temperature Instability (BTI) and Gate-oxide wear-out detection techniques. The in situ technique measures the degradation in the actual devices in the core and removes all the layers of uncertainty which arise because of the statistical nature of degradation and its dependence on PVTS. We implemented and tested these techniques on two test chips in a 65nm process node. We then use the BTI sensing technique to perform DRM.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/86281/1/prsingh_1.pd

    An online wear state monitoring methodology for off-the-shelf embedded processors

    Full text link

    An online wear state monitoring methodology for off-the-shelf embedded processors

    Get PDF
    The continued scaling of transistors has led to an exponential increase in on-chip power density, which has resulted in increasing temperature. In turn, the increase in temperature directly leads to the increase in the rate of wear of a processor. Negative-bias temperature instability (NBTI) is one of the most dominant integrated circuit (IC) failure mechanisms [13, 5] that strongly depends on temperature. NBTI manifests in the form of increased circuit delays which can lead to timing failures and processor crashes. The ability to monitor the wear progression of a processor due to NBTI is valuable when designing real-time embedded systems. While NBTI can be detected using wear state sensors, not all chips are equipped with these sensors because detecting wear due to NBTI requires modifications to the chip design and incurs area and power overhead. NBTI sensor data may also not be exposed to users in software. In addition, wear sensors cannot take into account variations in wear due to the differences in the wear sensor devices and the other functional devices and their operating conditions. In this paper, we propose a lightweight, online methodology to monitor the wear process due to NBTI for off-the-shelf embedded processors. Our proposed method requires neither data on the threshold voltage and critical paths nor additional hardware. Our methodology can also be extended to predict the wear progression due to some other dominant IC failure mechanisms. Experiments on embedded processors provide insights on NBTI wear progression over time. This knowledge can be used to design real-time embedded systems that explicitly consider runtime wear progression to increase predictability and maintain lifetime reliability requirements

    Unreliable Silicon: Circuit through System-Level Techniques for Mitigating the Adverse Effects of Process Variation, Device Degradation and Environmental Conditions.

    Full text link
    Designing and manufacturing integrated circuits in advanced, highly-scaled processing technologies that meet stringent specification sets is an increasingly unreliable proposition. Dimensional processing variations, time and stress dependent device degradation and potentially varying environmental conditions exacerbate deviations in performance, power and even functionality of integrated circuits. This work explores a system-level adaptive design philosophy intended to mitigate the power and performance impact of unreliable silicon devices and presents enabling circuits for SRAM variation mitigation and in-situ measurement of device degradation in 130nm and 45nm processing technologies. An adaptation of RAZOR-based DVS designed for on-chip memory power reduction and reliability lifetime improvement enables the elimination of 250 mV of voltage margin in a 1.8V design, with up to 500 mV of reduction when allowing 5% of memory operations to use multiple cycles. A novel PID-controlled dynamic reliability management (DRM) system is presented, allowing user-specified circuit lifetime to be dynamically managed via dynamic voltage and frequency scaling. Peak performance improvement of 20-35% is achievable in typical processing systems by allowing brief periods of elevated voltage operation through the real-time DRM system, while minimizing voltage during non-critical periods of operation to maximize circuit lifetime. A probabilistic analysis of oxide breakdown using the percolation model indicates the need for 1000-2000 integrated in-situ sensors to achieve oxide lifetime prediction error at or under 10%. The conclusions from the oxide analysis are used to guide the design of a series of novel on-chip reliability monitoring circuits for use in a real-time DRM system. A 130nm in-situ oxide breakdown measurement sensor presented is the first published design of an oxide-breakdown oriented circuit and is compatible with standard-cell style automatic “place and route” design styles used in the majority of application specific integrated circuit designs. Measured results show increases in gate oxide leakage of 14-35% after accelerated stress testing. A second generation design of the on-chip oxide degradation sensor is presented that reduces stress mode power consumption by 111,785X over the initial design while providing an ideal 1:1 mapping of gate leakage to output frequency in extracted simulations.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/60701/1/ekarl_1.pd

    Low Cost NBTI Degradation Detection and Masking Approaches

    Get PDF
    Performance degradation of integrated circuits due to aging effects, such as Negative Bias Temperature Instability (NBTI), is becoming a great concern for current and future CMOS technology. In this paper, we propose two monitoring and masking approaches that detect late transitions due to NBTI degradation in the combinational part of critical data paths and guarantee the correctness of the provided output data by adapting the clock frequency. Compared to recently proposed alternative solutions, one of our approaches (denoted as Low Area and Power (LAP) approach) requires lower area overhead and lower, or comparable, power consumption, while exhibiting the same impact on system performance, while the other proposed approach (denoted as High Performance (HP) approach) allows us to reduce the impact on system performance, at the cost of some increase in area and power consumption
    • …
    corecore