36 research outputs found

    Cross-Layer Approaches for an Aging-Aware Design of Nanoscale Microprocessors

    Get PDF
    Thanks to aggressive scaling of transistor dimensions, computers have revolutionized our life. However, the increasing unreliability of devices fabricated in nanoscale technologies emerged as a major threat for the future success of computers. In particular, accelerated transistor aging is of great importance, as it reduces the lifetime of digital systems. This thesis addresses this challenge by proposing new methods to model, analyze and mitigate aging at microarchitecture-level and above

    A Study of Nanometer Semiconductor Scaling Effects on Microelectronics Reliability

    Get PDF
    The desire to assess the reliability of emerging scaled microelectronics technologies through faster reliability trials and more accurate acceleration models is the precursor for further research and experimentation in this relevant field. The effect of semiconductor scaling on microelectronics product reliability is an important aspect to the high reliability application user. From the perspective of a customer or user, who in many cases must deal with very limited, if any, manufacturer's reliability data to assess the product for a highly-reliable application, product-level testing is critical in the characterization and reliability assessment of advanced nanometer semiconductor scaling effects on microelectronics reliability. This dissertation provides a methodology on how to accomplish this and provides techniques for deriving the expected product-level reliability on commercial memory products. Competing mechanism theory and the multiple failure mechanism model are applied to two separate experiments; scaled SRAM and SDRAM products. Accelerated stress testing at multiple conditions is applied at the product level of several scaled memory products to assess the performance degradation and product reliability. Acceleration models are derived for each case. For several scaled SDRAM products, retention time degradation is studied and two distinct soft error populations are observed with each technology generation: early breakdown, characterized by randomly distributed weak bits with Weibull slope Beta=1, and a main population breakdown with an increasing failure rate. Retention time soft error rates are calculated and a multiple failure mechanism acceleration model with parameters is derived for each technology. Defect densities are calculated and reflect a decreasing trend in the percentage of random defective bits for each successive product generation. A normalized soft error failure rate of the memory data retention time in FIT/Gb and FIT/cm2 for several scaled SDRAM generations is presented revealing a power relationship. General models describing the soft error rates across scaled product generations are presented. The analysis methodology may be applied to other scaled microelectronic products and key parameters

    NEGATIVE BIAS TEMPERATURE INSTABILITY STUDIES FOR ANALOG SOC CIRCUITS

    Get PDF
    Negative Bias Temperature Instability (NBTI) is one of the recent reliability issues in sub threshold CMOS circuits. NBTI effect on analog circuits, which require matched device pairs and mismatches, will cause circuit failure. This work is to assess the NBTI effect considering the voltage and the temperature variations. It also provides a working knowledge of NBTI awareness to the circuit design community for reliable design of the SOC analog circuit. There have been numerous studies to date on the NBTI effect to analog circuits. However, other researchers did not study the implication of NBTI stress on analog circuits utilizing bandgap reference circuit. The reliability performance of all matched pair circuits, particularly the bandgap reference, is at the mercy of aging differential. Reliability simulation is mandatory to obtain realistic risk evaluation for circuit design reliability qualification. It is applicable to all circuit aging problems covering both analog and digital. Failure rate varies as a function of voltage and temperature. It is shown that PMOS is the reliabilitysusceptible device and NBTI is the most vital failure mechanism for analog circuit in sub-micrometer CMOS technology. This study provides a complete reliability simulation analysis of the on-die Thermal Sensor and the Digital Analog Converter (DAC) circuits and analyzes the effect of NBTI using reliability simulation tool. In order to check out the robustness of the NBTI-induced SOC circuit design, a bum-in experiment was conducted on the DAC circuits. The NBTI degradation observed in the reliability simulation analysis has given a clue that under a severe stress condition, a massive voltage threshold mismatch of beyond the 2mV limit was recorded. Bum-in experimental result on DAC proves the reliability sensitivity of NBTI to the DAC circuitry

    A Holistic Solution for Reliability of 3D Parallel Systems

    Full text link
    As device scaling slows down, emerging technologies such as 3D integration and carbon nanotube field-effect transistors are among the most promising solutions to increase device density and performance. These emerging technologies offer shorter interconnects, higher performance, and lower power. However, higher levels of operating temperatures and current densities project significantly higher failure rates. Moreover, due to the infancy of the manufacturing process, high variation, and defect densities, chip designers are not encouraged to consider these emerging technologies as a stand-alone replacement for Silicon-based transistors. The goal of this dissertation is to introduce new architectural and circuit techniques that can work around high-fault rates in the emerging 3D technologies, improving performance and reliability comparable to Silicon. We propose a new holistic approach to the reliability problem that addresses the necessary aspects of an effective solution such as detection, diagnosis, repair, and prevention synergically for a practical solution. By leveraging 3D fabric layouts, it proposes the underlying architecture to efficiently repair the system in the presence of faults. This thesis presents a fault detection scheme by re-executing instructions on idle identical units that distinguishes between transient and permanent faults while localizing it to the granularity of a pipeline stage. Furthermore, with the use of a dynamic and adaptive reconfiguration policy based on activity factors and temperature variation, we propose a framework that delivers a significant improvement in lifetime management to prevent faults due to aging. Finally, a design framework that can be used for large-scale chip production while mitigating yield and variation failures to bring up Carbon Nano Tube-based technology is presented. The proposed framework is capable of efficiently supporting high-variation technologies by providing protection against manufacturing defects at different granularities: module and pipeline-stage levels.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/168118/1/javadb_1.pd

    inSense: A Variation and Fault Tolerant Architecture for Nanoscale Devices

    Get PDF
    Transistor technology scaling has been the driving force in improving the size, speed, and power consumption of digital systems. As devices approach atomic size, however, their reliability and performance are increasingly compromised due to reduced noise margins, difficulties in fabrication, and emergent nano-scale phenomena. Scaled CMOS devices, in particular, suffer from process variations such as random dopant fluctuation (RDF) and line edge roughness (LER), transistor degradation mechanisms such as negative-bias temperature instability (NBTI) and hot-carrier injection (HCI), and increased sensitivity to single event upsets (SEUs). Consequently, future devices may exhibit reduced performance, diminished lifetimes, and poor reliability. This research proposes a variation and fault tolerant architecture, the inSense architecture, as a circuit-level solution to the problems induced by the aforementioned phenomena. The inSense architecture entails augmenting circuits with introspective and sensory capabilities which are able to dynamically detect and compensate for process variations, transistor degradation, and soft errors. This approach creates ``smart\u27\u27 circuits able to function despite the use of unreliable devices and is applicable to current CMOS technology as well as next-generation devices using new materials and structures. Furthermore, this work presents an automated prototype implementation of the inSense architecture targeted to CMOS devices and is evaluated via implementation in ISCAS \u2785 benchmark circuits. The automated prototype implementation is functionally verified and characterized: it is found that error detection capability (with error windows from \approx30-400ps) can be added for less than 2\% area overhead for circuits of non-trivial complexity. Single event transient (SET) detection capability (configurable with target set-points) is found to be functional, although it generally tracks the standard DMR implementation with respect to overheads

    Unreliable Silicon: Circuit through System-Level Techniques for Mitigating the Adverse Effects of Process Variation, Device Degradation and Environmental Conditions.

    Full text link
    Designing and manufacturing integrated circuits in advanced, highly-scaled processing technologies that meet stringent specification sets is an increasingly unreliable proposition. Dimensional processing variations, time and stress dependent device degradation and potentially varying environmental conditions exacerbate deviations in performance, power and even functionality of integrated circuits. This work explores a system-level adaptive design philosophy intended to mitigate the power and performance impact of unreliable silicon devices and presents enabling circuits for SRAM variation mitigation and in-situ measurement of device degradation in 130nm and 45nm processing technologies. An adaptation of RAZOR-based DVS designed for on-chip memory power reduction and reliability lifetime improvement enables the elimination of 250 mV of voltage margin in a 1.8V design, with up to 500 mV of reduction when allowing 5% of memory operations to use multiple cycles. A novel PID-controlled dynamic reliability management (DRM) system is presented, allowing user-specified circuit lifetime to be dynamically managed via dynamic voltage and frequency scaling. Peak performance improvement of 20-35% is achievable in typical processing systems by allowing brief periods of elevated voltage operation through the real-time DRM system, while minimizing voltage during non-critical periods of operation to maximize circuit lifetime. A probabilistic analysis of oxide breakdown using the percolation model indicates the need for 1000-2000 integrated in-situ sensors to achieve oxide lifetime prediction error at or under 10%. The conclusions from the oxide analysis are used to guide the design of a series of novel on-chip reliability monitoring circuits for use in a real-time DRM system. A 130nm in-situ oxide breakdown measurement sensor presented is the first published design of an oxide-breakdown oriented circuit and is compatible with standard-cell style automatic “place and route” design styles used in the majority of application specific integrated circuit designs. Measured results show increases in gate oxide leakage of 14-35% after accelerated stress testing. A second generation design of the on-chip oxide degradation sensor is presented that reduces stress mode power consumption by 111,785X over the initial design while providing an ideal 1:1 mapping of gate leakage to output frequency in extracted simulations.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/60701/1/ekarl_1.pd

    EXPERIMENTAL STUDY OF BIAS TEMPERATURE INSTABILITY AND PROGRESSIVE BREAKDOWN OF ADVANCED GATE DIELECTRICS

    Get PDF
    With shrinking gate dielectrics, the reliability requirements of semiconductor gate dielectrics become more and more difficult to maintain. New physical mechanisms and phenomena are discovered and new challenges arise. At the same time, some issues, which have been minor in the past, begin to show bigger impact, such as the Negative Bias Temperature Instability issue. The dynamic NBTI phenomenon was studied with ultrathin SiO2 and HfO2 devices. With a dynamic stress condition, the device lifetime can be largely extended due to the reduced NBTI degradation. This reduction is contributed to the annealing of fixed oxide charges during the stress off period. A mathematical model is also established to explain this phenomenon. With alternative gate dielectrics' introduction, new issues associated with these materials and device structures are also raised. Those issues need to be studied in detail before fully incorporation of new materials. Compared with SiO2 devices, the NBTI degradation of HfO2 has a similar trend. However, it is found that they have different frequency response than the SiO2 devices. This difference is later found due to the traps inside the gate dielectrics. Detailed studies show that NBTI degradations at dc stress and dynamic stress conditions have different temperature acceleration factors due to the bulk traps. The disappearance of this difference by insetting a detrapping period further proves this observation. As we enter the ultrathin gate dielectrics regime, the electron tunneling mechanisms behind the gate dielectrics breakdown shift. Consequently, gate dielectrics breakdown mode also shifts from the clear-detected hard breakdown to the noisy soft breakdown. Thus new lifetime extrapolation models are needed. The progressive breakdown of ultrathin SiO2 is studied by a two-step test methodology. By monitoring the degradation of the progressive breakdown path in terms of the activation energy, the voltage acceleration factor, two kinds of breakdown filaments, the stable one and the unstable one, were studied. The stable filament is found to be a breakdown filament independent of the original breakdown filament, and the unstable filament is the continuing degradation of the original filament

    Secure HfO2 based charge trap EEPROM with lifetime and data retention time modeling

    Get PDF
    Trusted computing is currently the most promising security strategy for cyber physical systems. Trusted computing platform relies on securely stored encryption keys in the on-board memory. However, research and actual cases have shown the vulnerability of the on-board memory to physical cryptographic attacks. This work proposed an embedded secure EEPROM architecture employing charge trap transistor to improve the security of storage means in the trusted computing platform. The charge trap transistor is CMOS compatible with high dielectric constant material as gate oxide which can trap carriers. The process compatibility allows the secure information containing memory to be embedded with the CPU. This eliminates the eavesdropping and optical observation. This effort presents the secure EEPROM cell, its high voltage programming control structure and an interface architecture for command and data communication between the EEPROM and CPU. The interface architecture is an ASIC based design that exclusively for the secure EEPROM. The on-board programming capability enables adjustment of programming voltages and accommodates EEPROM threshold variation due to PVT to optimize lifetime. In addition to the functional circuitry, this work presents the first model of lifetime and data retention time tradeoff for this new type of EEPROM. This model builds the bridge between desired data retention time and lifetime while producing the corresponding programming time and voltage

    Techniques for Aging, Soft Errors and Temperature to Increase the Reliability of Embedded On-Chip Systems

    Get PDF
    This thesis investigates the challenge of providing an abstracted, yet sufficiently accurate reliability estimation for embedded on-chip systems. In addition, it also proposes new techniques to increase the reliability of register files within processors against aging effects and soft errors. It also introduces a novel thermal measurement setup that perspicuously captures the infrared images of modern multi-core processors

    Design of complex integrated systems based on networks-on-chip: Trading off performance, power and reliability

    Get PDF
    The steady advancement of microelectronics is associated with an escalating number of challenges for design engineers due to both the tiny dimensions and the enormous complexity of integrated systems. Against this background, this work deals with Network-On-Chip (NOC) as the emerging design paradigm to cope with diverse issues of nanotechnology. The detailed investigations within the chapters focus on the communication-centric aspects of multi-core-systems, whereas performance, power consumption as well as reliability are considered likewise as the essential design criteria
    corecore