137 research outputs found
Recommended from our members
Accelerating Electromigration Aging: Fast Failure Detection for Nanometer ICs
For practical testing and detection of electromigration (EM) induced failures in dual damascene copper interconnects, one critical issue is creating stressing conditions to induce the chip to fail exclusively under EM in a very short period of time so that EM sign-off and validation can be carried out efficiently. Existing acceleration techniques, which rely on increasing temperature and current densities beyond the known limits, also accelerate other reliability effects making it very difficult, if not impossible, to test EM in isolation. In this article, we propose novel EM wear-out acceleration techniques to address the aforementioned issue. First we show that multi-segment interconnects with reservoir and sink structures can be exploited to significantly speedup the EM wear-out process. Based on this observation, we propose three strategies to accelerate EM induced failure: reservoir-enhanced acceleration, sink-enhanced acceleration, and a hybrid method that combines both reservoir and sink structures. We then propose several configurable interconnect structures that exploit atomic reservoirs and sinks for accelerated EM testing. Such configurable interconnect structures are very flexible and can be used to achieve significant lifetime reductions at the cost of some routing resources. Using the proposed technique, EM testing can be carried out at nominal current densities, and at a much lower temperature compared to traditional testing methods. This is the most significant contribution of this work since, to our knowledge, this is the only method that allows EM testing to be performed in a controlled environment without the risk of invoking other reliability effects that are also accelerated by elevated temperature and current density. Simulation results show that, using the proposed method, we can reduce the EM lifetime of a chip from 10 years down to a few hours 10^5X acceleration under the 150C temperature limit, which is sufficient for practical EM testing of typical nanometer CMOS ICs
Recommended from our members
Simulation for Reliability, Hardware Security, and Ising Computing in VLSI Chip Design
The continued scaling of VLSI circuits has provided a wealth of opportunities andchallenges to the VLSI circuit design area. Both these challenges and opportunities, however,require new simulation tools that can enable their solution or exploitation as classicalmethods typically dealt with problem domains with smaller scales or less complexity. Inthis dissertation, simulation methods are presented to address the emerging VLSI designtopics of Electromigration induced aging and Ising computing and are then applied to theapplication areas of hardware security and graph partitioning respectively.The Electromigration aging effect in VLSI circuits is a long-term reliability issueaffecting current carrying metal wires leading to IR drop degradation. Typically, simpleanalytical equations can determine a wire’s effective age or if it will be affected by the EMaging effect at all. However, these classical methods are overly conservative and can lead toover design or unnecessary design iterations. Furthermore, it is expected that the EM agingeffect will become more severe in future Integrated Cirucits (ICs) due to increasing currentdensities and the prevalance of polycrystaline copper atom structures seen at small wiredimensions. For this reason, more comprehensive simulation techniques that can efficientlysimulate the EM effect with less conservative results can help mitigate overdesign andincrease design margins while reducing design iterations.The area of Hardware Security is becoming increasingly important as the chipsupply chain becomes more globalized and the integrity of chips becomes more diffiuclt toverify. Utilizing the accurate simulation techniques for EM, we can utilize this reliabilityeffect to demonstrate how a reliability based attack could be perpatrated. Furthermore, wecan utilize this aging effect as a defense mechanism to help us validate the integrity of anIC and detect counterfeit chips in the component supply chain market.Ising computing is an emerging method of solving combinatorial optimization problemsby simulating the interactions of so-called spin glasses and their interactions. Borrowingconcepts from quantum computing, this methods mimics the quantum interaction betweenspin glasses in such a way that finding a ground state of these spin glass models leadsto the solution of a particular problem. In this dissertation, effective methods of simulatingthe spin glass interactions using General Purpose Graphics Processing Units (GPGPUs)and finding their ground state are developed.In addition to the GPU based Ising model simulations, important combinatorialproblems can be mapped to the Ising model. In this dissertation the Ising solver is appliedto graph partitioning which can be utilized in VLSI design and many other domains as well.Specifically, solvers for the maxcut problem and the balanced min-cut partitioning problemare developed
Design of a reliability methodology: Modelling the influence of temperature on gate Oxide reliability
An Integrated Reliability Methodology (IRM) is presented that encompasses the changes that technology growth has brought with it and includes several new device degradation models. Each model is based on a physics of failure approach and includes on the effects of temperature. At all stages the models are verified experimentally on modern deep sub-micron devices. The research provides the foundations of a tool which gives the user the opportunity to make appropriate trade-offs between performance and reliability, and that can be implemented in the early stages of product development
Unreliable Silicon: Circuit through System-Level Techniques for Mitigating the Adverse Effects of Process Variation, Device Degradation and Environmental Conditions.
Designing and manufacturing integrated circuits in advanced, highly-scaled processing technologies that meet stringent specification sets is an increasingly unreliable proposition. Dimensional processing variations, time and stress dependent device degradation and potentially varying environmental conditions exacerbate deviations in performance, power and even functionality of integrated circuits. This work explores a system-level adaptive design philosophy intended to mitigate the power and performance impact of unreliable silicon devices and presents enabling circuits for SRAM variation mitigation and in-situ measurement of device degradation in 130nm and 45nm processing technologies. An adaptation of RAZOR-based DVS designed for on-chip memory power reduction and reliability lifetime improvement enables the elimination of 250 mV of voltage margin in a 1.8V design, with up to 500 mV of reduction when allowing 5% of memory operations to use multiple cycles. A novel PID-controlled dynamic reliability management (DRM) system is presented, allowing user-specified circuit lifetime to be dynamically managed via dynamic voltage and frequency scaling. Peak performance improvement of 20-35% is achievable in typical processing systems by allowing brief periods of elevated voltage operation through the real-time DRM system, while minimizing voltage during non-critical periods of operation to maximize circuit lifetime. A probabilistic analysis of oxide breakdown using the percolation model indicates the need for 1000-2000 integrated in-situ sensors to achieve oxide lifetime prediction error at or under 10%. The conclusions from the oxide analysis are used to guide the design of a series of novel on-chip reliability monitoring circuits for use in a real-time DRM system. A 130nm in-situ oxide breakdown measurement sensor presented is the first published design of an oxide-breakdown oriented circuit and is compatible with standard-cell style automatic “place and route” design styles used in the majority of application specific integrated circuit designs. Measured results show increases in gate oxide leakage of 14-35% after accelerated stress testing. A second generation design of the on-chip oxide degradation sensor is presented that reduces stress mode power consumption by 111,785X over the initial design while providing an ideal 1:1 mapping of gate leakage to output frequency in extracted simulations.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/60701/1/ekarl_1.pd
Recommended from our members
Physics-Based Electromigration Modeling and Analysis and Optimization
Long-term reliability is a major concern in modern VLSI design. Literature has shown that reliability gets worse as technology advances. It is expected that the future VLSI systems would have shorter reliability-induced lifetime comparing with previous generations. Being one of the most serious reliability effects, electromigration (EM) is a physical phenomenon of the migration of metal atoms due to the momentum exchange between atoms and the conducting electrons. It can cause wire resistance change or open circuit and result in functional failure of the circuit. Power-ground networks are the most vulnerable part to EM effect among all the interconnect wires since the current flow on this part is the largest on the chip. With new generation oftechnology node and aggressive design strategies, more accurate and efficient EM models are required. However, traditional EM approaches are very conservative and cannot meet current aggressive design strategies. Besides circuit level, EM also need to be thoroughly studied in system level due to limited power and temperature budgets among cores on chip. This research focuses on developing physical level EM model for VLSI circuits and system level EM optimization for multi-core systems in order to overcome the aforementioned problems. Specifically, for physical level, we develop two EM immortality check methods and a power grid EM check method. Firstly, a voltage based EM immortality analysis has been developed. Immortality condition in nucleation phase can be determined fast and accurately for multi-segment interconnect wires. Secondly, a saturation volume based incubation phase immortality check method has been proposed. This method can further reduce the redundancy in VLSI circuit design by immortality check in multiphase. Furthermore, both immortality check methods are integrated into a new power grid EM check methodology (EMspice) as filter for EM analysis. These filters can accelerate the simulation by filtering out immortal trees so that we only need to do simulation on fewer trees that are mortal. Coupled EM simulation considering both hydrostatic stress and electronic current/voltage in the power grid network will be applied to these mortal trees. This tool can work seamlessly with commercial synthesis flow. Besides physical level reliability models, system level reliability optimization is also discussed in this research. A deep reinforcement learning based EM optimization has been proposed for multi-core system. Both long term reliability effect (hard error) and transient soft error are considered. Energy can be optimized with all the reliability and other constraints fast and accurately compared to existing reliability management techniques. Last but not least, a scheduling based reliability optimization method for multi-core systems has been proposed. NBTI, HCI and EM are considered jointly. Lifetime of the system can be improved significantly compared to traditional methods which mainly focus on utilization
DEEP SUBMICRON CMOS VLSI CIRCUIT RELIABILITY MODELING, SIMULATION AND DESIGN
CMOS VLSI circuit reliability modeling and simulation have attracted intense research interest in the last two decades, and as a result almost all IC Design For Reliability (DFR) tools now try to incrementally simulate device wearout mechanisms in iterative ways. These DFR tools are capable of accurately characterizing the device wearout process and predicting its impact on circuit performance. Nevertheless, excessive simulation time and tedious parameter testing process often limit popularity of these tools in product design and fabrication.
This work develops a new SPICE reliability simulation method that shifts the focus of reliability analysis from device wearout to circuit functionality. A set of accelerated lifetime models and failure equivalent circuit models are proposed for the most common MOSFET intrinsic wearout mechanisms, including Hot Carrier Injection (HCI), Time Dependent Dielectric Breakdown (TDDB), and Negative Bias Temperature Instability (NBTI). The accelerated lifetime models help to identify the most degraded transistors in a circuit in terms of the device's terminal voltage and current waveforms. Then corresponding failure equivalent circuit models are incorporated into the circuit to substitute these identified transistors. Finally, SPICE simulation is performed again to check circuit functionality and analyze the impact of device wearout on circuit operation. Device wearout effects are lumped into a very limited number of failure equivalent circuit model parameters, and circuit performance degradation and functionality are determined by the magnitude of these parameters.
In this new method, it is unnecessary to perform a large number of small-step SPICE simulation iterations. Therefore, simulation time is obviously shortened in comparison to other tools. In addition, a reduced set of failure equivalent circuit model parameters, rather than a large number of device SPICE model parameters, need to be accurately characterized at each interim wearout process. Thus device testing and parameter extraction work are also significantly simplified. These advantages will allow circuit designers to perform quick and efficient circuit reliability analyses and to develop practical guidelines for reliable electronic designs
Investigation of Gallium Nitride Transistor Reliability through Accelerated Life Testing and Modeling
Gallium nitride high electron mobility transistors are attractive to the DoD for their ability to operate at high frequencies, voltages, temperatures, and power. Yet, there are concerns about the reliability of these devices. Various degradation mechanisms and their causes are proposed in the literature. A variety of reliability tests were conducted to understand these mechanisms and causes. A multi-stressor experiment revealed different failure mechanisms than are in the literature. In particular, the devices tested at high voltage in the OFF state did not degrade significantly as suggested by others\u27 reports. The validity of temperature-accelerated life testing when applied to GaN HEMT lifetime assessments is questioned. Temperature alone could not explain the differences in observed degradation. The tested devices showed excellent robustness to high forward gate stress, exhibiting only a slight change in gate diode characteristic, little decrease in maximum drain current, and a persisting breakdown voltage exceeding 200 V. The time-dependence of degradation was analyzed, and results of continuous- and pulsed-direct current stressing were compared
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems
- …