13 research outputs found
Computing system reliability modeling, analysis, and optimization
Ph.D. (Doctor of Philosophy)
Recommended from our members
Accelerating Electromigration Aging: Fast Failure Detection for Nanometer ICs
For practical testing and detection of electromigration (EM) induced failures in dual damascene copper interconnects, one critical issue is creating stress conditions that cause the chip to fail exclusively under EM in a very short period of time, so that EM sign-off and validation can be carried out efficiently. Existing acceleration techniques, which rely on increasing temperature and current density beyond the known limits, also accelerate other reliability effects, making it very difficult, if not impossible, to test EM in isolation. In this article, we propose novel EM wear-out acceleration techniques to address this issue. First, we show that multi-segment interconnects with reservoir and sink structures can be exploited to significantly speed up the EM wear-out process. Based on this observation, we propose three strategies to accelerate EM-induced failure: reservoir-enhanced acceleration, sink-enhanced acceleration, and a hybrid method that combines both reservoir and sink structures. We then propose several configurable interconnect structures that exploit atomic reservoirs and sinks for accelerated EM testing. These structures are very flexible and can achieve significant lifetime reductions at the cost of some routing resources. Using the proposed technique, EM testing can be carried out at nominal current densities and at a much lower temperature than with traditional testing methods. This is the most significant contribution of this work since, to our knowledge, it is the only method that allows EM testing to be performed in a controlled environment without the risk of invoking other reliability effects that are also accelerated by elevated temperature and current density. Simulation results show that the proposed method can reduce the EM lifetime of a chip from 10 years to a few hours (10^5X acceleration) under the 150 °C temperature limit, which is sufficient for practical EM testing of typical nanometer CMOS ICs.
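As a rough sanity check of the lifetime figures quoted in this abstract, the sketch below converts a nominal lifetime in years to hours and divides it by an acceleration factor. The function name and the 8,760 hours-per-year convention are assumptions made for illustration; they are not taken from the article.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 h, ignoring leap years (an assumption)


def accelerated_lifetime_hours(nominal_years: float, acceleration: float) -> float:
    """Return the stressed lifetime in hours for a given acceleration factor."""
    if acceleration <= 0:
        raise ValueError("acceleration factor must be positive")
    return nominal_years * HOURS_PER_YEAR / acceleration


if __name__ == "__main__":
    # Compare two orders of magnitude of acceleration for a 10-year lifetime.
    for factor in (1e4, 1e5):
        hours = accelerated_lifetime_hours(10, factor)
        print(f"{factor:.0e}X acceleration -> {hours:.2f} h")
```

Under this convention, 10 years is 87,600 hours, so acceleration factors between 10^4 and 10^5 bring the lifetime down to somewhere between roughly nine hours and under one hour.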
Simulation for Reliability, Hardware Security, and Ising Computing in VLSI Chip Design
The continued scaling of VLSI circuits has provided a wealth of opportunities and challenges to the VLSI circuit design area. Both these challenges and opportunities, however, require new simulation tools that can enable their solution or exploitation, as classical methods typically dealt with problem domains of smaller scale or less complexity. In this dissertation, simulation methods are presented to address the emerging VLSI design topics of electromigration-induced aging and Ising computing, and are then applied to the application areas of hardware security and graph partitioning, respectively.
The electromigration (EM) aging effect in VLSI circuits is a long-term reliability issue affecting current-carrying metal wires, leading to IR drop degradation. Typically, simple analytical equations can determine a wire's effective age or whether it will be affected by the EM aging effect at all. However, these classical methods are overly conservative and can lead to overdesign or unnecessary design iterations. Furthermore, the EM aging effect is expected to become more severe in future integrated circuits (ICs) due to increasing current densities and the prevalence of polycrystalline copper atom structures seen at small wire dimensions. For this reason, more comprehensive simulation techniques that can efficiently simulate the EM effect with less conservative results can help mitigate overdesign and increase design margins while reducing design iterations.
The area of hardware security is becoming increasingly important as the chip supply chain becomes more globalized and the integrity of chips becomes more difficult to verify. Using the accurate simulation techniques for EM, we can use this reliability effect to demonstrate how a reliability-based attack could be perpetrated. Furthermore, we can use this aging effect as a defense mechanism to help validate the integrity of an IC and detect counterfeit chips in the component supply chain market.
Ising computing is an emerging method of solving combinatorial optimization problems by simulating the interactions of so-called spin glasses. Borrowing concepts from quantum computing, this method mimics the quantum interaction between spin glasses in such a way that finding a ground state of these spin glass models leads to the solution of a particular problem. In this dissertation, effective methods of simulating the spin glass interactions using General Purpose Graphics Processing Units (GPGPUs) and finding their ground state are developed.
In addition to the GPU-based Ising model simulations, important combinatorial problems can be mapped to the Ising model. In this dissertation, the Ising solver is applied to graph partitioning, which can be used in VLSI design and many other domains as well. Specifically, solvers for the max-cut problem and the balanced min-cut partitioning problem are developed.
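To make the mapping from max-cut to an Ising ground-state search concrete, here is a minimal CPU-only sketch using plain Metropolis simulated annealing. It is not the dissertation's GPU solver; the function names, cooling schedule and parameter defaults are assumptions chosen for illustration. Each vertex carries a spin of +1 or -1, and the energy H(s) = sum over edges of w_uv * s_u * s_v is minimized, which maximizes the cut because cut = sum of w_uv * (1 - s_u * s_v) / 2.

```python
import math
import random


def cut_value(edges, spins):
    """Total weight of edges whose endpoints carry opposite spins."""
    return sum(w for u, v, w in edges if spins[u] != spins[v])


def anneal_maxcut(n, edges, sweeps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Metropolis annealing on H(s) = sum_{(u,v)} w_uv * s_u * s_v.

    Returns the best spin assignment seen and its cut value.
    """
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n)]
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    best, best_cut = spins[:], cut_value(edges, spins)
    for sweep in range(sweeps):
        # Geometric cooling schedule from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (sweep / max(1, sweeps - 1))
        for i in range(n):
            local_field = sum(w * spins[j] for j, w in neighbors[i])
            delta = -2.0 * spins[i] * local_field  # energy change if spin i flips
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                spins[i] = -spins[i]
        current = cut_value(edges, spins)
        if current > best_cut:
            best, best_cut = spins[:], current
    return best, best_cut


if __name__ == "__main__":
    # A 4-cycle with unit weights (hypothetical test graph).
    cycle = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
    assignment, value = anneal_maxcut(4, cycle)
    print(assignment, value)
```

The two spin values partition the vertices into the two sides of the cut. A balanced min-cut formulation would additionally need a penalty term on the sum of the spins, which this sketch omits.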
Physics-Based Electromigration Modeling and Analysis and Optimization
Long-term reliability is a major concern in modern VLSI design. The literature has shown that reliability gets worse as technology advances, and future VLSI systems are expected to have shorter reliability-limited lifetimes than previous generations. One of the most serious reliability effects, electromigration (EM), is the migration of metal atoms due to momentum exchange between the atoms and the conducting electrons. It can cause wire resistance change or an open circuit and result in functional failure of the circuit. Power-ground networks are the interconnects most vulnerable to EM, since they carry the largest currents on the chip. With new technology nodes and aggressive design strategies, more accurate and efficient EM models are required. However, traditional EM approaches are very conservative and cannot meet current aggressive design strategies. Besides the circuit level, EM also needs to be thoroughly studied at the system level due to limited power and temperature budgets among cores on a chip. This research focuses on developing physical-level EM models for VLSI circuits and system-level EM optimization for multi-core systems in order to overcome these problems. Specifically, at the physical level, we develop two EM immortality check methods and a power grid EM check method. First, a voltage-based EM immortality analysis has been developed, with which the immortality condition in the nucleation phase can be determined quickly and accurately for multi-segment interconnect wires. Second, a saturation-volume-based immortality check method for the incubation phase has been proposed. This method can further reduce the redundancy in VLSI circuit design by checking immortality in multiple phases. Furthermore, both immortality check methods are integrated into a new power grid EM check methodology (EMspice) as filters for EM analysis. These filters accelerate the simulation by filtering out immortal trees, so that only the remaining mortal trees need to be simulated. Coupled EM simulation considering both hydrostatic stress and electronic current/voltage in the power grid network is then applied to these mortal trees. This tool can work seamlessly with a commercial synthesis flow. Besides physical-level reliability models, system-level reliability optimization is also discussed in this research. A deep reinforcement learning based EM optimization has been proposed for multi-core systems. Both long-term reliability effects (hard errors) and transient soft errors are considered. Compared to existing reliability management techniques, energy can be optimized quickly and accurately under all the reliability and other constraints. Last but not least, a scheduling-based reliability optimization method for multi-core systems has been proposed, in which NBTI, HCI and EM are considered jointly. The lifetime of the system can be improved significantly compared to traditional methods, which mainly focus on utilization.
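The filtering idea described in this abstract, where cheap immortality checks remove trees before the expensive coupled simulation, can be sketched as a simple partition step. The code below is only an illustration of that pipeline shape. The `Segment` and `Tree` types, the function names and the threshold value are hypothetical, and the predicate shown is the classical per-segment Blech product test (j * L compared against a critical value), used purely as a stand-in. It is not the voltage-based or saturation-volume-based check of the dissertation, and it is not the EMspice API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Segment:
    length_m: float          # segment length in metres
    current_density: float   # current density magnitude in A/m^2


@dataclass(frozen=True)
class Tree:
    name: str
    segments: Tuple[Segment, ...]


def blech_immortal(tree: Tree, critical_product: float) -> bool:
    """Stand-in check: every segment satisfies j * L <= critical_product (A/m)."""
    return all(
        seg.current_density * seg.length_m <= critical_product
        for seg in tree.segments
    )


def split_trees(
    trees: Iterable[Tree], is_immortal: Callable[[Tree], bool]
) -> Tuple[List[Tree], List[Tree]]:
    """Partition trees into (mortal, immortal); only mortal ones need simulation."""
    mortal: List[Tree] = []
    immortal: List[Tree] = []
    for tree in trees:
        (immortal if is_immortal(tree) else mortal).append(tree)
    return mortal, immortal


if __name__ == "__main__":
    CRITICAL_PRODUCT = 3e5  # A/m, a hypothetical threshold for illustration only
    trees = [
        Tree("short", (Segment(10e-6, 1e10),)),   # j*L is about 1e5 A/m
        Tree("long", (Segment(100e-6, 1e10),)),   # j*L is about 1e6 A/m
    ]
    mortal, immortal = split_trees(
        trees, lambda t: blech_immortal(t, CRITICAL_PRODUCT)
    )
    print([t.name for t in mortal], [t.name for t in immortal])
```

Passing the predicate in as a callable keeps the partition step independent of the check, so a more accurate multi-segment criterion could be substituted without touching `split_trees`.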
RESEARCH OF EXPLOITABLE RELIABILITY AND MAINTAINABILITY OF HIGH-SPEED MARINE ENGINES
This doctoral dissertation presents the results of research on the operational reliability and maintainability (repairability) of high-speed radial marine diesel engines. To apply general reliability theory to engines of the type "Zvezda" M 504 B2, which are the most important part of the propulsion subsystem of the Croatian Navy (HRM) missile boats RTOP 11 and RTOP 12, a large body of data taken from practice was statistically processed and scientifically analysed, with the aim of improving the overall reliability of the whole fleet.
The analysis proposes a reliability model of the missile boat that shows all of its subsystems. Practice has shown that the propulsion subsystem has the greatest impact on the overall reliability of these ships, and therefore on their combat readiness. The main engine was therefore selected as the research subject and broken down into subsystems on the basis of operational data, in order to determine the reliability of each individual subsystem.
The aim of the research was to determine the reliability function, the failure intensity function and the maintainability function of the engine under the demanding operational conditions of the Adriatic Sea, and to compare the results with those obtained by other authors in order to identify trends and assess scientific relevance.
It was shown that the Weibull distribution best approximates the operational reliability characteristic, and that the expected failure-free operating time can be well approximated by the mean time between failures.
It was likewise found that the normal distribution best approximates the empirical maintainability (repairability) function, and that the mean maintenance time corresponds to the mean engine repair time by subsystem.
The research results can be applied to other similar marine engines as well as to diesel engines on other platforms. The dissertation connects the research results with the scientific literature by relating theoretical foundations to direct practical application, with the aim of increasing operational availability. The immediate usefulness lies in predicting failures of vital subsystems and parts of diesel engines, in allocating the necessary overhaul capacity, financial resources and spare parts procurement, and in improving the quality of training and qualification of operating and maintenance personnel. The importance of high-quality record keeping for improving the proposed model is also emphasized.
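The three functions named in this abstract have standard closed forms once a Weibull failure model and a normal repair-time model are assumed. The sketch below writes them out. The parameter values are hypothetical and are not taken from the dissertation, and the function names are chosen only for illustration.

```python
import math


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """R(t) = exp(-(t/scale)^shape): probability of no failure up to time t."""
    return math.exp(-((t / scale) ** shape))


def weibull_failure_rate(t: float, shape: float, scale: float) -> float:
    """h(t) = (shape/scale) * (t/scale)^(shape-1), defined here for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_mean_life(shape: float, scale: float) -> float:
    """Mean of the Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def normal_maintainability(t: float, mean: float, std: float) -> float:
    """M(t): probability that a repair is finished within time t (normal CDF)."""
    return 0.5 * (1.0 + math.erf((t - mean) / (std * math.sqrt(2.0))))


if __name__ == "__main__":
    shape, scale = 1.5, 800.0        # hypothetical Weibull parameters (hours)
    repair_mean, repair_std = 6.0, 2.0  # hypothetical repair-time parameters (hours)
    print("R(500 h)      =", weibull_reliability(500.0, shape, scale))
    print("h(500 h)      =", weibull_failure_rate(500.0, shape, scale))
    print("mean life (h) =", weibull_mean_life(shape, scale))
    print("M(8 h)        =", normal_maintainability(8.0, repair_mean, repair_std))
```

With a shape parameter of 1 the Weibull model reduces to the exponential case, where the failure rate is constant and the mean life equals the scale parameter. For a normal repair-time model, the maintainability function reaches 0.5 at the mean repair time.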
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. It introduces the most prominent reliability concerns from today's point of view and briefly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book deals with reliability challenges across different levels, from the physical level all the way to the system level (cross-layer approaches). The book aims to demonstrate how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. The book provides readers with the latest insights into novel, cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; and explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.