
    Layout regularity metric as a fast indicator of process variations

    Integrated circuit design faces increasing challenges as technology scales down, owing to growing sensitivity to process variations. Systematic variations induced by different steps of the lithography process affect both the parametric and functional yields of designs. These variations are, in turn, known to be affected by layout topology. Design for Manufacturability (DFM) aims at defining techniques that mitigate variations and improve yield. Layout regularity is one of the trending techniques suggested by DFM to mitigate the effect of process variations. Several solutions exist for creating regular designs, such as restricted design rules and regular fabrics. These regular solutions raised the need for a regularity metric. Metrics in the literature are insufficient for different reasons: they are either qualitative or computationally intensive. Furthermore, no study relates either lithographic or electrical variations to layout regularity. In this work, layout regularity is studied in detail and a new geometry-based layout regularity metric is derived. The metric is verified against lithographic simulations and shows good correlation. Calculating the metric takes only a few minutes on a 1 mm x 1 mm design, which is fast compared to the time taken by simulations. This makes it a good candidate for pre-processing the layout data and selecting certain areas of interest for lithographic simulation, for faster throughput. The layout regularity metric is also compared against a model that measures electrical variations due to systematic lithographic variations. The validity of using the regularity metric to flag circuits with high variability, using the developed electrical variations model, is shown. The regularity metric results match the electrical variability model results in up to 80% of cases, which means that the metric can be used as a fast indicator of designs more susceptible to lithographic, and hence electrical, variations.
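    The abstract does not spell out how the geometry-based metric is computed. As a minimal sketch of the general idea, assuming a regularity score defined as the fraction of layout edges that fall on a uniform pitch grid (the function name, 5% tolerance, and pitch value below are illustrative, not the thesis's actual formulation):

```python
import numpy as np

def regularity_score(edge_positions, grid_pitch):
    # Fraction of edges whose position lies within 5% of the pitch
    # from the nearest line of a uniform grid (toy definition).
    edges = np.asarray(edge_positions, dtype=float)
    offsets = np.abs((edges + grid_pitch / 2) % grid_pitch - grid_pitch / 2)
    return float(np.mean(offsets <= 0.05 * grid_pitch))

# Example: a mostly regular layer with one off-grid edge at 240 nm.
x_edges = [0.0, 64.0, 128.0, 192.0, 240.0]         # edge x-coordinates, nm
print(regularity_score(x_edges, grid_pitch=64.0))  # -> 0.8
```

    A score near 1 would indicate a highly regular layer; in the pre-processing flow the abstract describes, low-scoring regions would be the candidates flagged for full lithographic simulation.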

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. The book introduces the most prominent reliability concerns from today's point of view and briefly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book addresses reliability challenges across different levels, from the physical level all the way up to the system level (cross-layer approaches). It aims to demonstrate how new hardware/software co-design solutions can effectively mitigate reliability degradation such as transistor aging, process variation, temperature effects, and soft errors. Provides readers with the latest insights into novel, cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.

    Reliable Design of Three-Dimensional Integrated Circuits


    Algorithms and methodologies for interconnect reliability analysis of integrated circuits

    The phenomenal progress of computing devices has been made possible largely by the semiconductor industry's sustained efforts in innovating techniques for extremely large-scale integration. Indeed, today's giant integrated circuits contain multi-billion interconnects that enable the transistors to talk to each other, all in a space of a few mm2. Such aggressively downscaled components (transistors and interconnects) silently suffer from increasing electric fields and from impurities/defects introduced during manufacturing. Compounded by gigahertz switching, the challenges of reliability and design integrity remain very much alive for chip designers, with electromigration (EM) being the foremost interconnect reliability challenge. Traditionally, EM containment revolves around EM guidelines, generated at the single-component level, whose non-compliance means that the component fails. Failure usually refers to deformation due to EM, manifested as a resistance increase, which is unacceptable from a circuit performance point of view. Subsequent aspects deal with correct-by-construction design of the chip, followed by sign-off verification of EM reliability. Interestingly, chip designs today have reached a dilemma point: a reduced margin between the actual and reliably allowed current densities, versus comparatively scarce system failures. Consequently, this research is focused on improved algorithms and methodologies for interconnect reliability analysis, enabling accurate and design-specific interpretation of EM events. In the first part, we present a new methodology for logic-IP (cell) internal EM verification: an inadequately attended area in the literature. Our SPICE-correlated model helps in evaluating the cell lifetime under any arbitrary reliability specification, without generating additional data, unlike the traditional approaches. The model is apt for today's fabless ecosystem, where there is a) increasing reuse of standard cells optimized for one market condition in another (e.g., wireless to automotive), as well as b) increasing third-party content on the chip requiring a rigorous sign-off. We present results from a 28nm production setup, demonstrating significant relaxation of violations and the flexibility to allow runtime-level reliability retargeting. Subsequently, we focus on an important aspect: connecting individual component-level failures to system failure. We note that existing EM methodologies are based on a serial reliability assumption, which deems the entire system to fail as soon as the first component in the system fails. With a highly redundant circuit topology, that of a clock grid, in perspective, we present algorithms for EM assessment that allow us to incorporate and quantify the benefit of system redundancies. With the skew metric of the clock grid as a failure criterion, we demonstrate that unless such incorporations are made, chip lifetimes are underestimated by over 2x. This component-to-system reliability bridge is further extended through an extreme-order-statistics-based approach, wherein we demonstrate that system failures can be approximated by an asymptotic kth-component failure model, otherwise requiring costly Monte Carlo simulations. Using such an approach, we can efficiently predict a system-criterion-based time to failure within existing EDA frameworks. The last part of the research is related to incorporating the impact of global/local process variation on current densities, as well as fundamental physical factors, on EM.
Through a Hermite polynomial chaos based approach, we arrive at novel variation-aware current density models, which demonstrate significant margins (> 30%) in EM lifetime when compared with the traditional worst-case approach. The above research problems have been motivated by the author's decade-long work experience dealing with reliability issues in industrial SoCs, first at Texas Instruments and later at Qualcomm.
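    The abstract's contrast between the serial assumption and the kth-component failure criterion can be illustrated with a small Monte Carlo experiment. This is a sketch under assumed lognormal component lifetimes and arbitrary parameters (median, sigma, k), not the thesis's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# EM component lifetimes are commonly modeled as lognormal; the median
# (years) and sigma here are arbitrary illustration values.
def sample_lifetimes(n_components, n_trials, median=10.0, sigma=0.5):
    return rng.lognormal(np.log(median), sigma, size=(n_trials, n_components))

def system_ttf(lifetimes, k=1):
    # k = 1 is the serial assumption (first component failure kills the
    # system); k > 1 is a kth-component failure criterion crediting
    # redundancy, in the spirit of the abstract.
    return np.sort(lifetimes, axis=1)[:, k - 1]

life = sample_lifetimes(n_components=1000, n_trials=20000)
serial = np.median(system_ttf(life, k=1))
redundant = np.median(system_ttf(life, k=20))  # k = 20 chosen arbitrarily
print(f"serial assumption: {serial:.2f} y, 20th-failure criterion: "
      f"{redundant:.2f} y ({redundant / serial:.2f}x)")
```

    Under the serial assumption the median system lifetime is set by the single weakest of the 1000 components; a redundancy-aware criterion yields a substantially longer estimate, in the spirit of the over-2x underestimation the abstract reports.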

    Quantifiable Assurance: From IPs to Platforms

    Hardware vulnerabilities are generally considered more difficult to fix than software ones because they persist after fabrication. Thus, it is crucial to assess security and fix vulnerabilities at the earlier design phases, such as the Register Transfer Level (RTL) and gate level. The focus of existing security assessment techniques is mainly twofold. First, they check the security of Intellectual Property (IP) blocks separately. Second, they aim to assess security against individual threats, assuming the threats are orthogonal. We argue that IP-level security assessment is not sufficient. Eventually, the IPs are placed in a platform, such as a system-on-chip (SoC), where each IP is surrounded by other IPs connected through glue logic and shared/private buses. Hence, we must develop a methodology to assess platform-level security by considering both the IP-level security and the impact of the additional parameters introduced during platform integration. Another important factor to consider is that the threats are not always orthogonal: improving security against one threat may affect the security against others. Hence, to build a secure platform, we must first answer the following questions: What additional parameters are introduced during platform integration? How do we define and characterize the impact of these parameters on security? How do the mitigation techniques for one threat impact others? This paper aims to answer these important questions and proposes techniques for quantifiable assurance by quantitatively estimating and measuring the security of a platform at the pre-silicon stages. We also touch upon the term security optimization and present the challenges for future research directions.
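    The abstract poses the quantification questions without giving the estimation method. Purely as a toy illustration of the idea that integration parameters modify IP-level security, one could imagine scores combined as below (all names, parameters, and weights are hypothetical, not the paper's technique):

```python
from dataclasses import dataclass

@dataclass
class IPSecurity:
    name: str
    ip_level_score: float  # 0 (weak) .. 1 (strong), assessed in isolation

# Hypothetical integration parameters, each a multiplicative factor on
# the IP-level score once the IP is placed in the SoC.
INTEGRATION_FACTORS = {
    "shared_bus": 0.85,        # shared interconnect widens the attack surface
    "debug_access": 0.70,      # platform debug port can reach this IP
    "isolated_wrapper": 1.10,  # glue logic adds an access-control wrapper
}

def platform_score(ip, params):
    # Toy platform-level score: IP-level score scaled by the applicable
    # integration factors, capped at 1.0.
    score = ip.ip_level_score
    for p in params:
        score *= INTEGRATION_FACTORS[p]
    return min(score, 1.0)

crypto = IPSecurity("aes_core", ip_level_score=0.9)
print(platform_score(crypto, ["shared_bus", "debug_access"]))  # 0.5355
```

    The point of the sketch is only that an IP judged secure in isolation can score differently once platform-integration parameters are accounted for, which is the gap the paper argues IP-level assessment leaves open.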

    A novel deep submicron bulk planar sizing strategy for low energy subthreshold standard cell libraries

    The author thanks the Engineering and Physical Sciences Research Council (EPSRC) and Arm Ltd for providing funding in the form of grants and studentships. This work investigates bulk planar deep-submicron semiconductor physics in an attempt to improve standard cell libraries aimed at operation in the subthreshold regime and in ultra-wide dynamic voltage scaling schemes. The current state of research in the field is examined, with particular emphasis on how subthreshold physical effects degrade robustness, variability and performance. How prevalent these physical effects are in a commercial 65nm library is then investigated by extensive modeling of a BSIM4.5 compact model. Three distinct sizing strategies emerge; cells of each strategy are laid out, and post-layout parasitically extracted models are simulated to determine the advantages and disadvantages of each. Full-custom ring oscillators are designed and manufactured. Measured results reveal a close correlation with the simulated results, with frequency improvements of up to 2.75X/2.43X observed for RVT/LVT devices respectively. The experiment provides the first silicon evidence of the improvement capability of the Inverse Narrow Width Effect over a wide supply voltage range, as well as of a mechanism of additional temperature stability in the subthreshold regime. A novel sizing strategy is proposed and pursued to determine whether it can produce a superior complex circuit design using a commercial digital synthesis flow. Two 128-bit AES cores are synthesized from the novel sizing strategy and compared against a third AES core synthesized from a state-of-the-art subthreshold standard cell library used by ARM. Results show improvements in energy-per-cycle of up to 27.3% and frequency improvements of up to 10.25X. The novel subthreshold sizing strategy proves superior over a temperature range of 0 °C to 85 °C, with a nominal (20 °C) improvement in energy-per-cycle of 24% and a frequency improvement of 8.65X. A comparison to prior art is then performed. Valid cases are presented where the proposed sizing strategy would be a candidate for producing superior subthreshold circuits.
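    Sizing matters so much below threshold because the drain current there depends exponentially on gate and threshold voltage while still scaling linearly with W/L. A short sketch of the textbook subthreshold current model (the constants are chosen for illustration, not taken from the thesis's library):

```python
import math

def subthreshold_current(w_over_l, vgs, vds, vth=0.45, n=1.4,
                         i0=1e-7, temp_k=293.15):
    # Textbook model: Id = I0*(W/L)*exp((Vgs-Vth)/(n*Vt))*(1-exp(-Vds/Vt)).
    # Vth, n, I0 are illustrative placeholder values.
    vt = 1.380649e-23 * temp_k / 1.602176634e-19  # thermal voltage, ~25 mV
    return (i0 * w_over_l * math.exp((vgs - vth) / (n * vt))
            * (1.0 - math.exp(-vds / vt)))

# Doubling W/L doubles the current (and roughly the speed) at fixed Vdd,
# while the exponential terms show the acute Vth/temperature sensitivity:
print(subthreshold_current(2.0, 0.30, 0.30) /
      subthreshold_current(1.0, 0.30, 0.30))  # -> 2.0
```

    The exponential temperature dependence of the thermal voltage in this model is also why the temperature stability observed in the measurements is a notable result.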

    Estudo da eletromigração em circuitos integrados na fase de projeto

    Advisors: Roberto Lacerda de Orio, Leandro Tiago Manera. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Electromigration damage in interconnects is a well-known bottleneck in integrated circuits, as it causes reliability problems. Operation at high temperatures and current densities accelerates the damage, increasing the interconnect resistance and therefore reducing the circuit lifetime. This issue has been accentuated by technology downscaling. To guarantee interconnect reliability and, as a consequence, integrated circuit reliability, traditional methods based on the so-called Blech effect and on a maximum allowed current density are applied during interconnect design. These methods, however, do not take into account the impact of electromigration on circuit performance. In this work the traditional approach is extended and a method to evaluate the effect of electromigration on integrated circuit performance is developed. The method is implemented in a tool which identifies the critical interconnect lines of an integrated circuit and suggests proper interconnect widths based on different criteria to mitigate electromigration damage and increase reliability. In addition, the variation of the circuit's performance parameters as the interconnect resistance changes is determined. The tool is incorporated into the integrated circuit design flow and uses data from design kits and reports directly available from the design environment. An accurate analysis of the temperature distribution in the interconnect structure is essential for a better assessment of interconnect reliability. Therefore, a model to compute the temperature at each metallization level of the interconnect structure is implemented. The temperature distribution across the metallization layers of different technologies is investigated. It is shown that the temperature in Metal 1 of the Intel 10 nm technology can increase by 75 K, 12 K higher than in Metal 2. As expected, the layers closer to the transistors undergo a more significant temperature increase. The tool is applied to evaluate the electromigration robustness of the interconnects of different circuits, namely a ring oscillator, a bandgap voltage reference circuit, and an operational amplifier. The operational amplifier, in particular, is studied thoroughly. The proposed methodology identifies critical interconnects which, when damaged by electromigration, cause large variations in the performance of the circuit. In the worst-case scenario, the cutoff frequency of the circuit varies by 65% over 5 years of operation. An interesting finding is that the proposed methodology identifies critical interconnects that would not be identified by the traditional criteria. These interconnects carry current densities below the limit recommended by the design rules; nevertheless, one of them leads to a 30% variation in the gain of the operational amplifier. In summary, the proposed tool verified that of the 20% of paths with a critical current density, only 3% significantly degrade the circuit performance. This work brings the study of interconnect and integrated circuit reliability to the design phase, which allows a circuit's performance degradation to be assessed early in its development.
    The developed tool allows the designer to identify critical interconnects that would not be detected using the maximum current density criterion, leading to a broader and more accurate analysis of the robustness of integrated circuits.
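    The traditional screens the abstract extends can be sketched as two checks per wire: a maximum current density rule and the Blech (j x L) immortality criterion. The limits below are illustrative placeholders, not PDK values:

```python
# Illustrative placeholder limits; real values come from the PDK.
J_MAX = 2.0e10           # maximum allowed current density, A/m^2
BLECH_CRIT = 3.0e5       # critical j*L product, A/m

def check_interconnect(current_a, width_m, thickness_m, length_m):
    # Toy version of the traditional EM screens: flag a wire that
    # violates the current density rule or whose j*L product exceeds
    # the Blech immortality threshold.
    j = current_a / (width_m * thickness_m)
    flags = []
    if j > J_MAX:
        flags.append(f"j = {j:.2e} A/m^2 exceeds J_MAX")
    if j * length_m > BLECH_CRIT:
        flags.append(f"j*L = {j * length_m:.2e} A/m exceeds Blech limit")
    return flags or ["passes traditional EM criteria"]

# A 100 nm x 100 nm wire carrying 0.1 mA passes both screens...
print(check_interconnect(1.0e-4, 100e-9, 100e-9, length_m=10e-6))
# ...yet, per the abstract, such a wire can still matter for performance.
```

    The abstract's key finding is precisely that wires passing both checks can still degrade circuit behavior (e.g., the 30% gain variation), which is why the proposed tool adds performance-based criteria on top of these screens.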

    Thermal Management for Dependable On-Chip Systems

    This thesis addresses dependability issues in on-chip systems from a thermal perspective. This includes an explanation and analysis of models showing the relationship between dependability and temperature. Additionally, multiple novel methods for on-chip thermal management are introduced, aiming to optimize thermal properties. The methods are analyzed through simulation and through infrared thermal camera measurements.
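    The abstract does not name the dependability models it analyzes; one classic example of the temperature-dependability link is Black's equation for electromigration lifetime. A minimal sketch of its Arrhenius temperature factor (the activation energy is a typical copper value and the temperatures are arbitrary, not taken from the thesis):

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def em_acceleration(t_low_k, t_high_k, ea_ev=0.9):
    # Arrhenius factor from Black's equation MTTF = A * j**-n * exp(Ea/(k*T)):
    # at fixed current density the prefactor cancels, leaving how much
    # faster EM damage accumulates at t_high_k than at t_low_k.
    return math.exp(ea_ev / K_B * (1.0 / t_low_k - 1.0 / t_high_k))

# A hotspot rise from 60 C to 85 C shortens EM lifetime by roughly:
print(f"{em_acceleration(333.15, 358.15):.1f}x")  # ~8.9x
```

    The exponential dependence on temperature is what makes even modest hotspot reductions from thermal management translate into large lifetime gains.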