
4 research outputs found

Développement d'architectures HW/SW tolérantes aux fautes et auto-calibrantes pour les technologies Intégrées 3D

Author: ANGHEL Lorena
PASCA Vladimir
Publication date: 01/01/2013

Despite the advantages of 3D integration, the test, yield and reliability of Through-Silicon-Vias (TSVs) remain among the greatest challenges for 3D systems based on Networks-on-Chip (NoCs). In this thesis, an off-line test strategy is proposed for the TSV interconnects of the inter-die links of 3D NoCs. For the TSV Interconnect Built-In Self-Test (TSV-IBIST), a new test-vector generation strategy is proposed that detects structural faults (opens and shorts) and parametric faults (delay faults). Strategies for correcting transient and permanent TSV faults are also proposed at several abstraction levels: data link and network. At the data link level, techniques based on error correction codes (ECC) and retransmission protect the vertical links. Error correction codes are also used for protection at the network level. TSV defects due to fabrication or aging are repaired at the data link level with strategies based on redundancy and serialization. In the network, faulty inter-die links are unusable, and a fault-tolerant routing algorithm is proposed. Fault-tolerance techniques can be implemented at several levels. The results showed that a multi-level strategy reaches very high reliability levels at a lower cost. Unfortunately, there is no single solution, and each strategy has its advantages and limitations. It is very difficult to evaluate the costs and the performance impact early in the design flow. Therefore, a fault-resilience exploration methodology is proposed for 3D mesh NoCs.

3D technology promises energy-efficient heterogeneous integrated systems, which may open the way to thousand-core chips. Silicon dies containing processing elements are stacked and connected by vertical wires called Through-Silicon-Vias.
In 3D chips, interconnecting an increasing number of processing elements requires a scalable high-performance interconnect solution: the 3D Network-on-Chip. Despite the advantages of 3D integration, testing, reliability and yield remain the major challenges for 3D NoC-based systems. In this thesis, the TSV interconnect test issue is addressed by an off-line Interconnect Built-In Self-Test (IBIST) strategy that detects both structural faults (i.e. opens and shorts) and parametric faults (i.e. delays, including delays due to crosstalk). The IBIST circuitry implements a novel algorithm based on the aggressor-victim scenario and alleviates limitations of existing strategies. The proposed Kth-aggressor fault (KAF) model assumes that the aggressors of a victim TSV are neighboring wires within a distance given by the aggressor order K. Using this model, TSV interconnect tests of inter-die 3D NoC links may be performed for different aggressor orders, reducing test times and circuitry complexity. In 3D NoCs, TSV permanent and transient faults can be mitigated at different abstraction levels. In this thesis, several error resilience schemes are proposed at the data link and network levels. For transient faults, 3D NoC links can be protected using error correction codes (ECC) and retransmission schemes that use error detection (Automatic Retransmission Query) and correction codes (i.e. Hybrid error correction and retransmission). For transients along a source-destination path, ECC codes can be implemented at network level (i.e. Network-level Forward Error Correction). Data link solutions also include TSV repair schemes for defects due to fabrication processes (i.e. TSV-Spare-and-Replace and Configurable Serial Links) and to aging (i.e. Interconnect Built-In Self-Repair and Adaptive Serialization). At network level, the faulty inter-die links of 3D mesh NoCs are repaired by implementing a TSV fault-tolerant routing algorithm.
Although single-level solutions can achieve the desired yield and reliability targets, error mitigation can also be realized by combining approaches at several abstraction levels. To this end, multi-level error resilience strategies have been proposed. Experimental results show that there are cases where this multi-level strategy pays off in terms of both cost and performance. Unfortunately, no one-size-fits-all solution exists, as each strategy has its advantages and limitations. For system designers, it is very difficult to assess the costs and the performance impact of error resilience early in the design stages. Therefore, an error resilience exploration (ERX) methodology is proposed for 3D NoCs.
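The Kth-aggressor idea in the abstract above can be illustrated with a small sketch. The abstract says only that aggressors are neighboring wires within a distance given by the aggressor order K. The regular grid layout, the use of Chebyshev distance in grid units, and the function name `aggressors` are all assumptions made here for illustration, not details taken from the thesis.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def aggressors(victim: Coord, rows: int, cols: int, k: int) -> List[Coord]:
    """Return the grid positions treated as aggressors of one victim TSV.

    Assumptions (not from the source): TSVs form a rows x cols array, and
    the aggressor order k bounds the Chebyshev distance to the victim.
    """
    vr, vc = victim
    result = []
    # Scan the (2k+1) x (2k+1) window around the victim, clipped to the array.
    for r in range(max(0, vr - k), min(rows, vr + k + 1)):
        for c in range(max(0, vc - k), min(cols, vc + k + 1)):
            if (r, c) != victim:
                result.append((r, c))
    return result
```

Under these assumptions, a smaller K yields fewer aggressors per victim, which matches the abstract's point that the aggressor order can be chosen to trade coverage against test time and circuitry complexity.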

OpenGrey Repository

Design-for-Test and Test Optimization Techniques for TSV-based 3D Stacked ICs

Author: Noia Brandon Robert
Publication date: 01/01/2014

As integrated circuits (ICs) continue to scale to smaller dimensions, long interconnects have become the dominant contributor to circuit delay and a significant component of power consumption. In order to reduce the length of these interconnects, 3D integration and 3D stacked ICs (3D SICs) are active areas of research in both academia and industry. 3D SICs not only have the potential to reduce average interconnect length and alleviate many of the problems caused by long global interconnects, but they can offer greater design flexibility over 2D ICs, significant reductions in power consumption and footprint in an era of mobile applications, increased on-chip data bandwidth through delay reduction, and improved heterogeneous integration. Compared to 2D ICs, the manufacture and test of 3D ICs is significantly more complex. Through-silicon vias (TSVs), which constitute the dense vertical interconnects in a die stack, are a source of additional and unique defects not seen before in ICs. At the same time, testing these TSVs, especially before die stacking, is recognized as a major challenge. The testing of a 3D stack is constrained by limited test access, test pin availability, power, and thermal constraints. Therefore, efficient and optimized test architectures are needed to ensure that pre-bond, partial, and complete stack testing are not prohibitively expensive. Methods of testing TSVs prior to bonding continue to be a difficult problem due to test access and testability issues. Although some built-in self-test (BIST) techniques have been proposed, these techniques have numerous drawbacks that render them impractical. In this dissertation, a low-cost test architecture is introduced to enable pre-bond TSV test through TSV probing. This has the benefit of not needing large analog test components on the die, which is a significant drawback of many BIST architectures.
Coupled with an optimization method described in this dissertation to create parallel test groups for TSVs, test time for pre-bond TSV tests can be significantly reduced. The pre-bond probing methodology is expanded upon to allow for pre-bond scan test as well, to enable both pre-bond TSV and structural test to bring pre-bond known-good-die (KGD) test under a single test paradigm. The addition of boundary registers on functional TSV paths required for pre-bond probing results in an increase in delay on inter-die functional paths. This cost of test architecture insertion can be a significant drawback, especially considering that one benefit of 3D integration is that critical paths can be partitioned between dies to reduce their delay. This dissertation derives a retiming flow that is used to recover the additional delay added to TSV paths by test cell insertion. Reducing the cost of test for 3D-SICs is crucial considering that more tests are necessary during 3D-SIC manufacturing. To reduce test cost, the test architecture and test scheduling for the stack must be optimized to reduce test time across all necessary test insertions. This dissertation examines three paradigms for 3D integration - hard dies, firm dies, and soft dies, that give varying degrees of control over 2D test architectures on each die while optimizing the 3D test architecture. Integer linear programming models are developed to provide an optimal 3D test architecture and test schedule for the dies in the 3D stack considering any or all post-bond test insertions. Results show that the ILP models outperform other optimization methods across a range of 3D benchmark circuits. In summary, this dissertation targets testing and design-for-test (DFT) of 3D SICs. The proposed techniques enable pre-bond TSV and structural test while maintaining a relatively low test cost. Future work will continue to enable testing of 3D SICs to move industry closer to realizing the true potential of 3D integration.

DukeSpace

Conception d'un micro-réseau intégré NOC tolérant les fautes multiples statiques et dynamiques

Author: Gang Yi
Publication venue: HAL CCSD
Publication date: 05/11/2015

The quest for higher performance and lower power consumption has driven the microelectronics industry towards aggressive technology scaling and multicore chip designs. In this many-core era, Networks-on-Chip (NoCs) have become the most promising solution for on-chip communication because their performance scales with the number of IPs integrated in the chip. Fault tolerance becomes mandatory as CMOS technology continues to shrink. Yield and reliability are increasingly affected by factors such as manufacturing defects, process variations, environmental variations, cosmic radiation, and so on. As a result, designs should be able to provide full functionality (e.g. in critical systems), or at least allow a degraded mode, in a context of high failure rates. To accomplish this, systems should be able to adapt to manufacturing and runtime failures. In this thesis, techniques are proposed to improve the fault tolerance of NoC-based circuits working in harsh environments. As previous works handle one type of fault at a time, we propose here a solution where different kinds of faults can be tolerated concurrently. Considering constraints such as area and power consumption, a fault-tolerant adaptive routing algorithm was proposed, which can cope with transient, intermittent and permanent faults. Combined with some existing techniques, like flit retransmission and packet fragmentation, this approach tolerates numerous static and dynamic faults. Simulation results show that the proposed solution achieves a high packet delivery success rate: for a 16x16 2D mesh NoC, 97.68% in the presence of 384 simultaneous link faults, and 93.40% in the presence of 103 simultaneous router faults. This success rate is even higher when the algorithm is extended to NoCs with a torus topology. Another contribution of this thesis is the inclusion of a congestion management function in the proposed routing algorithm.
For this purpose, we introduce a novel congestion metric named Flit Remain. The experimental results show that using this new congestion metric reduces the average latency of the Network-on-Chip by 2.5% to 16.1% compared to the existing metrics. The combination of static and dynamic fault-tolerant adaptive routing with congestion management offers a solution for designing a highly resilient NoC.

Advances in semiconductor technologies and the growing demand for computing power are driving the integration of more and more processors on a single chip. As a result, networks-on-chip are progressively replacing communication buses, since they offer higher throughput and simpler scaling. At the same time, shrinking feature sizes make circuits more sensitive to the manufacturing process and to their operating environment. Manufacturing defects and failure rates over the circuit's lifetime increase from one technology node to the next. Integrating fault-tolerance techniques into a circuit is becoming indispensable, particularly for circuits operating in highly sensitive environments (aerospace, automotive, healthcare, ...). In this thesis we present techniques that improve the fault tolerance of on-chip networks integrated in circuits operating in harsh environments. The NoC must therefore be able to cope with the presence of numerous faults. Previously published work proposed solutions for a single type of fault. Taking into account the area and power constraints of the embedded domain, we proposed an adaptive routing algorithm that tolerates intermittent, transient and permanent faults at the same time.
By combining and adapting existing techniques for flit retransmission, packet fragmentation and packet regrouping, our approach copes with numerous static and dynamic faults. Extensive simulations showed, among other things, that the proposed algorithm achieves a packet delivery rate of 97.68% for a 16x16 2D mesh NoC in the presence of 384 simultaneous faulty links, and 93.40% when 103 routers are faulty. We extended the algorithm to torus topologies, with much better results. Another original contribution of this thesis is the inclusion of a congestion management function in this algorithm. To this end, we defined a new congestion metric (Flit Remain) that is more relevant than the metrics used and published so far. Experiments showed that using this metric reduces latency (at the saturation peak) by 2.5% to 16.1%, depending on the type of traffic generated, compared with the most effective existing metric. The combination of adaptive routing that tolerates static and dynamic faults with congestion management offers a solution that yields a much more resilient NoC and, by extension, a much more resilient circuit.
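As a rough illustration of what steering around faulty links in a 2D mesh can look like, here is a minimal sketch. It is not the thesis's algorithm, which the abstract does not specify. The names `link` and `candidate_hops` are hypothetical, and the sketch ignores deadlock avoidance, transient faults, retransmission and the Flit Remain congestion metric.

```python
from typing import FrozenSet, List, Set, Tuple

Node = Tuple[int, int]
Link = FrozenSet[Node]


def link(a: Node, b: Node) -> Link:
    """Undirected link between two adjacent routers (hypothetical helper)."""
    return frozenset((a, b))


def candidate_hops(cur: Node, dst: Node, size: int, faulty: Set[Link]) -> List[Node]:
    """List usable next hops in a size x size mesh, best candidates first.

    Neighbours reached through a faulty link are dropped. The remaining
    ones are ordered by Manhattan distance to the destination, so hops
    that move closer come before detours.
    """
    x, y = cur
    neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    healthy = [
        n for n in neighbours
        if 0 <= n[0] < size and 0 <= n[1] < size and link(cur, n) not in faulty
    ]
    # sorted() is stable, so equally good hops keep their listing order.
    return sorted(healthy, key=lambda n: abs(n[0] - dst[0]) + abs(n[1] - dst[1]))
```

In this sketch, a router whose preferred link is faulty simply falls back to the next candidate in the list. A real design would also need a rule that prevents packets from looping or deadlocking.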

Thèses en Ligne

Hal - Université Grenoble Alpes

HAL Descartes

Interconnect Built-In Self-Repair and Adaptive-Serialization (I-BIRAS) for 3D integrated systems

Author: Anghel Lorena
Nicolaidis M.
Pasca V.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/07/2010

ISBN 978-1-4244-7724-1. International audience. The high defect rates of TSV manufacturing processes lead to poor yield. Interconnect repair and serialization techniques have been proposed to improve yield. In those proposals, the control settings of the repair and serialization circuitry are determined off-chip and stored in one-time-programmable memories. In this work we present an Interconnect Built-In Self-Repair and Adaptive-Serialization approach (I-BIRAS), where interconnect repair and data serialization/deserialization are performed without external intervention (reducing the cost of external equipment) and can be executed at any time (after fabrication and throughout system life), thus coping with both fabrication and system-life defects.
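The general idea of combining repair with serialization can be modelled in a few lines of software. This is a sketch under assumptions, not the I-BIRAS circuitry, which the abstract does not detail. The function name `plan_link`, the pass/fail list as input, and the policy of using the smallest serialization factor that fits are all illustrative choices.

```python
import math
from typing import Dict, List, Tuple


def plan_link(n_signals: int, tsv_ok: List[bool]) -> Tuple[int, Dict[int, int]]:
    """Map logical signals onto healthy TSVs, serializing only if needed.

    tsv_ok[i] is True when physical TSV i passed its self-test (assumed input).
    Returns (serialization_factor, {signal index: physical TSV index}).
    A factor of 1 means plain spare-and-replace; a factor f > 1 means up to
    f signals time-share one TSV.
    """
    healthy = [i for i, ok in enumerate(tsv_ok) if ok]
    if not healthy:
        raise ValueError("no healthy TSV left; link cannot be repaired")
    # Smallest factor that lets all signals fit on the healthy TSVs.
    factor = math.ceil(n_signals / len(healthy))
    mapping = {s: healthy[s // factor] for s in range(n_signals)}
    return factor, mapping
```

For example, with four signals and six TSVs of which one is defective, the factor stays at 1 and a spare absorbs the defect. With only two healthy TSVs left, the factor rises to 2 and the link keeps working at reduced bandwidth. Because the plan is derived from test results alone, it could be recomputed whenever a self-test is rerun, which is the property the abstract emphasizes.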

Crossref

Hal - Université Grenoble Alpes