
    A Hybrid Test Architecture to Reduce Test Application Time in Full Scan Sequential Circuits

    Abstract—The full-scan design technique is widely used to alleviate the complexity of test generation for sequential circuits. However, this approach leads to a substantial increase in test application time because of the serial loading of test vectors. Although BIST-based approaches offer faster testing, they usually suffer from low fault coverage. In this paper, we propose a hybrid test architecture that achieves a significant reduction in test application time. The test suite consists of: (i) external deterministic test vectors to be scanned in, and (ii) internally generated responses of the CUT to be re-applied iteratively as tests in functional (non-scan) mode. The proposed architecture uses only combinational ATPG to hybridize deterministic testing and test-per-clock BIST, and thus makes good use of both scan-based and non-scan testing. We also present a bipartite-graph-based heuristic to select the deterministic test vectors, and use sequential fault simulation to perform an exact analysis of the faults detected while the internally generated responses of the CUT are re-applied. Experimental results on ISCAS-89 benchmark circuits show the efficacy of the heuristic and reveal a significant reduction in test application time.
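
    The abstract does not spell out the bipartite-graph heuristic, but its flavour can be sketched as a greedy cover on a bipartite graph with test vectors on one side and faults on the other; the graph construction, greedy rule and data below are assumptions for illustration only.

        # Sketch: greedy selection of deterministic test vectors on a
        # bipartite vector/fault graph. Hypothetical data and rule; the
        # paper's actual heuristic may differ.

        def select_vectors(covers: dict[str, set[str]]) -> list[str]:
            """covers maps a vector id to the set of fault ids it detects."""
            uncovered = set().union(*covers.values())
            selected = []
            while uncovered:
                # Keep the vector detecting the most still-uncovered faults.
                best = max(covers, key=lambda v: len(covers[v] & uncovered))
                gain = covers[best] & uncovered
                if not gain:
                    break  # remaining faults are detected by no vector
                selected.append(best)
                uncovered -= gain
            return selected

        if __name__ == "__main__":
            covers = {"v1": {"f1", "f2"}, "v2": {"f2", "f3", "f4"}, "v3": {"f4"}}
            print(select_vectors(covers))  # -> ['v2', 'v1']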

    Delay Measurements and Self Characterisation on FPGAs

    This thesis examines new timing measurement methods for self delay characterisation of Field-Programmable Gate Array (FPGA) components and for delay measurement of complex circuits on FPGAs. Two novel measurement techniques, based on analysis of a circuit's output failure rate and of its transition probability, are proposed for accurate, precise and efficient measurement of propagation delays. The transition-probability-based method is especially attractive, since it requires no modification of the circuit under test and few hardware resources, making it an ideal method for physical delay analysis of FPGA circuits. The relentless advancement of process technology has led to smaller and denser transistors in integrated circuits. While FPGA users benefit from this in terms of increased hardware resources for more complex designs, actual productivity with FPGAs in terms of timing performance (operating frequency, latency and throughput) has lagged behind the potential improvements of the technology, owing to delay variability in FPGA components and the inaccuracy of the timing models used in FPGA timing analysis. The ability to measure the delay of any arbitrary circuit on an FPGA offers many opportunities for on-chip characterisation and physical timing analysis, allowing delay variability to be accurately tracked and variation-aware optimisations to be developed, reducing the productivity gap observed in today's FPGA designs. The measurement techniques are developed into complete self-measurement and characterisation platforms in this thesis, demonstrating their practical use in actual FPGA hardware for cross-chip delay characterisation and accurate delay measurement of both complex combinatorial and sequential circuits, further reinforcing their position in solving the delay variability problem in FPGAs.
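
    As a minimal sketch of how a failure-rate or transition-probability sweep can yield a delay estimate (the sweep data, the 0.5 threshold and the linear interpolation are illustrative assumptions, not the thesis' exact estimator):

        # Sketch: estimating a propagation delay from a sweep of
        # launch-to-sample intervals. p(t) is the observed fraction of
        # trials in which the output had settled by interval t
        # (equivalently 1 - failure rate); the delay is read off at the
        # p = 0.5 crossing. Illustrative data and estimator only.

        def estimate_delay(intervals, settled_fraction):
            pts = list(zip(intervals, settled_fraction))
            for (t0, p0), (t1, p1) in zip(pts, pts[1:]):
                if p0 < 0.5 <= p1:
                    # Linear interpolation between the bracketing samples.
                    return t0 + (0.5 - p0) * (t1 - t0) / (p1 - p0)
            raise ValueError("0.5 crossing not bracketed by the sweep")

        if __name__ == "__main__":
            ts = [1.00, 1.05, 1.10, 1.15, 1.20]  # ns, sampled intervals
            ps = [0.00, 0.08, 0.46, 0.93, 1.00]  # settled fraction per interval
            print(f"estimated delay ~ {estimate_delay(ts, ps):.3f} ns")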

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in the detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep-submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Reconfigurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime inputs. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme, which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, adaptive techniques are seen to provide several promising avenues for improving resilience. The schemes developed are demonstrated by hardware designs of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, a Motion Estimation (ME) engine, a Finite Impulse Response (FIR) filter, a Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks, in addition to MCNC benchmark circuits. A significant reduction in power consumption is achieved, ranging from 83% for low-motion-activity scenes to 12.5% for high-motion-activity video scenes, in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32 dB. The diagnosability, reconfiguration latency, and resource overhead of each approach are analyzed. Compared to previous alternatives, PURE maintains a PSNR within 4.02 dB to 6.67 dB of the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria.
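
    The bounded-reconfiguration guarantee can be illustrated with a divide-and-conquer isolation sketch; the binary-split policy and health-metric oracle below are assumptions for illustration, not FaDReS's exact flow.

        # Sketch: deterministic fault isolation in a bounded number of
        # partial reconfigurations. Assumption: a suspect half of the
        # design can be relocated onto healthy slack, after which the
        # observed health metric indicates whether the fault moved away.
        # A binary split bounds the step count by ceil(log2(n)).

        def isolate_fault(resources, fault_in):
            """fault_in(subset) -> True if the health metric implicates
            `subset` once it is remapped (hypothetical oracle standing in
            for the runtime health observation)."""
            suspects = list(resources)
            steps = 0
            while len(suspects) > 1:
                half = suspects[: len(suspects) // 2]
                steps += 1  # one partial reconfiguration of the fabric
                suspects = half if fault_in(half) else suspects[len(half):]
            return suspects[0], steps

        if __name__ == "__main__":
            faulty = "PRR5"
            oracle = lambda subset: faulty in subset
            print(isolate_fault([f"PRR{i}" for i in range(8)], oracle))
            # -> ('PRR5', 3): isolated in 3 reconfigurations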

    Voltage and Retention Storage Allocation Problems for SRAM and Power-Gated Circuits

    Doctoral dissertation (Ph.D.), Department of Electrical and Computer Engineering, College of Engineering, Seoul National University, August 2021 (advisor: Taewhan Kim).

    Low-power operation of a chip is an important issue, and its importance is increasing as process technology advances. This dissertation addresses low-power operation methodologies for each of the SRAM and the logic constituting a chip. Firstly, we propose a methodology to infer, from the measurements of a monitoring circuit, the minimum operating voltage at which no SRAM failure occurs in any SRAM block of a chip operating in the near-threshold voltage (NTV) regime. Operating a chip in the NTV regime is one of the most effective ways to increase energy efficiency, but for SRAM it is difficult to lower the operating voltage because of SRAM failures. However, since the process variation affecting each chip differs, the minimum operating voltage also differs from chip to chip; if it can be inferred for the SRAM blocks of each chip through monitoring, energy efficiency can be increased by applying a different voltage per chip. In this dissertation, we propose a new methodology for resolving this problem. Specifically, (1) we propose to infer the minimum operating voltage of SRAM in the design-infrastructure development phase, and to assign the voltage using measurements of an SRAM monitor in the silicon production phase; (2) we define an SRAM monitor, and the features it should monitor, that can capture the process variation of SRAM blocks including the SRAM bitcells and peripheral circuits; (3) we propose a new methodology for inferring the minimum operating voltage at which no read, write, or access failure occurs in any SRAM block of the chip, under a target confidence level. Experiments with benchmark circuits confirm that applying to the SRAM blocks of each chip the distinct voltage inferred by our methodology saves overall power consumption of the SRAM bitcell arrays compared to applying the same voltage to the SRAM blocks of all chips, while meeting the same yield target.
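
    The Vddmin inference above can be sketched as a yield-constrained search over voltage; the Gaussian bitcell failure-voltage model, the independence assumption and all numbers below are illustrative stand-ins for the dissertation's monitor-calibrated models.

        # Sketch: infer the minimum SRAM operating voltage (Vddmin) that
        # meets a target confidence that no bitcell fails. Assumes a
        # Gaussian per-bitcell failure-voltage model calibrated from the
        # on-chip monitor and independent bitcells; hypothetical numbers.

        from math import erf, sqrt

        def p_fail(v: float, mu: float, sigma: float) -> float:
            """P(a bitcell's failure voltage exceeds operating voltage v)."""
            z = (v - mu) / sigma
            return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))

        def vddmin(n_bitcells: int, confidence: float,
                   mu: float, sigma: float, step: float = 0.001) -> float:
            v = mu  # sweep upward from the mean failure voltage
            while (1.0 - p_fail(v, mu, sigma)) ** n_bitcells < confidence:
                v += step
            return v

        if __name__ == "__main__":
            # 1 Mbit of cells, 99.9% confidence, monitor-inferred mean
            # failure voltage 0.50 V with 20 mV sigma (all hypothetical):
            print(f"Vddmin ~ {vddmin(2**20, 0.999, 0.50, 0.020):.3f} V")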
Secondly, we propose a methodology that resolves a problem of conventional retention storage allocation methods and thereby further reduces the leakage power consumption of power-gated circuits. Conventional allocation methods cannot fully exploit the advantage of multi-bit retention storage because they unavoidably allocate retention storage to every flip-flop with a mux-feedback loop. In this dissertation, we propose a new methodology that breaks this bottleneck in minimizing state retention storage. Specifically, (1) we find a condition under which a mux-feedback loop can be disregarded during retention storage allocation; (2) utilizing the condition, we minimize the retention storage of circuits that contain many flip-flops with mux-feedback loops; (3) we find a condition for removing some of the retention storage already allocated to flip-flops, and use it to reduce the retention storage further. Experiments with benchmark circuits confirm that our proposed methodology allocates less retention storage than the state-of-the-art methods, occupying less cell area and consuming less power.

Contents:
1 Introduction
1.1 Low Voltage SRAM Monitoring Methodology
1.2 Retention Storage Allocation on Power Gated Circuit
1.3 Contributions of this Dissertation
2 SRAM On-Chip Monitoring Methodology for High Yield and Energy Efficient Memory Operation at Near Threshold Voltage
2.1 SRAM Failures
2.1.1 Read Failure
2.1.2 Write Failure
2.1.3 Access Failure
2.1.4 Hold Failure
2.2 SRAM On-chip Monitoring Methodology: Bitcell Variation
2.2.1 Overall Flow
2.2.2 SRAM Monitor and Monitoring Target
2.2.3 Vfail to Vddmin Inference
2.3 SRAM On-chip Monitoring Methodology: Peripheral Circuit IR Drop and Variation
2.3.1 Consideration of IR Drop
2.3.2 Consideration of Peripheral Circuit Variation
2.3.3 Vddmin Prediction including Access Failure Prohibition
2.4 Experimental Results
2.4.1 Vddmin Considering Read and Write Failures
2.4.2 Vddmin Considering Read/Write and Access Failures
2.4.3 Observation for Practical Use
3 Allocation of Always-On State Retention Storage for Power Gated Circuits - Steady State Driven Approach
3.1 Motivations and Analysis
3.1.1 Impact of Self-loop on Power Gating
3.1.2 Circuit Behavior Before Sleeping
3.1.3 Wakeup Latency vs. Retention Storage
3.2 Steady State Driven Retention Storage Allocation
3.2.1 Extracting Steady State Self-loop FFs
3.2.2 Allocating State Retention Storage
3.2.3 Designing and Optimizing Steady State Monitoring Logic
3.2.4 Analysis of the Impact of Steady State Monitoring Time on the Standby Power
3.3 Retention Storage Refinement Utilizing Steadiness
3.3.1 Extracting Flip-flops for Retention Storage Refinement
3.3.2 Designing State Monitoring Logic and Control Signals
3.4 Experimental Results
3.4.1 Comparison of State Retention Storage
3.4.2 Comparison of Power Consumption
3.4.3 Impact on Circuit Performance
3.4.4 Support for Immediate Power Gating
4 Conclusions
4.1 Chapter 2
4.2 Chapter 3

    Test and Testability of Asynchronous Circuits

    The ever-increasing transistor shrinkage and higher clock frequencies are causing serious clock distribution, power management, and reliability issues. Asynchronous design is predicted to have a significant role in tackling these challenges because of its distributed control mechanism and on-demand, rather than continuous, switching activity. Null Convention Logic (NCL) is a robust and low-power asynchronous paradigm that introduces new challenges to test and testability algorithms, because 1) the lack of deterministic timing in NCL complicates the management of test timing, 2) all NCL gates are state-holding, so even simple combinational circuits show sequential behaviour, and 3) stuck-at faults on the gate internal feedback (GIF) of NCL gates do not always cause an incorrect output and are therefore undetectable by automatic test pattern generation (ATPG) algorithms. Existing test methods for NCL use clocked hardware to control the timing of the test. Such test hardware could introduce metastability issues into otherwise highly robust NCL devices. Also, existing test techniques for NCL handle the high statefulness of NCL circuits by excessive incorporation of test hardware, which imposes additional area, propagation delay and power consumption. This work, first, proposes a clockless self-timed ATPG that detects all faults on the gate inputs and a share of the GIF faults with no added design for test (DFT). Then, the efficacy of quiescent current (IDDQ) testing for detecting GIF faults undetectable by a DFT-less ATPG is investigated. Finally, asynchronous test hardware, including test points, a scan cell, and an interleaved scan architecture, is proposed for NCL-based circuits. To the best of our knowledge, this is the first work that develops clockless, self-timed test techniques for NCL while minimising the need for DFT, and also the first work conducted on IDDQ testing of NCL. The proposed methods are applied to multiple NCL circuits with up to 2,633 NCL gates (10,000 CMOS Boolean gates), in 180 and 45 nm technologies, and show an average fault coverage of 88.98% for ATPG alone, 98.52% including IDDQ testing, and 99.28% when incorporating test hardware. Given that this fault coverage includes detection of GIF faults, our work has 13% higher fault coverage than previous work. Also, because our proposed clockless test hardware eliminates the need for double-latching, it reduces the average area and delay overheads of previous studies by 32% and 50%, respectively.
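
    The state-holding behaviour in point 2) comes from the hysteresis of NCL threshold gates: a THmn gate sets its output when at least m of its n inputs are asserted, clears it only when all inputs are deasserted, and otherwise holds its state. A minimal behavioural sketch of a TH23 gate (gate-level only; the transistor-level GIF structure is not modelled):

        # Sketch: a TH23 (2-of-3) NCL threshold gate with hysteresis,
        # showing why "combinational" NCL logic behaves sequentially.

        class TH23:
            def __init__(self):
                self.out = 0  # internal state held between evaluations

            def eval(self, a: int, b: int, c: int) -> int:
                n = a + b + c
                if n >= 2:
                    self.out = 1   # threshold (2 of 3) reached: set
                elif n == 0:
                    self.out = 0   # all inputs NULL: reset
                # otherwise: hold previous value (internal feedback)
                return self.out

        if __name__ == "__main__":
            g = TH23()
            for inputs in [(1, 1, 0), (1, 0, 0), (0, 0, 0), (1, 0, 0)]:
                print(inputs, "->", g.eval(*inputs))
            # (1,1,0)->1; (1,0,0) holds 1; (0,0,0)->0; (1,0,0) holds 0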

    Degradation in FPGAs: Monitoring, Modeling and Mitigation

    This dissertation targets transistor aging degradation, as well as the associated thermal challenges, in FPGAs (there is an exponential relation between aging and chip temperature). The main objectives are to perform experimentation, analysis and device-level model abstraction for modeling degradation in FPGAs, then to monitor the FPGA to keep track of aging rates, and ultimately to propose an aging-aware FPGA design flow to mitigate aging.

    Test and Diagnosis of Integrated Circuits

    The ever-increasing growth of the semiconductor market results in an increasing complexity of digital circuits. Smaller, faster, cheaper and lower-power devices are the main challenges in the semiconductor industry. The reduction of transistor size and the latest packaging technologies (e.g., System-on-a-Chip, System-in-Package, Through-Silicon-Via 3D Integrated Circuits) allow the semiconductor industry to meet these challenges. Although producing such advanced circuits can benefit users, the manufacturing process is becoming finer and denser, making chips more prone to defects. The work presented in the HDR manuscript addresses the challenges of test and diagnosis of integrated circuits. It covers:
    - power-aware test;
    - test of low-power devices;
    - fault diagnosis of digital circuits.

    Algorithms for Power Aware Testing of Nanometer Digital ICs

    At-speed testing of deep-submicron digital very large scale integrated (VLSI) circuits has become mandatory to catch small delay defects. With the continuous shrinking of complementary metal oxide semiconductor (CMOS) transistor feature size, power density grows geometrically with technology scaling. Additionally, the power dissipated inside a digital circuit during the testing phase (for test vectors under all fault models (Potluri, 2015)) is several times higher than the power dissipated during the normal functional phase of operation. Because of this, the currents that flow in the power grid during the testing phase are much higher than what the power grid is designed for (the functional phase of operation). As a result, during at-speed testing the supply grid experiences unacceptable IR-drop, ultimately leading to delay failures. Since these failures are specific to testing and do not occur during the functional phase of operation of the chip, they are usually referred to as false failures, and they reduce the yield of the chip, which is undesirable. In the nanometer regime, process parameter variation has become a major problem. Due to the variation in signalling delays it causes, it is important to perform at-speed testing even for stuck-at faults, to reduce test escapes (McCluskey and Tseng, 2000; Vorisek et al., 2004). In this context, the problem of excessive peak power dissipation causing false failures, previously addressed in the context of at-speed transition fault testing (Saxena et al., 2003; Devanathan et al., 2007a,b,c), also becomes prominent in the context of at-speed testing of stuck-at faults (Maxwell et al., 1996; McCluskey and Tseng, 2000; Vorisek et al., 2004; Prabhu and Abraham, 2012; Potluri, 2015; Potluri et al., 2015). It is well known that excessive supply IR-drop during at-speed testing can be kept under control by minimizing switching activity during testing (Saxena et al., 2003). There is a rich collection of techniques proposed in the past for reducing peak switching activity during at-speed testing of transition/delay faults in both combinational and sequential circuits. As far as at-speed testing of stuck-at faults is concerned, while some techniques were proposed in the past for combinational circuits (Girard et al., 1998; Dabholkar et al., 1998), there are none for sequential circuits. This thesis addresses this open problem. We propose algorithms for minimizing peak switching activity during at-speed testing of stuck-at faults in sequential digital circuits under the combinational state preservation scan (CSP-scan) architecture (Potluri, 2015; Potluri et al., 2015). First, we show that, under this CSP-scan architecture, when the test set is completely specified, the peak switching activity during testing can be minimized by solving the Bottleneck Traveling Salesman Problem (BTSP). This mapping of the peak test switching activity minimization problem to BTSP is novel, and is proposed for the first time in the literature. Usually, as circuit size increases, the percentage of don't cares in the test set increases. As a result, test vector ordering for an arbitrary filling of don't care bits is insufficient to produce an effective reduction in switching activity during testing of large circuits. Since don't cares dominate the test sets of larger circuits, don't care filling plays a crucial role in reducing switching activity during testing.
Taking this into consideration, we propose an algorithm, XStat, which is capable of performing test vector ordering while preserving the don't care bits in the test vectors, after which the don't cares are filled in an intelligent fashion to minimize input switching activity, which in turn minimizes switching activity inside the circuit (Girard et al., 1998). Through empirical validation on benchmark circuits, we show that XStat significantly reduces peak switching activity during testing. Although XStat is a very powerful heuristic for minimizing peak input-switching activity, it does not guarantee optimality. To address this issue, we propose an algorithm that uses dynamic programming to calculate a lower bound for a given sequence of test vectors, and subsequently uses a greedy strategy for filling the don't cares in this sequence to achieve this lower bound, thereby guaranteeing optimality. This algorithm, which we refer to as DP-fill in this thesis, provides the globally optimal solution for minimizing peak input-switching activity, and is also the best known in the literature for minimizing peak input-switching activity during testing. The proof of optimality of DP-fill in minimizing peak input-switching activity is also provided in this thesis.
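
The BTSP view can be sketched as follows: with fully specified vectors, peak switching between consecutive vectors is approximated by their input Hamming distance, and ordering the vectors to minimize the maximum adjacent distance is a bottleneck tour. The greedy nearest-neighbour heuristic below is only a stand-in for a real BTSP solver, and the example data are hypothetical; XStat and DP-fill themselves work differently, as described above.

    # Sketch: order fully specified test vectors to reduce the peak
    # adjacent input Hamming distance (a proxy for peak switching
    # activity). Greedy nearest-neighbour stand-in for a BTSP solver.

    def hamming(a: str, b: str) -> int:
        return sum(x != y for x, y in zip(a, b))

    def order_vectors(vectors: list[str]) -> tuple[list[str], int]:
        remaining = vectors[1:]
        tour, peak = [vectors[0]], 0
        while remaining:
            nxt = min(remaining, key=lambda v: hamming(tour[-1], v))
            peak = max(peak, hamming(tour[-1], nxt))
            tour.append(nxt)
            remaining.remove(nxt)
        return tour, peak

    if __name__ == "__main__":
        tests = ["0000", "1111", "0011", "1100", "0111"]
        tour, peak = order_vectors(tests)
        print(tour, "-> peak adjacent Hamming distance:", peak)
        # The given order has peak 4 ("0000" -> "1111"); the tour's is 2.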

    Design and Test of Highly Reliable and Secure Digital Circuits and Systems

    The research activities I have carried out since my appointment as Chargé de Recherche deal with the definition of methodologies and tools for the design, test and reliability of secure digital circuits and trustworthy manufacturing. More recently, we have started a new research activity on the test of 3D stacked Integrated Circuits based on the use of Through-Silicon Vias. Moreover, thanks to the relationships I have maintained since my post-doc in Italy, I have kept cooperating with Politecnico di Torino on topics related to the test and reliability of memories and microprocessors.

Secure and Trusted Devices

Security is a critical part of information and communication technologies, and it is the necessary basis for obtaining confidentiality, authentication, and integrity of data. The importance of security is confirmed by the extremely high growth of the smart-card market in the last 20 years. The article "Computer Crime and Security Survey", reported in "Le Monde Informatique" in 2007, states that financial losses due to attacks on "secure objects" in the digital world exceed $11 billion. As the race between the developers of secure devices and attackers keeps accelerating, due also to the heterogeneity and sheer number of new systems, improving the resistance of such components has become today's major challenge.

Among all possible security threats, the vulnerability of electronic devices that implement cryptographic functions (including smart cards and electronic passports) has become the Achilles' heel of the last decade. Indeed, even though recent crypto-algorithms have been proven resistant to cryptanalysis, certain fraudulent manipulations of the hardware implementing such algorithms can allow confidential information to be extracted. So-called side-channel attacks were the first type of attack to target the physical device. They are based on information gathered from the physical implementation of a cryptosystem. For instance, by correlating the power consumed with the data manipulated by the device, it is possible to discover the secret encryption key. Nevertheless, this point is widely addressed, and integrated circuit (IC) manufacturers have already developed various kinds of countermeasures.

More recently, new threats have menaced secure devices and the security of the manufacturing process. A first issue is the trustworthiness of the manufacturing process. On one side, secure devices must ensure a very high production quality in order not to leak confidential information due to a malfunctioning of the device. Therefore, possible defects due to manufacturing imperfections must be detected. This requires high-quality test procedures that rely on test features that increase the controllability and observability of internal points of the circuit. Unfortunately, this is harmful from a security point of view, and therefore access to these test features must be protected from unauthorized users. Another threat is related to the possibility for an untrusted manufacturer to make malicious alterations to the design (for instance, to bypass or disable the security fence of the system). Nowadays, many steps of the production cycle of a circuit are outsourced. For economic reasons, the manufacturing process is often carried out by foundries located in foreign countries.
The threat brought by so-called Hardware Trojan Horses, which was long considered theoretical, is beginning to materialize. A second issue is the hazard of faults that can appear during the circuit's lifetime and that may affect the circuit's behavior by way of soft errors or deliberate manipulations, called fault attacks. These can be based on the intentional modification of the circuit's environment (e.g., applying extreme temperatures, exposing the IC to radiation, X-rays, ultra-violet or visible light, or tampering with the clock frequency) in such a way that the function implemented by the device generates an erroneous result. The attacker can discover secret information by comparing the erroneous result with the correct one. In-the-field detection of any failing behavior is therefore of prime interest for taking further action, such as discontinuing operation or triggering an alarm. In addition, today's smart cards use 90 nm technology and, according to the various chip suppliers, 65 nm technology will be in use on the 2013-2014 horizon. Since the energy required to force a transistor to switch is reduced in these new technologies, next-generation secure systems will become even more sensitive to various classes of fault attacks.

Based on these considerations, within the group I work with, we have proposed new methods, architectures and tools to solve the following problems:

- Test of secure devices: unfortunately, classical techniques for digital circuit testing cannot be easily used in this context. Indeed, classical testing solutions are based on Design-for-Testability techniques that add hardware components to the circuit, aiming to provide full controllability and observability of internal states. Because crypto-processors and other cores in a secure system must pass through high-quality test procedures to ensure that data are correctly processed, the testing of crypto chips faces a dilemma: design-for-testability schemes want to provide high controllability and observability of the device, while security wants minimal controllability and observability in order to hide the secret. We have therefore proposed, on one side, the use of enhanced scan-based test techniques that exploit compaction schemes to reduce the observability of internal information while preserving a high level of testability; on the other side, we have proposed the use of Built-In Self-Test for such devices in order to avoid scan-chain-based test.

- Reliability of secure devices: we proposed an on-line self-test architecture for hardware implementations of the Advanced Encryption Standard (AES). The solution exploits the inherent spatial replication of a parallel architecture to implement functional redundancy at low cost.

- Fault attacks: one of the most powerful types of attack on secure devices is based on the intentional injection of faults (for instance, by using a laser beam) into the system while an encryption occurs. By comparing the outputs of the circuit with and without the injection of the fault, it is possible to identify the secret key. To face this problem, we have analyzed how to use error detection and correction codes as a countermeasure against this type of attack, and we have proposed a new code-based architecture.
Moreover, we have proposed a bulk built-in current sensor that allows detecting the presence of undesired current in the substrate of the CMOS device.

- Fault simulation: to evaluate the effectiveness of countermeasures against fault attacks, we developed an open-source fault simulator able to perform fault simulation for the most classical fault models as well as user-defined electrical-level fault models, to accurately model the effect of laser injections on CMOS circuits.

- Side-channel attacks: these exploit physical data-related information leaking from the device (e.g., current consumption or electro-magnetic emission). One of the most intensively studied attacks is Differential Power Analysis (DPA), which relies on the observation of the chip's power fluctuations during data processing. I studied this type of attack in order to evaluate the influence of countermeasures against fault attacks on the power consumption of the device. Indeed, the introduction of countermeasures for one type of attack could lead to the insertion of circuitry whose power consumption is related to the secret key, thus making another type of attack easier. We have developed a flexible integrated simulation-based environment that allows validating a digital circuit when the device is subjected to this attack. All the architectures we designed have been validated with this tool. Moreover, we developed a methodology that drastically reduces the time required to validate countermeasures against this type of attack.

TSV-based 3D Stacked Integrated Circuits Test

The stacking of integrated circuits using Through-Silicon Vias (TSVs) is a promising technology that keeps integration developing beyond Moore's law, as TSVs enable various dies to be tightly integrated in a 3D fashion. Nevertheless, 3D integrated circuits present many test challenges, including testing at the different levels of the 3D fabrication process: pre-, mid-, and post-bond tests. Pre-bond test targets the individual dies at wafer level, testing not only classical logic (digital logic, IOs, RAM, etc.) but also unbonded TSVs. Mid-bond test targets partially assembled 3D stacks, whereas post-bond test targets the final circuit.

The activities carried out within this topic cover two main issues:

- Pre-bond test of TSVs: the electrical model of a TSV buried within the substrate of a CMOS circuit is a capacitance connected to ground (when the substrate is connected to ground). The main assumption is that a defect may affect the value of that capacitance. By measuring the variation of the capacitance's value, it is possible to check whether the TSV is correctly fabricated or not. We have proposed a method to measure the value of the capacitance based on the charge/discharge delay of the RC network containing the TSV (see the sketch after this abstract).

- Test infrastructures for 3D stacked Integrated Circuits: testing a die before stacking it onto another die introduces the problem of a dynamic test infrastructure, where test data must be routed to a specific die depending on the fabrication step reached. New solutions are proposed in the literature that allow the test paths within the circuit to be reconfigured based on on-the-fly requirements.
We have started working on an extension of the IEEE P1687 test standard that makes use of automatic die detection based on pull-up resistors.

Memory and Microprocessor Test and Reliability

Thanks to device shrinking and the miniaturization of fabrication technology, the performance of microprocessors and memories has grown by more than five orders of magnitude in the last 30 years. With this technology trend, it is necessary to face new problems and challenges, such as reliability, transient errors, variability and aging.

In the last five years, I have worked in cooperation with the Testgroup of Politecnico di Torino (Italy) to propose a new method for on-line validation of the correctness of a microprocessor's program execution. The main idea is to monitor a small set of control signals of the processor in order to identify incorrect activation sequences. This approach can detect both permanent and transient errors in the internal logic of the processor.

Concerning the test of memories, we have proposed a new approach to automatically generate test programs starting from a functional description of the possible faults in the memory.

Moreover, we proposed a new methodology, based on microprocessor error probability profiling, that aims at estimating fault injection results without the need for a typical fault injection setup. The proposed methodology is based on two main ideas: a one-time fault-injection analysis of the microprocessor architecture to characterize the probability of successful execution of each of its instructions in the presence of a soft error, and a static and very fast analysis of the control and data flow of the target software application to compute its probability of success.
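
A rough numerical sketch of the charge-delay measurement mentioned in the pre-bond TSV test above: charging a capacitance C through a resistance R, the node reaches half the supply voltage after t = R * C * ln(2), so a shift in the measured delay exposes a deviation in C and hence a TSV defect. The component values, the 50% sensing threshold and the pass/fail margin below are illustrative assumptions only.

    # Sketch: pass/fail check of a TSV from the charge delay of the RC
    # network containing it. All numbers are hypothetical.

    from math import log

    R_DRIVER = 1.0e3      # ohms, assumed charging resistance
    C_NOMINAL = 50e-15    # farads, assumed fault-free TSV capacitance

    def charge_delay(c: float, r: float = R_DRIVER) -> float:
        """Time for the TSV node to reach 50% of Vdd: t = R*C*ln(2)."""
        return r * c * log(2)

    def tsv_ok(measured_delay: float, margin: float = 0.2) -> bool:
        """Pass if the measured delay is within +/-20% of nominal."""
        nominal = charge_delay(C_NOMINAL)
        return abs(measured_delay - nominal) <= margin * nominal

    if __name__ == "__main__":
        print(f"nominal delay: {charge_delay(C_NOMINAL) * 1e12:.2f} ps")
        # A defect that halves the effective capacitance is caught:
        print("defective TSV passes?", tsv_ok(charge_delay(25e-15)))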

    Development of Fault-Tolerant and Self-Calibrating HW/SW Architectures for 3D Integration Technologies

    3D technology promises energy-efficient heterogeneous integrated systems, which may open the way to thousand-core chips. Silicon dies containing processing elements are stacked and connected by vertical wires called Through-Silicon Vias (TSVs). In 3D chips, interconnecting an increasing number of processing elements requires a scalable high-performance interconnect solution: the 3D Network-on-Chip (NoC). Despite the advantages of 3D integration, testing, reliability and yield remain the major challenges for 3D NoC-based systems. In this thesis, the TSV interconnect test issue is addressed by an off-line Interconnect Built-In Self-Test (IBIST) strategy that detects both structural faults (i.e., opens and shorts) and parametric faults (i.e., delays, including delay due to crosstalk). The IBIST circuitry implements a novel algorithm based on the aggressor-victim scenario and alleviates the limitations of existing strategies. The proposed Kth-aggressor fault (KAF) model assumes that the aggressors of a victim TSV are the neighboring wires within a distance given by the aggressor order K (a small sketch of this enumeration follows the abstract). Using this model, TSV interconnect tests of inter-die 3D NoC links may be performed for different aggressor orders, reducing test time and circuitry complexity. In 3D NoCs, permanent and transient TSV faults can be mitigated at different abstraction levels. In this thesis, several error resilience schemes are proposed at the data link and network levels. For transient faults, 3D NoC links can be protected using error correction codes (ECC) and retransmission schemes based on error detection (Automatic Repeat reQuest) or on combined correction and retransmission (hybrid error correction and retransmission). For transients along a source-destination path, ECC can also be implemented at the network level (network-level Forward Error Correction).
Data link solutions also include TSV repair schemes for defects due to the fabrication process (i.e., TSV-Spare-and-Replace and Configurable Serial Links) and to aging (i.e., Interconnect Built-In Self-Repair and Adaptive Serialization). At the network level, the faulty inter-die links of 3D mesh NoCs are repaired by implementing a TSV fault-tolerant routing algorithm. Although single-level solutions can achieve the desired yield/reliability targets, error mitigation can also be realized by a combination of approaches at several abstraction levels. To this end, multi-level error resilience strategies have been proposed. Experimental results show that there are cases where this multi-layer strategy pays off both in terms of cost and of performance. Unfortunately, a one-size-fits-all solution does not exist, as each strategy has its advantages and limitations. For system designers, it is very difficult to assess early in the design stages the costs and the impact on performance of error resilience. Therefore, an error resilience exploration (ERX) methodology is proposed for 3D NoCs.
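
As a small illustration of the KAF model's aggressor enumeration (the 1-D TSV row, the victim-at-0 convention and the pattern shape are assumptions for illustration; the actual IBIST pattern generation may differ):

    # Sketch: enumerate aggressor sets and launch/capture patterns under
    # the Kth-aggressor fault (KAF) model for a 1-D row of TSVs. Each
    # victim is held at 0 while its aggressors switch 0 -> 1, a
    # worst-case crosstalk scenario for that victim.

    def kaf_patterns(n_tsvs: int, k: int):
        for victim in range(n_tsvs):
            aggressors = [a for a in range(max(0, victim - k),
                                           min(n_tsvs, victim + k + 1))
                          if a != victim]
            launch = "0" * n_tsvs
            capture = "".join("1" if i in aggressors else "0"
                              for i in range(n_tsvs))
            yield victim, aggressors, (launch, capture)

    if __name__ == "__main__":
        for victim, aggr, (launch, capture) in kaf_patterns(n_tsvs=6, k=1):
            print(f"victim {victim}: aggressors {aggr}: {launch} -> {capture}")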