126 research outputs found

    Innovative Techniques for Testing and Diagnosing SoCs

    Get PDF
    We rely upon the continued functioning of many electronic devices for our everyday welfare, usually embedding integrated circuits that are becoming even cheaper and smaller with improved features. Nowadays, microelectronics can integrate a working computer with CPU, memories, and even GPUs on a single die, namely System-On-Chip (SoC). SoCs are also employed on automotive safety-critical applications, but need to be tested thoroughly to comply with reliability standards, in particular the ISO26262 functional safety for road vehicles. The goal of this PhD. thesis is to improve SoC reliability by proposing innovative techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals, and GPUs. The proposed approaches in the sequence appearing in this thesis are described as follows: 1. Embedded Memory Diagnosis: Memories are dense and complex circuits which are susceptible to design and manufacturing errors. Hence, it is important to understand the fault occurrence in the memory array. In practice, the logical and physical array representation differs due to an optimized design which adds enhancements to the device, namely scrambling. This part proposes an accurate memory diagnosis by showing the efforts of a software tool able to analyze test results, unscramble the memory array, map failing syndromes to cell locations, elaborate cumulative analysis, and elaborate a final fault model hypothesis. Several SRAM memory failing syndromes were analyzed as case studies gathered on an industrial automotive 32-bit SoC developed by STMicroelectronics. The tool displayed defects virtually, and results were confirmed by real photos taken from a microscope. 2. Functional Test Pattern Generation: The key for a successful test is the pattern applied to the device. They can be structural or functional; the former usually benefits from embedded test modules targeting manufacturing errors and is only effective before shipping the component to the client. The latter, on the other hand, can be applied during mission minimally impacting on performance but is penalized due to high generation time. However, functional test patterns may benefit for having different goals in functional mission mode. Part III of this PhD thesis proposes three different functional test pattern generation methods for CPU cores embedded in SoCs, targeting different test purposes, described as follows: a. Functional Stress Patterns: Are suitable for optimizing functional stress during I Operational-life Tests and Burn-in Screening for an optimal device reliability characterization b. Functional Power Hungry Patterns: Are suitable for determining functional peak power for strictly limiting the power of structural patterns during manufacturing tests, thus reducing premature device over-kill while delivering high test coverage c. Software-Based Self-Test Patterns: Combines the potentiality of structural patterns with functional ones, allowing its execution periodically during mission. In addition, an external hardware communicating with a devised SBST was proposed. It helps increasing in 3% the fault coverage by testing critical Hardly Functionally Testable Faults not covered by conventional SBST patterns. An automatic functional test pattern generation exploiting an evolutionary algorithm maximizing metrics related to stress, power, and fault coverage was employed in the above-mentioned approaches to quickly generate the desired patterns. The approaches were evaluated on two industrial cases developed by STMicroelectronics; 8051-based and a 32-bit Power Architecture SoCs. Results show that generation time was reduced upto 75% in comparison to older methodologies while increasing significantly the desired metrics. 3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices are suitable for generating structural patterns, testing and activating mitigation techniques, and validating robust hardware and software applications. GPGPUs are known for fast parallel computation used in high performance computing and advanced driver assistance where reliability is the key point. Moreover, GPGPU manufacturers do not provide design description code due to content secrecy. Therefore, commercial fault injectors using the GPGPU model is unfeasible, making radiation tests the only resource available, but are costly. In the last part of this thesis, we propose a software implemented fault injector able to inject bit-flip in memory elements of a real GPGPU. It exploits a software debugger tool and combines the C-CUDA grammar to wisely determine fault spots and apply bit-flip operations in program variables. The goal is to validate robust parallel algorithms by studying fault propagation or activating redundancy mechanisms they possibly embed. The effectiveness of the tool was evaluated on two robust applications: redundant parallel matrix multiplication and floating point Fast Fourier Transform

    New techniques for functional testing of microprocessor based systems

    Get PDF
    Electronic devices may be affected by failures, for example due to physical defects. These defects may be introduced during the manufacturing process, as well as during the normal operating life of the device due to aging. How to detect all these defects is not a trivial task, especially in complex systems such as processor cores. Nevertheless, safety-critical applications do not tolerate failures, this is the reason why testing such devices is needed so to guarantee a correct behavior at any time. Moreover, testing is a key parameter for assessing the quality of a manufactured product. Consolidated testing techniques are based on special Design for Testability (DfT) features added in the original design to facilitate test effectiveness. Design, integration, and usage of the available DfT for testing purposes are fully supported by commercial EDA tools, hence approaches based on DfT are the standard solutions adopted by silicon vendors for testing their devices. Tests exploiting the available DfT such as scan-chains manipulate the internal state of the system, differently to the normal functional mode, passing through unreachable configurations. Alternative solutions that do not violate such functional mode are defined as functional tests. In microprocessor based systems, functional testing techniques include software-based self-test (SBST), i.e., a piece of software (referred to as test program) which is uploaded in the system available memory and executed, with the purpose of exciting a specific part of the system and observing the effects of possible defects affecting it. SBST has been widely-studies by the research community for years, but its adoption by the industry is quite recent. My research activities have been mainly focused on the industrial perspective of SBST. The problem of providing an effective development flow and guidelines for integrating SBST in the available operating systems have been tackled and results have been provided on microprocessor based systems for the automotive domain. Remarkably, new algorithms have been also introduced with respect to state-of-the-art approaches, which can be systematically implemented to enrich SBST suites of test programs for modern microprocessor based systems. The proposed development flow and algorithms are being currently employed in real electronic control units for automotive products. Moreover, a special hardware infrastructure purposely embedded in modern devices for interconnecting the numerous on-board instruments has been interest of my research as well. This solution is known as reconfigurable scan networks (RSNs) and its practical adoption is growing fast as new standards have been created. Test and diagnosis methodologies have been proposed targeting specific RSN features, aimed at checking whether the reconfigurability of such networks has not been corrupted by defects and, in this case, at identifying the defective elements of the network. The contribution of my work in this field has also been included in the first suite of public-domain benchmark networks

    Observability Driven Path Generation for Delay Test

    Get PDF
    This research describes an approach for path generation using an observability metric for delay test. K Longest Path Per Gate (KLPG) tests are generated for sequential circuits. A transition launched from a scan flip-flop (SFF) is captured into another SFF during at-speed clock cycles, that is, clock cycles at the rated design speed. The generated path is a ‘longest path’ suitable for delay test. The path generation algorithm then utilizes observability of the fan-out gates in the consecutive, lower-speed clock cycles, known as coda cycles, to generate paths ending at a SFF, to capture the transition from the at-speed cycles. For a given clocking scheme defined by the number of coda cycles, if the final flip-flop is not scan-enabled, the path generation algorithm attempts to generate a different path that ends at a SFF, located in a different branch of the circuit fan-out, indicated by lower observability. The paths generated over multiple cycles are sequentially justified using Boolean satisfiability. The observability metric optimizes the path generation in the coda cycles by always attempting to grow the path through the branch with the best observability and never generating a path that ends at a non-scan flip-flop. The algorithm has been developed in C++. The experiments have been performed on an Intel Core i7 machine with 64GB RAM. Various ISCAS benchmark circuits have been used with various KLPG configurations for code evaluation. Multiple configurations have been used for the experiments. The combinations of the values of K [1, 2, 3, 4, 5] and number of coda cycles [1, 2, 3] have been used to characterize the implementation. A sublinear rise is run time has been observed with increasing K values. The total number of tested paths rise with K and falls with number of coda cycles, due to the increasing number of constraints on the path, particularly due to the fixed inputs

    High Quality Compact Delay Test Generation

    Get PDF
    Delay testing is used to detect timing defects and ensure that a circuit meets its timing specifications. The growing need for delay testing is a result of the advances in deep submicron (DSM) semiconductor technology and the increase in clock frequency. Small delay defects that previously were benign now produce delay faults, due to reduced timing margins. This research focuses on the development of new test methods for small delay defects, within the limits of affordable test generation cost and pattern count. First, a new dynamic compaction algorithm has been proposed to generate compacted test sets for K longest paths per gate (KLPG) in combinational circuits or scan-based sequential circuits. This algorithm uses a greedy approach to compact paths with non-conflicting necessary assignments together during test generation. Second, to make this dynamic compaction approach practical for industrial use, a recursive learning algorithm has been implemented to identify more necessary assignments for each path, so that the path-to-test-pattern matching using necessary assignments is more accurate. Third, a realistic low cost fault coverage metric targeting both global and local delay faults has been developed. The metric suggests the test strategy of generating a different number of longest paths for each line in the circuit while maintaining high fault coverage. The number of paths and type of test depends on the timing slack of the paths under this metric. Experimental results for ISCAS89 benchmark circuits and three industry circuits show that the pattern count of KLPG can be significantly reduced using the proposed methods. The pattern count is comparable to that of transition fault test, while achieving higher test quality. Finally, the proposed ATPG methodology has been applied to an industrial quad-core microprocessor. FMAX testing has been done on many devices and silicon data has shown the benefit of KLPG test

    Automatic test pattern generation for asynchronous circuits

    Get PDF
    The testability of integrated circuits becomes worse with transistor dimensions reaching nanometer scales. Testing, the process of ensuring that circuits are fabricated without defects, becomes inevitably part of the design process; a technique called design for test (DFT). Asynchronous circuits have a number of desirable properties making them suitable for the challenges posed by modern technologies, but are severely limited by the unavailability of EDA tools for DFT and automatic test-pattern generation (ATPG). This thesis is motivated towards developing test generation methodologies for asynchronous circuits. In total four methods were developed which are aimed at two different fault models: stuck-at faults at the basic logic gate level and transistor-level faults. The methods were evaluated using a set of benchmark circuits and compared favorably to previously published work. First, ABALLAST is a partial-scan DFT method adapting the well-known BALLAST technique for asynchronous circuits where balanced structures are used to guide the selection of the state-holding elements that will be scanned. The test inputs are automatically provided by a novel test pattern generator, which uses time frame unrolling to deal with the remaining, non-scanned sequential C-elements. The second method, called AGLOB, uses algorithms from strongly-connected components in graph graph theory as a method for finding the optimal position of breaking the loops in the asynchronous circuit and adding scan registers. The corresponding ATPG method converts cyclic circuits into acyclic for which standard tools can provide test patterns. These patterns are then automatically converted for use in the original cyclic circuits. The third method, ASCP, employs a new cycle enumeration method to find the loops present in a circuit. Enumerated cycles are then processed using an efficient set covering heuristic to select the scan elements for the circuit to be tested.Applying these methods to the benchmark circuits shows an improvement in fault coverage compared to previous work, which, for some circuits, was substantial. As no single method consistently outperforms the others in all benchmarks, they are all valuable as a designer’s suite of tools for testing. Moreover, since they are all scan-based, they are compatible and thus can be simultaneously used in different parts of a larger circuit. In the final method, ATRANTE, the main motivation of developing ATPG is supplemented by transistor level test generation. It is developed for asynchronous circuits designed using a State Transition Graph (STG) as their specification. The transistor-level circuit faults are efficiently mapped onto faults that modify the original STG. For each potential STG fault, the ATPG tool provides a sequence of test vectors that expose the difference in behavior to the output ports. The fault coverage obtained was 52-72 % higher than the coverage obtained using the gate level tests. Overall, four different design for test (DFT) methods for automatic test pattern generation (ATPG) for asynchronous circuits at both gate and transistor level were introduced in this thesis. A circuit extraction method for representing the asynchronous circuits at a higher level of abstraction was also implemented. Developing new methods for the test generation of asynchronous circuits in this thesis facilitates the test generation for asynchronous designs using the CAD tools available for testing the synchronous designs. Lessons learned and the research questions raised due to this work will impact the future work to probe the possibilities of developing robust CAD tools for testing the future asynchronous designs

    Methodology to accelerate diagnostic coverage assessment: MADC

    Get PDF
    Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia Elétrica, Florianópolis, 2016.Os veículos da atualidade vêm integrando um número crescente de eletrônica embarcada, com o objetivo de permitir uma experiência mais segura aos motoristas. Logo, a garantia da segurança física é um requisito que precisa ser observada por completo durante o processo de desenvolvimento. O padrão ISO 26262 provê medidas para garantir que esses requisitos não sejam negligenciados. Injeção de falhas é fortemente recomendada quando da verificação do funcionamento dos mecanismos de segurança implementados, assim como sua capacidade de cobertura associada ao diagnóstico de falhas existentes. A análise exaustiva não é obrigatória, mas evidências de que o máximo esforço foi feito para acurar a cobertura de diagnóstico precisam ser apresentadas, principalmente durante a avalição dos níveis de segurança associados a arquitetura implementada em hardware. Estes níveis dão suporte às alegações de que o projeto obedece às métricas de segurança da integridade física exigida em aplicações automotivas. Os níveis de integridade variam de A à D, sendo este último o mais rigoroso. Essa Tese explora o estado-da-arte em soluções de verificação, e tem por objetivo construir uma metodologia que permita acelerar a verificação da cobertura de diagnóstico alcançado. Diferentemente de outras técnicas voltadas à aceleração de injeção de falhas, a metodologia proposta utiliza uma plataforma de hardware dedicada à verificação, com o intuito de maximizar o desempenho relativo a simulação de falhas. Muitos aspectos relativos a ISO 26262 são observados de forma que a presente contribuição possa ser apreciada no segmento automotivo. Por fim, uma arquitetura OpenRISC é utilizada para confirmar os resultados alcançados com essa solução proposta pertencente ao estado-da-arte.Abstract : Modern vehicles are integrating a growing number of electronics to provide a safer experience for the driver. Therefore, safety is a non-negotiable requirement that must be considered through the vehicle development process. The ISO 26262 standard provides guidance to ensure that such requirements are implemented. Fault injection is highly recommended for the functional verification of safety mechanisms or to evaluate their diagnostic coverage capability. An exhaustive analysis is not required, but evidence of best effort through the diagnostic coverage assessment needs to be provided when performing quantitative evaluation of hardware architectural metrics. These metrics support that the automotive safety integrity level ? ranging from A (lowest) to D (strictest) levels ? was obeyed. This thesis explores the most advanced verification solutions in order to build a methodology to accelerate the diagnostic coverage assessment. Different from similar techniques for fault injection acceleration, the proposed methodology does not require any modification of the design model to enable acceleration. Many functional safety requisites in the ISO 26262 are considered thus allowing the contribution presented to be a suitable solution for the automotive segment. An OpenRISC architecture is used to confirm the results achieved by this state-of-the-art solution

    FPGA Based Design for Accelerated Fault-testing of Integrated Circuits

    Get PDF
    In the past few decades, integrated circuits have become a major part of everyday life. Every circuit that is created needs to be tested for faults so faulty circuits are not sent to end-users. The creation of these tests is time consuming, costly and difficult to perform on larger circuits. This research presents a novel method for fault detection and test pattern reduction in integrated circuitry under test. By leveraging the FPGA\u27s reconfigurability and parallel processing capabilities, a speed up in fault detection can be achieved over previous computer simulation techniques. This work presents the following contributions to the field of Stuck-At-Fault detection: We present a new method for inserting faults into a circuit net list. Given any circuit netlist, our tool can insert multiplexers into a circuit at correct internal nodes to aid in fault emulation on reconfigurable hardware. We present a parallel method of fault emulation. The benefit of the FPGA is not only its ability to implement any circuit, but its ability to process data in parallel. This research utilizes this to create a more efficient emulation method that implements numerous copies of the same circuit in the FPGA. A new method to organize the most efficient faults. Most methods for determinin the minimum number of inputs to cover the most faults require sophisticated softwareprograms that use heuristics. By utilizing hardware, this research is able to process data faster and use a simpler method for an efficient way of minimizing inputs

    New Techniques for On-line Testing and Fault Mitigation in GPUs

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Resilience of an embedded architecture using hardware redundancy

    Get PDF
    In the last decade the dominance of the general computing systems market has being replaced by embedded systems with billions of units manufactured every year. Embedded systems appear in contexts where continuous operation is of utmost importance and failure can be profound. Nowadays, radiation poses a serious threat to the reliable operation of safety-critical systems. Fault avoidance techniques, such as radiation hardening, have been commonly used in space applications. However, these components are expensive, lag behind commercial components with regards to performance and do not provide 100% fault elimination. Without fault tolerant mechanisms, many of these faults can become errors at the application or system level, which in turn, can result in catastrophic failures. In this work we study the concepts of fault tolerance and dependability and extend these concepts providing our own definition of resilience. We analyse the physics of radiation-induced faults, the damage mechanisms of particles and the process that leads to computing failures. We provide extensive taxonomies of 1) existing fault tolerant techniques and of 2) the effects of radiation in state-of-the-art electronics, analysing and comparing their characteristics. We propose a detailed model of faults and provide a classification of the different types of faults at various levels. We introduce an algorithm of fault tolerance and define the system states and actions necessary to implement it. We introduce novel hardware and system software techniques that provide a more efficient combination of reliability, performance and power consumption than existing techniques. We propose a new element of the system called syndrome that is the core of a resilient architecture whose software and hardware can adapt to reliable and unreliable environments. We implement a software simulator and disassembler and introduce a testing framework in combination with ERA’s assembler and commercial hardware simulators

    Fault Tolerant Electronic System Design

    Get PDF
    Due to technology scaling, which means reduced transistor size, higher density, lower voltage and more aggressive clock frequency, VLSI devices may become more sensitive against soft errors. Especially for those devices used in safety- and mission-critical applications, dependability and reliability are becoming increasingly important constraints during the development of system on/around them. Other phenomena (e.g., aging and wear-out effects) also have negative impacts on reliability of modern circuits. Recent researches show that even at sea level, radiation particles can still induce soft errors in electronic systems. On one hand, processor-based system are commonly used in a wide variety of applications, including safety-critical and high availability missions, e.g., in the automotive, biomedical and aerospace domains. In these fields, an error may produce catastrophic consequences. Thus, dependability is a primary target that must be achieved taking into account tight constraints in terms of cost, performance, power and time to market. With standards and regulations (e.g., ISO-26262, DO-254, IEC-61508) clearly specify the targets to be achieved and the methods to prove their achievement, techniques working at system level are particularly attracting. On the other hand, Field Programmable Gate Array (FPGA) devices are becoming more and more attractive, also in safety- and mission-critical applications due to the high performance, low power consumption and the flexibility for reconfiguration they provide. Two types of FPGAs are commonly used, based on their configuration memory cell technology, i.e., SRAM-based and Flash-based FPGA. For SRAM-based FPGAs, the SRAM cells of the configuration memory highly susceptible to radiation induced effects which can leads to system failure; and for Flash-based FPGAs, even though their non-volatile configuration memory cells are almost immune to Single Event Upsets induced by energetic particles, the floating gate switches and the logic cells in the configuration tiles can still suffer from Single Event Effects when hit by an highly charged particle. So analysis and mitigation techniques for Single Event Effects on FPGAs are becoming increasingly important in the design flow especially when reliability is one of the main requirements
    • …
    corecore