8,924 research outputs found

    Innovative Techniques for Testing and Diagnosing SoCs

    Get PDF
    We rely upon the continued functioning of many electronic devices for our everyday welfare, usually embedding integrated circuits that are becoming even cheaper and smaller with improved features. Nowadays, microelectronics can integrate a working computer with CPU, memories, and even GPUs on a single die, namely System-On-Chip (SoC). SoCs are also employed on automotive safety-critical applications, but need to be tested thoroughly to comply with reliability standards, in particular the ISO26262 functional safety for road vehicles. The goal of this PhD. thesis is to improve SoC reliability by proposing innovative techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals, and GPUs. The proposed approaches in the sequence appearing in this thesis are described as follows: 1. Embedded Memory Diagnosis: Memories are dense and complex circuits which are susceptible to design and manufacturing errors. Hence, it is important to understand the fault occurrence in the memory array. In practice, the logical and physical array representation differs due to an optimized design which adds enhancements to the device, namely scrambling. This part proposes an accurate memory diagnosis by showing the efforts of a software tool able to analyze test results, unscramble the memory array, map failing syndromes to cell locations, elaborate cumulative analysis, and elaborate a final fault model hypothesis. Several SRAM memory failing syndromes were analyzed as case studies gathered on an industrial automotive 32-bit SoC developed by STMicroelectronics. The tool displayed defects virtually, and results were confirmed by real photos taken from a microscope. 2. Functional Test Pattern Generation: The key for a successful test is the pattern applied to the device. They can be structural or functional; the former usually benefits from embedded test modules targeting manufacturing errors and is only effective before shipping the component to the client. The latter, on the other hand, can be applied during mission minimally impacting on performance but is penalized due to high generation time. However, functional test patterns may benefit for having different goals in functional mission mode. Part III of this PhD thesis proposes three different functional test pattern generation methods for CPU cores embedded in SoCs, targeting different test purposes, described as follows: a. Functional Stress Patterns: Are suitable for optimizing functional stress during I Operational-life Tests and Burn-in Screening for an optimal device reliability characterization b. Functional Power Hungry Patterns: Are suitable for determining functional peak power for strictly limiting the power of structural patterns during manufacturing tests, thus reducing premature device over-kill while delivering high test coverage c. Software-Based Self-Test Patterns: Combines the potentiality of structural patterns with functional ones, allowing its execution periodically during mission. In addition, an external hardware communicating with a devised SBST was proposed. It helps increasing in 3% the fault coverage by testing critical Hardly Functionally Testable Faults not covered by conventional SBST patterns. An automatic functional test pattern generation exploiting an evolutionary algorithm maximizing metrics related to stress, power, and fault coverage was employed in the above-mentioned approaches to quickly generate the desired patterns. The approaches were evaluated on two industrial cases developed by STMicroelectronics; 8051-based and a 32-bit Power Architecture SoCs. Results show that generation time was reduced upto 75% in comparison to older methodologies while increasing significantly the desired metrics. 3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices are suitable for generating structural patterns, testing and activating mitigation techniques, and validating robust hardware and software applications. GPGPUs are known for fast parallel computation used in high performance computing and advanced driver assistance where reliability is the key point. Moreover, GPGPU manufacturers do not provide design description code due to content secrecy. Therefore, commercial fault injectors using the GPGPU model is unfeasible, making radiation tests the only resource available, but are costly. In the last part of this thesis, we propose a software implemented fault injector able to inject bit-flip in memory elements of a real GPGPU. It exploits a software debugger tool and combines the C-CUDA grammar to wisely determine fault spots and apply bit-flip operations in program variables. The goal is to validate robust parallel algorithms by studying fault propagation or activating redundancy mechanisms they possibly embed. The effectiveness of the tool was evaluated on two robust applications: redundant parallel matrix multiplication and floating point Fast Fourier Transform

    Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 2: Army fault tolerant architecture design and analysis

    Get PDF
    Described here is the Army Fault Tolerant Architecture (AFTA) hardware architecture and components and the operating system. The architectural and operational theory of the AFTA Fault Tolerant Data Bus is discussed. The test and maintenance strategy developed for use in fielded AFTA installations is presented. An approach to be used in reducing the probability of AFTA failure due to common mode faults is described. Analytical models for AFTA performance, reliability, availability, life cycle cost, weight, power, and volume are developed. An approach is presented for using VHSIC Hardware Description Language (VHDL) to describe and design AFTA's developmental hardware. A plan is described for verifying and validating key AFTA concepts during the Dem/Val phase. Analytical models and partial mission requirements are used to generate AFTA configurations for the TF/TA/NOE and Ground Vehicle missions

    Advanced information processing system for advanced launch system: Avionics architecture synthesis

    Get PDF
    The Advanced Information Processing System (AIPS) is a fault-tolerant distributed computer system architecture that was developed to meet the real time computational needs of advanced aerospace vehicles. One such vehicle is the Advanced Launch System (ALS) being developed jointly by NASA and the Department of Defense to launch heavy payloads into low earth orbit at one tenth the cost (per pound of payload) of the current launch vehicles. An avionics architecture that utilizes the AIPS hardware and software building blocks was synthesized for ALS. The AIPS for ALS architecture synthesis process starting with the ALS mission requirements and ending with an analysis of the candidate ALS avionics architecture is described

    Application of advanced technology to space automation

    Get PDF
    Automated operations in space provide the key to optimized mission design and data acquisition at minimum cost for the future. The results of this study strongly accentuate this statement and should provide further incentive for immediate development of specific automtion technology as defined herein. Essential automation technology requirements were identified for future programs. The study was undertaken to address the future role of automation in the space program, the potential benefits to be derived, and the technology efforts that should be directed toward obtaining these benefits

    Airborne Advanced Reconfigurable Computer System (ARCS)

    Get PDF
    A digital computer subsystem fault-tolerant concept was defined, and the potential benefits and costs of such a subsystem were assessed when used as the central element of a new transport's flight control system. The derived advanced reconfigurable computer system (ARCS) is a triple-redundant computer subsystem that automatically reconfigures, under multiple fault conditions, from triplex to duplex to simplex operation, with redundancy recovery if the fault condition is transient. The study included criteria development covering factors at the aircraft's operation level that would influence the design of a fault-tolerant system for commercial airline use. A new reliability analysis tool was developed for evaluating redundant, fault-tolerant system availability and survivability; and a stringent digital system software design methodology was used to achieve design/implementation visibility

    An Adaptive Design Methodology for Reduction of Product Development Risk

    Full text link
    Embedded systems interaction with environment inherently complicates understanding of requirements and their correct implementation. However, product uncertainty is highest during early stages of development. Design verification is an essential step in the development of any system, especially for Embedded System. This paper introduces a novel adaptive design methodology, which incorporates step-wise prototyping and verification. With each adaptive step product-realization level is enhanced while decreasing the level of product uncertainty, thereby reducing the overall costs. The back-bone of this frame-work is the development of Domain Specific Operational (DOP) Model and the associated Verification Instrumentation for Test and Evaluation, developed based on the DOP model. Together they generate functionally valid test-sequence for carrying out prototype evaluation. With the help of a case study 'Multimode Detection Subsystem' the application of this method is sketched. The design methodologies can be compared by defining and computing a generic performance criterion like Average design-cycle Risk. For the case study, by computing Average design-cycle Risk, it is shown that the adaptive method reduces the product development risk for a small increase in the total design cycle time.Comment: 21 pages, 9 figure

    On-Line Dependability Enhancement of Multiprocessor SoCs by Resource Management

    Get PDF
    This paper describes a new approach towards dependable design of homogeneous multi-processor SoCs in an example satellite-navigation application. First, the NoC dependability is functionally verified via embedded software. Then the Xentium processor tiles are periodically verified via on-line self-testing techniques, by using a new IIP Dependability Manager. Based on the Dependability Manager results, faulty tiles are electronically excluded and replaced by fault-free spare tiles via on-line resource management. This integrated approach enables fast electronic fault detection/diagnosis and repair, and hence a high system availability. The dependability application runs in parallel with the actual application, resulting in a very dependable system. All parts have been verified by simulation

    Space shuttle avionics system

    Get PDF
    The Space Shuttle avionics system, which was conceived in the early 1970's and became operational in the 1980's represents a significant advancement of avionics system technology in the areas of systems and redundacy management, digital data base technology, flight software, flight control integration, digital fly-by-wire technology, crew display interface, and operational concepts. The origins and the evolution of the system are traced; the requirements, the constraints, and other factors which led to the final configuration are outlined; and the functional operation of the system is described. An overall system block diagram is included

    Care 3, Phase 1, volume 1

    Get PDF
    A computer program to aid in accessing the reliability of fault tolerant avionics systems was developed. A simple mathematical expression was used to evaluate the reliability of any redundant configuration over any interval during which the failure rates and coverage parameters remained unaffected by configuration changes. Provision was made for convolving such expressions in order to evaluate the reliability of a dual mode system. A coverage model was also developed to determine the various relevant coverage coefficients as a function of the available hardware and software fault detector characteristics, and subsequent isolation and recovery delay statistics
    • …
    corecore