1,259 research outputs found

    Environmental-Based Characterization of SoC-Based Instrumentation Systems for Stratified Testing

    Get PDF
    This paper proposes a novel environmental-based method for evaluating the good yield rate (GYR) of systems-on-chip (SoC) during fabrication. Testing and yield evaluation at high confidence are two of the most critical issues for the success of SoC as a viable technology. The proposed method relies on different features of fabrication, which are quantified by the so-called Fabrication environmental parameters (EPs). EPs can be highly correlated to the yield, so they are analyzed using statistical methods to improve its accuracy and ultimately direct the test process to an efficient execution. The novel contributions of the proposed method are: 1) to establish an adequate theoretical foundation for understanding the fabrication process of SoCs together with an assurance of the yield at a high confidence level and 2) to ultimately provide a realistic approach to SoC testing with an accurate yield evaluation. Simulations are provided to demonstrate that the proposed method significantly improves the confidence interval of the estimated yield as compared with existing testing methodologies such as random testing (RT)

    Experimental analysis of computer system dependability

    Get PDF
    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance

    Cost modelling and concurrent engineering for testable design

    Get PDF
    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.As integrated circuits and printed circuit boards increase in complexity, testing becomes a major cost factor of the design and production of the complex devices. Testability has to be considered during the design of complex electronic systems, and automatic test systems have to be used in order to facilitate the test. This fact is now widely accepted in industry. Both design for testability and the usage of automatic test systems aim at reducing the cost of production testing or, sometimes, making it possible at all. Many design for testability methods and test systems are available which can be configured into a production test strategy, in order to achieve high quality of the final product. The designer has to select from the various options for creating a test strategy, by maximising the quality and minimising the total cost for the electronic system. This thesis presents a methodology for test strategy generation which is based on consideration of the economics during the life cycle of the electronic system. This methodology is a concurrent engineering approach which takes into account all effects of a test strategy on the electronic system during its life cycle by evaluating its related cost. This objective methodology is used in an original test strategy planning advisory system, which allows for test strategy planning for VLSI circuits as well as for digital electronic systems. The cost models which are used for evaluating the economics of test strategies are described in detail and the test strategy planning system is presented. A methodology for making decisions which are based on estimated costing data is presented. Results of using the cost models and the test strategy planning system for evaluating the economics of test strategies for selected industrial designs are presented

    Quantifiable Assurance: From IPs to Platforms

    Get PDF
    Hardware vulnerabilities are generally considered more difficult to fix than software ones because they are persistent after fabrication. Thus, it is crucial to assess the security and fix the vulnerabilities at earlier design phases, such as Register Transfer Level (RTL) and gate level. The focus of the existing security assessment techniques is mainly twofold. First, they check the security of Intellectual Property (IP) blocks separately. Second, they aim to assess the security against individual threats considering the threats are orthogonal. We argue that IP-level security assessment is not sufficient. Eventually, the IPs are placed in a platform, such as a system-on-chip (SoC), where each IP is surrounded by other IPs connected through glue logic and shared/private buses. Hence, we must develop a methodology to assess the platform-level security by considering both the IP-level security and the impact of the additional parameters introduced during platform integration. Another important factor to consider is that the threats are not always orthogonal. Improving security against one threat may affect the security against other threats. Hence, to build a secure platform, we must first answer the following questions: What additional parameters are introduced during the platform integration? How do we define and characterize the impact of these parameters on security? How do the mitigation techniques of one threat impact others? This paper aims to answer these important questions and proposes techniques for quantifiable assurance by quantitatively estimating and measuring the security of a platform at the pre-silicon stages. We also touch upon the term security optimization and present the challenges for future research directions

    High Confidence Testing for Instrumentation System-on-Chip with Unknown-Good-Yield

    Get PDF
    SoCs are in general built with embedded IP cores, each of which is procured from different IP providers with no prior information on known-good-yield (KGY). In practice, partial testing is a practical choice for assuring the yield of the product under the stringent time-to-market requirements. Therefore, a proper sampling technique is a key to high confidence testing and cost effectiveness. Based on previous research, this paper proposes a novel statistical testing technique for increasingly hybrid integrated systems fabricated on a single silicon die with no a-priori empirical yield data. This problem is referred to as the unknown-good-yield (UKGY) problem. The proposed testing method, referred to as regressive testing (RegT) in this paper, exploits another way around by using parameters (referred to as assistant variables (AV)) that are employed to evaluate the yields of randomly sampled SoCs and thereby estimating the good yield by using a regression analysis method with regard to confidence intervals. Numerous simulations are conducted to demonstrate the efficiency and effectiveness of the proposed RegT in comparison to characterization-based testing methods

    Statistical Reliability Estimation of Microprocessor-Based Systems

    Get PDF
    What is the probability that the execution state of a given microprocessor running a given application is correct, in a certain working environment with a given soft-error rate? Trying to answer this question using fault injection can be very expensive and time consuming. This paper proposes the baseline for a new methodology, based on microprocessor error probability profiling, that aims at estimating fault injection results without the need of a typical fault injection setup. The proposed methodology is based on two main ideas: a one-time fault-injection analysis of the microprocessor architecture to characterize the probability of successful execution of each of its instructions in presence of a soft-error, and a static and very fast analysis of the control and data flow of the target software application to compute its probability of success. The presented work goes beyond the dependability evaluation problem; it also has the potential to become the backbone for new tools able to help engineers to choose the best hardware and software architecture to structurally maximize the probability of a correct execution of the target softwar

    Accurate Computation of Field Reject Ratio Based on Fault Latency

    Get PDF
    The field reject ratio, the fraction of defective devices that pass the acceptance test, is a measure of the quality of the tested product. Although the assessment of quality is important, an accurate measurement of the field reject ratio of tested VLSI chips is often not feasible. We show that the known methods of field reject ratio prediction are not accurate since they fail to realistically model the process of testing. We model the detection of a fault by an input test vector as a random event. However, we recognize that the detection of a fault may be delayed for various reasons: the fault may be detectable only by application of a sequence of vectors or it may not have been targeted until later. In our statistical model, a fault is characterized by two parameters: a per-vector detection probability and an integer-valued latency. Irrespective of the detection probability, the fault cannot be detected by a vector sequence shorter than its latency. The circuit is characterized by the joint distribution of latency and detection probability over all faults. This distribution, obtained by applying the Bayes’ rule to the actual test data, enables us to compute the field reject ratio. The sensitivity of this approach to variations in the measured parameters is also investigated

    Accurate Computation of Field Reject Ratio Based on Fault Latency

    Get PDF
    The field reject ratio, the fraction of defective devices that pass the acceptance test, is a measure of the quality of the tested product. Although the assessment of quality is important, an accurate measurement of the field reject ratio of tested VLSI chips is often not feasible. We show that the known methods of field reject ratio prediction are not accurate since they fail to realistically model the process of testing. We model the detection of a fault by an input test vector as a random event. However, we recognize that the detection of a fault may be delayed for various reasons: the fault may be detectable only by application of a sequence of vectors or it may not have been targeted until later. In our statistical model, a fault is characterized by two parameters: a per-vector detection probability and an integer-valued latency. Irrespective of the detection probability, the fault cannot be detected by a vector sequence shorter than its latency. The circuit is characterized by the joint distribution of latency and detection probability over all faults. This distribution, obtained by applying the Bayes’ rule to the actual test data, enables us to compute the field reject ratio. The sensitivity of this approach to variations in the measured parameters is also investigated

    AI/ML Algorithms and Applications in VLSI Design and Technology

    Full text link
    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations

    Susceptible Workload Evaluation and Protection using Selective Fault Tolerance

    Get PDF
    This is an Open Access article distributed under the terms of the Creative Commons Attribution International License CC-BY 4.0 ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.Low power fault tolerance design techniques trade reliability to reduce the area cost and the power overhead of integrated circuits by protecting only a subset of their workload or their most vulnerable parts. However, in the presence of faults not all workloads are equally susceptible to errors. In this paper, we present a low power fault tolerance design technique that selects and protects the most susceptible workload. We propose to rank the workload susceptibility as the likelihood of any error to bypass the logic masking of the circuit and propagate to its outputs. The susceptible workload is protected by a partial Triple Modular Redundancy (TMR) scheme. We evaluate the proposed technique on timing-independent and timing-dependent errors induced by permanent and transient faults. In comparison with unranked selective fault tolerance approach, we demonstrate a) a similar error coverage with a 39.7% average reduction of the area overhead or b) a 86.9% average error coverage improvement for a similar area overhead. For the same area overhead case, we observe an error coverage improvement of 53.1% and 53.5% against permanent stuck-at and transition faults, respectively, and an average error coverage improvement of 151.8% and 89.0% against timing-dependent and timing-independent transient faults, respectively. Compared to TMR, the proposed technique achieves an area and power overhead reduction of 145.8% to 182.0%.Peer reviewedFinal Published versio
    corecore