30,471 research outputs found

    Single-Event Upset Analysis and Protection in High Speed Circuits

    Get PDF
    The effect of single-event transients (SETs) (at a combinational node of a design) on the system reliability is becoming a big concern for ICs manufactured using advanced technologies. An SET at a node of combinational part may cause a transient pulse at the input of a flip-flop and consequently is latched in the flip-flop and generates a soft-error. When an SET conjoined with a transition at a node along a critical path of the combinational part of a design, a transient delay fault may occur at the input of a flip-flop. On the other hand, increasing pipeline depth and using low power techniques such as multi-level power supply, and multi-threshold transistor convert almost all paths in a circuit to critical ones. Thus, studying the behavior of the SET in these kinds of circuits needs special attention. This paper studies the dynamic behavior of a circuit with massive critical paths in the presence of an SET. We also propose a novel flip-flop architecture to mitigate the effects of such SETs in combinational circuits. Furthermore, the proposed architecture can tolerant a single event upset (SEU) caused by particle strike on the internal nodes of a flip-flo

    Experimental analysis of computer system dependability

    Get PDF
    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance

    Integrated analysis of error detection and recovery

    Get PDF
    An integrated modeling and analysis of error detection and recovery is presented. When fault latency and/or error latency exist, the system may suffer from multiple faults or error propagations which seriously deteriorate the fault-tolerant capability. Several detection models that enable analysis of the effect of detection mechanisms on the subsequent error handling operations and the overall system reliability were developed. Following detection of the faulty unit and reconfiguration of the system, the contaminated processes or tasks have to be recovered. The strategies of error recovery employed depend on the detection mechanisms and the available redundancy. Several recovery methods including the rollback recovery are considered. The recovery overhead is evaluated as an index of the capabilities of the detection and reconfiguration mechanisms

    Metastability-Containing Circuits

    Get PDF
    In digital circuits, metastability can cause deteriorated signals that neither are logical 0 or logical 1, breaking the abstraction of Boolean logic. Unfortunately, any way of reading a signal from an unsynchronized clock domain or performing an analog-to-digital conversion incurs the risk of a metastable upset; no digital circuit can deterministically avoid, resolve, or detect metastability (Marino, 1981). Synchronizers, the only traditional countermeasure, exponentially decrease the odds of maintained metastability over time. Trading synchronization delay for an increased probability to resolve metastability to logical 0 or 1, they do not guarantee success. We propose a fundamentally different approach: It is possible to contain metastability by fine-grained logical masking so that it cannot infect the entire circuit. This technique guarantees a limited degree of metastability in---and uncertainty about---the output. At the heart of our approach lies a time- and value-discrete model for metastability in synchronous clocked digital circuits. Metastability is propagated in a worst-case fashion, allowing to derive deterministic guarantees, without and unlike synchronizers. The proposed model permits positive results and passes the test of reproducing Marino's impossibility results. We fully classify which functions can be computed by circuits with standard registers. Regarding masking registers, we show that they become computationally strictly more powerful with each clock cycle, resulting in a non-trivial hierarchy of computable functions

    Cause Clue Clauses: Error Localization using Maximum Satisfiability

    Full text link
    Much effort is spent everyday by programmers in trying to reduce long, failing execution traces to the cause of the error. We present a new algorithm for error cause localization based on a reduction to the maximal satisfiability problem (MAX-SAT), which asks what is the maximum number of clauses of a Boolean formula that can be simultaneously satisfied by an assignment. At an intuitive level, our algorithm takes as input a program and a failing test, and comprises the following three steps. First, using symbolic execution, we encode a trace of a program as a Boolean trace formula which is satisfiable iff the trace is feasible. Second, for a failing program execution (e.g., one that violates an assertion or a post-condition), we construct an unsatisfiable formula by taking the trace formula and additionally asserting that the input is the failing test and that the assertion condition does hold at the end. Third, using MAX-SAT, we find a maximal set of clauses in this formula that can be satisfied together, and output the complement set as a potential cause of the error. We have implemented our algorithm in a tool called bug-assist for C programs. We demonstrate the surprising effectiveness of the tool on a set of benchmark examples with injected faults, and show that in most cases, bug-assist can quickly and precisely isolate the exact few lines of code whose change eliminates the error. We also demonstrate how our algorithm can be modified to automatically suggest fixes for common classes of errors such as off-by-one.Comment: The pre-alpha version of the tool can be downloaded from http://bugassist.mpi-sws.or
    • …
    corecore