
    Investigation of Air Transportation Technology at Princeton University, 1989-1990

    The Air Transportation Technology Program at Princeton University proceeded along six avenues during the past year: microburst hazards to aircraft; machine-intelligent, fault-tolerant flight control; computer-aided heuristics for piloted flight; stochastic robustness for flight control systems; neural networks for flight control; and computer-aided control system design. These topics are briefly discussed, and an annotated bibliography of publications that appeared between January 1989 and June 1990 is given.

    Algorithm-Directed Crash Consistence in Non-Volatile Memory for HPC

    Fault tolerance is one of the major design goals for HPC. The emergence of non-volatile memories (NVM) provides a solution for building fault-tolerant HPC systems. Data in NVM-based main memory are not lost when the system crashes because of the non-volatility of NVM. However, because of volatile caches, data must be logged and explicitly flushed from caches into NVM to ensure consistency and correctness before crashes, which can cause large runtime overhead. In this paper, we introduce an algorithm-based method to establish crash consistency in NVM for HPC applications. We slightly extend application data structures or sparsely flush cache blocks, which introduces negligible runtime overhead. Such extension or cache flushing allows us to use algorithm knowledge to reason about data consistency or to correct inconsistent data when the application crashes. We demonstrate the effectiveness of our method for three algorithms: an iterative solver, dense matrix multiplication, and Monte Carlo simulation. Based on comprehensive performance evaluation in a variety of test environments, we demonstrate that our approach has very small runtime overhead (at most 8.2% and less than 3% in most cases), much smaller than that of traditional checkpointing, while having the same or lower recomputation cost.
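    To make the "algorithm knowledge" idea concrete, here is a minimal Python sketch of one way an algorithm can flag inconsistent results after a crash, using the classic checksum trick from algorithm-based fault tolerance for dense matrix multiplication. It illustrates the general idea only, not the paper's actual NVM cache-flushing mechanism; all names and shapes are illustrative.

```python
import numpy as np

def checksummed_matmul(A, B):
    # Append a checksum row to A: the last row of the extended product
    # equals the column sums of C = A @ B, an invariant we can test later.
    Ac = np.vstack([A, A.sum(axis=0)])
    Cc = Ac @ B
    return Cc[:-1], Cc[-1]          # the result C and its column checksums

def find_inconsistent_columns(C, checksum, tol=1e-8):
    # Columns whose sums disagree with the stored checksum were not fully
    # persisted before the crash and must be recomputed.
    return np.where(np.abs(C.sum(axis=0) - checksum) > tol)[0]

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
C, chk = checksummed_matmul(A, B)

C[1, 2] = -1.0                      # simulate a lost (unflushed) update
print(find_inconsistent_columns(C, chk))   # -> [2]
```

    Only the small checksum vector needs to be kept consistent with the computation; the bulk of C can then be verified, and selectively recomputed, after a crash.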

    Supporting group maintenance through prognostics-enhanced dynamic dependability prediction

    Condition-based maintenance strategies adapt maintenance planning by integrating online condition monitoring of assets. The accuracy and cost-effectiveness of these strategies can be improved by integrating prognostic predictions and by grouping maintenance actions, respectively. In complex industrial systems, however, effective condition-based maintenance is intricate. Such systems comprise repairable assets which can fail in different ways, with various effects, and are typically governed by dynamics that include time-dependent and conditional events. In this context, system reliability prediction is complex, and effective maintenance planning is virtually impossible prior to system deployment and hard even in the case of condition-based maintenance. Addressing these issues, this paper presents an online system maintenance method that takes the system dynamics into account. The method employs an online predictive diagnosis algorithm to distinguish between critical and non-critical assets. A prognostics-updated method for predicting system health is then employed to yield well-informed, more accurate, condition-based suggestions for the maintenance of critical assets and for the group-based reactive repair of non-critical assets. The cost-effectiveness of the approach is discussed in a case study from the power industry.
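    As a minimal sketch of the critical/non-critical split, assume prognostics yields a remaining-useful-life (RUL) estimate per asset: assets predicted to fail within the planning horizon are scheduled individually, while the rest are pooled for grouped repair. The function, asset names, and threshold below are illustrative, not the paper's algorithm.

```python
def plan_maintenance(rul_hours, horizon=720.0):
    # Critical: predicted failure falls inside the planning horizon.
    critical = {a: r for a, r in rul_hours.items() if r <= horizon}
    grouped = [a for a in rul_hours if a not in critical]
    # Schedule critical assets individually, earliest predicted failure first;
    # non-critical assets are candidates for grouped, opportunistic repair.
    schedule = sorted(critical, key=critical.get)
    return schedule, grouped

rul = {"pump_A": 120.0, "valve_B": 5000.0, "motor_C": 300.0, "fan_D": 9000.0}
print(plan_maintenance(rul))
# -> (['pump_A', 'motor_C'], ['valve_B', 'fan_D'])
```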

    Fault-tolerant magic state preparation with flag qubits

    Magic state distillation is one of the leading candidates for implementing universal fault-tolerant logical gates. However, the distillation circuits themselves are not fault-tolerant, so there is an additional cost to first implement encoded Clifford gates with negligible error. In this paper we present schemes to fault-tolerantly and directly prepare magic states using flag qubits. One of these schemes uses a single extra ancilla, even with noisy Clifford gates. We compare the physical qubit and gate cost of this scheme to the magic state distillation protocol of Meier, Eastin, and Knill, which is efficient and uses a small stabilizer circuit. We show that in some regimes the overhead can be improved by several orders of magnitude.
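    For background, the "magic" state in question is the standard T state; consuming one copy via gate teleportation with Clifford operations implements one T gate, completing a universal gate set. These are textbook definitions, not anything specific to this paper's flag-qubit scheme:

```latex
|T\rangle = T|+\rangle
          = \frac{1}{\sqrt{2}}\left(|0\rangle + e^{i\pi/4}|1\rangle\right),
\qquad
T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}
```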

    Experimental analysis of computer system dependability

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for the three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: the design phase, the prototype phase, and the operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by a discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools, including FIAT, FERRARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.
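    A minimal sketch of function-level simulated fault injection, the simplest of the approaches surveyed: flip a single random bit in a 64-bit float and compare the faulty run against a fault-free ("golden") run. This illustrates the general technique only, not any particular tool (FOCUS, DEPEND, FIAT, ...) discussed in the paper.

```python
import random
import struct

def flip_random_bit(x: float, rng: random.Random) -> float:
    # Reinterpret the float as 64 raw bits, XOR one random bit
    # (a single-event-upset fault model), and reinterpret back.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits ^= 1 << rng.randrange(64)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

rng = random.Random(7)
data = [float(i) for i in range(1, 101)]
golden = sum(data)                    # fault-free reference result

faulty = list(data)
idx = rng.randrange(len(faulty))
faulty[idx] = flip_random_bit(faulty[idx], rng)   # inject one fault
print(golden, sum(faulty))            # did the fault propagate to the output?
```

    Repeating such an experiment over many injection sites and fault models yields the error-propagation and coverage statistics this kind of study relies on.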

    Techniques for the Fast Simulation of Models of Highly Dependable Systems

    With the ever-increasing complexity and requirements of highly dependable systems, their evaluation during design and operation is becoming more crucial. Realistic models of such systems are often not amenable to analysis using conventional analytic or numerical methods. Therefore, analysts and designers turn to simulation to evaluate these models. However, accurate estimation of the dependability measures of these models requires that the simulation frequently observe system failures, which are rare events in highly dependable systems. This renders ordinary simulation impractical for evaluating such systems. To overcome this problem, simulation techniques based on importance sampling have been developed, and they are very effective in certain settings. When importance sampling works well, simulation run lengths can be reduced by several orders of magnitude when estimating transient as well as steady-state dependability measures. This paper reviews some of the importance-sampling techniques that have been developed in recent years to estimate dependability measures efficiently in Markov and non-Markov models of highly dependable systems.
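    The sketch below shows the core idea on the simplest possible case: estimating the probability that an exponentially distributed failure occurs before a mission time, where the failure rate is tiny. Sampling from an inflated rate ("failure biasing" via an exponential change of measure) and reweighting by the likelihood ratio gives an unbiased estimate with far less variance than ordinary Monte Carlo. The rates and mission time are illustrative values, and this is only the textbook change-of-measure trick, not any specific estimator from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
lam = 1e-6            # true (rare) failure rate per hour -- illustrative
t = 100.0             # mission time in hours
p_exact = 1.0 - np.exp(-lam * t)          # ~1e-4, the rare-event probability

n = 100_000

# Ordinary Monte Carlo: almost no sample sees a failure before t.
T = rng.exponential(1.0 / lam, size=n)
p_mc = np.mean(T < t)

# Importance sampling with failure biasing: sample from a much larger
# rate lam_b, then reweight each sample by the likelihood ratio f/g.
lam_b = 1.0 / t       # biased rate chosen so failures before t are common
Tb = rng.exponential(1.0 / lam_b, size=n)
w = (lam / lam_b) * np.exp((lam_b - lam) * Tb)    # f(Tb) / g(Tb)
p_is = np.mean((Tb < t) * w)

print(p_exact, p_mc, p_is)   # p_is tracks p_exact; p_mc is noisy or zero
```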