916 research outputs found

    Reasoning and Improving on Software Resilience against Unanticipated Exceptions

    Get PDF
    In software, there are the errors anticipated at specification and design time, those encountered at development and testing time, and those that happen in production mode yet never anticipated. In this paper, we aim at reasoning on the ability of software to correctly handle unanticipated exceptions. We propose an algorithm, called short-circuit testing, which injects exceptions during test suite execution so as to simulate unanticipated errors. This algorithm collects data that is used as input for verifying two formal exception contracts that capture two resilience properties. Our evaluation on 9 test suites, with 78% line coverage in average, analyzes 241 executed catch blocks, shows that 101 of them expose resilience properties and that 84 can be transformed to be more resilient

    Experimental analysis of computer system dependability

    Get PDF
    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance

    A study of the relationship between the performance and dependability of a fault-tolerant computer

    Get PDF
    This thesis studies the relationship by creating a tool (FTAPE) that integrates a high stress workload generator with fault injection and by using the tool to evaluate system performance under error conditions. The workloads are comprised of processes which are formed from atomic components that represent CPU, memory, and I/O activity. The fault injector is software-implemented and is capable of injecting any memory addressable location, including special registers and caches. This tool has been used to study a Tandem Integrity S2 Computer. Workloads with varying numbers of processes and varying compositions of CPU, memory, and I/O activity are first characterized in terms of performance. Then faults are injected into these workloads. The results show that as the number of concurrent processes increases, the mean fault latency initially increases due to increased contention for the CPU. However, for even higher numbers of processes (less than 3 processes), the mean latency decreases because long latency faults are paged out before they can be activated

    SEU effect analysis in a open-source router via a distributed fault injection environment

    Get PDF
    The paper presents a detailed error analysis and classification of the behavior of an open-source router when affected by Single Event Upsets (SEUs). The experimental results have been gathered on a real communication network, resorting to an ad-hoc Fault Injection system. The injector has been designed to corrupt the router during its normal service and to analyze the SEU injection effects on the overall distributed system. The performed experiments allowed the authors to identify the most critical memory regions and to cluster the router variables according to their impact on system dependability

    Sustainable Fault-handling Of Reconfigurable Logic Using Throughput-driven Assessment

    Get PDF
    A sustainable Evolvable Hardware (EH) system is developed for SRAM-based reconfigurable Field Programmable Gate Arrays (FPGAs) using outlier detection and group testing-based assessment principles. The fault diagnosis methods presented herein leverage throughput-driven, relative fitness assessment to maintain resource viability autonomously. Group testing-based techniques are developed for adaptive input-driven fault isolation in FPGAs, without the need for exhaustive testing or coding-based evaluation. The techniques maintain the device operational, and when possible generate validated outputs throughout the repair process. Adaptive fault isolation methods based on discrepancy-enabled pair-wise comparisons are developed. By observing the discrepancy characteristics of multiple Concurrent Error Detection (CED) configurations, a method for robust detection of faults is developed based on pairwise parallel evaluation using Discrepancy Mirror logic. The results from the analytical FPGA model are demonstrated via a self-healing, self-organizing evolvable hardware system. Reconfigurability of the SRAM-based FPGA is leveraged to identify logic resource faults which are successively excluded by group testing using alternate device configurations. This simplifies the system architect\u27s role to definition of functionality using a high-level Hardware Description Language (HDL) and system-level performance versus availability operating point. System availability, throughput, and mean time to isolate faults are monitored and maintained using an Observer-Controller model. Results are demonstrated using a Data Encryption Standard (DES) core that occupies approximately 305 FPGA slices on a Xilinx Virtex-II Pro FPGA. With a single simulated stuck-at-fault, the system identifies a completely validated replacement configuration within three to five positive tests. The approach demonstrates a readily-implemented yet robust organic hardware application framework featuring a high degree of autonomous self-control

    Prototype of running clinical trials in an untrustworthy environment using blockchain.

    Get PDF
    Monitoring and ensuring the integrity of data within the clinical trial process is currently not always feasible with the current research system. We propose a blockchain-based system to make data collected in the clinical trial process immutable, traceable, and potentially more trustworthy. We use raw data from a real completed clinical trial, simulate the trial onto a proof of concept web portal service, and test its resilience to data tampering. We also assess its prospects to provide a traceable and useful audit trail of trial data for regulators, and a flexible service for all members within the clinical trials network. We also improve the way adverse events are currently reported. In conclusion, we advocate that this service could offer an improvement in clinical trial data management, and could bolster trust in the clinical research process and the ease at which regulators can oversee trials

    Measuring fault tolerance with the FTAPE fault injection tool

    Get PDF
    This paper describes FTAPE (Fault Tolerance And Performance Evaluator), a tool that can be used to compare fault-tolerant computers. The major parts of the tool include a system-wide fault-injector, a workload generator, and a workload activity measurement tool. The workload creates high stress conditions on the machine. Using stress-based injection, the fault injector is able to utilize knowledge of the workload activity to ensure a high level of fault propagation. The errors/fault ratio, performance degradation, and number of system crashes are presented as measures of fault tolerance

    Injecting software faults in Python applications

    Get PDF
    As técnicas de injeção de falhas de software têm sido amplamente utilizadas como meio para avaliar a confiabilidade de sistemas na presença de certos tipos de falhas. Apesar da grande diversidade de ferramentas que oferecem a possibilidade de emular a presença de falhas de software, há pouco suporte prático para emular a presença de falhas de soft ware em aplicações Python, que cada vez mais são usados para suportar serviços cloud críticos para negócios. Nesta tese, apresentamos uma ferramenta (de nome Fit4Python) para injetar falhas de software em código Python e, de seguida, usamo-la para analisar a eficácia da bateria de testes do OpenStack contra estas novas, prováveis, falhas de software. Começamos por analisar os tipos de falhas que afetam o Nova Compute, um componente central do OpenStack. Usamos a nossa ferramenta para emular a presença de novas falhas na API Nova Compute de forma a entender como a bateria de testes unitários, funcionais e de integração do OpenStack cobre essas novas, mas prováveis, situações. Os resultados mostram limitações claras na eficácia da bateria de testes dos programadores do Open Stack, com muitos casos de falhas injetadas a passarem sem serem detectadas por todos os três tipos de testes. Para além disto, observamos que que a maioria dos problemas analisados poderia ser detectada com mudanças ou acréscimos triviais aos testes unitários
    • …
    corecore