771 research outputs found

A methodology for the generation of efficient error detection mechanisms

Author: Anand Sarabjot Singh
Arif Saima
Jhumka Arshad
Leeke Matthew
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2011

A dependable software system must contain error detection mechanisms and error recovery mechanisms. Software components for the detection of errors are typically designed based on a system specification or the experience of software engineers, with their efficiency typically being measured using fault injection and metrics such as coverage and latency. In this paper, we introduce a methodology for the design of highly efficient error detection mechanisms. The proposed methodology combines fault injection analysis and data mining techniques in order to generate predicates for efficient error detection mechanisms. The results presented demonstrate the viability of the methodology as an approach for the development of efficient error detection mechanisms, as the predicates generated yield a true positive rate of almost 100% and a false positive rate very close to 0% for the detection of failure-inducing states. The main advantage of the proposed methodology over current state-of-the-art approaches is that efficient detectors are obtained by design, rather than by using specification-based detector design or the experience of software engineers.
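As a rough illustration of the two rates quoted in the abstract, the sketch below scores a detector predicate against states labelled as failure-inducing or not. The predicate `out_of_range`, the `counter` field, and the labelled states are all hypothetical and are not taken from the paper; only the definitions of true positive rate and false positive rate are standard.

```python
# Illustrative only: the predicate and the labelled states below are made up,
# not taken from the paper.

def detection_rates(predicate, states):
    """Return (true_positive_rate, false_positive_rate) for a detector predicate.

    `states` is a list of (state, failure_inducing) pairs, for example as
    labelled by a fault injection campaign.
    """
    tp = fp = fn = tn = 0
    for state, failure_inducing in states:
        flagged = predicate(state)
        if flagged and failure_inducing:
            tp += 1
        elif flagged and not failure_inducing:
            fp += 1
        elif not flagged and failure_inducing:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical predicate: flag any state whose counter leaves [0, 100].
def out_of_range(state):
    return not (0 <= state["counter"] <= 100)


labelled_states = [
    ({"counter": 5}, False),
    ({"counter": 250}, True),
    ({"counter": -3}, True),
    ({"counter": 99}, False),
    ({"counter": 101}, False),  # flagged but harmless: a false positive
    ({"counter": 50}, True),    # not flagged: a false negative
]

tpr, fpr = detection_rates(out_of_range, labelled_states)
```

A predicate of the kind the abstract describes would push `tpr` towards 1.0 and `fpr` towards 0.0 on such labelled data.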

Warwick Research Archives Portal Repository

Beyond the golden run : evaluating the use of reference run models in fault injection analysis

Author: Jhumka Arshad
Leeke Matthew
Publication venue: University of Leeds. School of Computing
Publication date: 01/01/2009

Fault injection (FI) has been shown to be an effective approach to assessing the dependability of software systems. To determine the impact of faults injected during FI, an oracle is needed. This oracle can take a variety of forms; prominent oracles include (i) specifications, (ii) error detection mechanisms and (iii) golden runs. Focusing on golden runs, in this paper we show that there are classes of software that a golden-run-based approach cannot be used to analyse. Specifically, we demonstrate that a golden-run-based approach cannot be used when analysing systems that employ a main control loop with an irregular period. Further, we show how a simple model, refined using FI, can be employed as an oracle in the analysis of such a system.
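A minimal sketch of a golden-run oracle, assuming traces are recorded as (timestamp, output) pairs, shows one way an irregular loop period could defeat a naive comparison. The trace format, the function name, and the values are hypothetical; the paper's own argument and model may differ.

```python
def deviates_from_golden_run(golden, observed):
    """Golden-run oracle: flag a run whose trace differs from the reference."""
    return len(golden) != len(observed) or any(
        g != o for g, o in zip(golden, observed)
    )


# Hypothetical traces: (loop iteration timestamp in ms, output value).
golden = [(0, 1), (10, 2), (20, 3)]
regular_fault_free = [(0, 1), (10, 2), (20, 3)]
irregular_fault_free = [(0, 1), (13, 2), (21, 3)]  # same outputs, jittered period

regular_flagged = deviates_from_golden_run(golden, regular_fault_free)
irregular_flagged = deviates_from_golden_run(golden, irregular_fault_free)
```

Here the fault-free run with a jittered period is flagged as a deviation even though no fault was injected, so under these assumptions the comparison cannot separate timing variation from fault effects.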

CiteSeerX

Warwick Research Archives Portal Repository

Integration of formal fault analysis in ASSERT: Case studies and lessons learnt

Author: Bieber P.
Blanquart J.
Conquet E.
Durrieu G.
Lesens D.
Lucotte J.
Seguin C.
Tardy F.
Turin M.
Publication venue: HAL CCSD
Publication date: 29/01/2008

The ASSERT European Integrated Project (Automated proof-based System and Software Engineering for Real-Time systems; EC FP6, IST-004033) has investigated, developed and experimented with advanced methods, based on the AltaRica language and its support tool OCAS, for describing and analysing architecture and fault propagation, and has integrated them into the complete ASSERT process. The paper describes lessons learnt from three case studies: safety critical spacecraft, autonomous deep exploration spacecraft, and civil aircraft.

Benchmarking tests on recovery oriented computing

Author: Raman Nandita
Publication date: 09/07/2012

Benchmarks have played a very important role in guiding the progress of computer science systems in various ways, and they have a major role to play in autonomous environments in particular. System crashes and software failures are a basic part of a software system's life cycle, and the main purpose of recovery oriented computing is to overcome them, or at least to make the system as little vulnerable to them as possible. This is usually done by trying to reduce downtime by automatically and efficiently recovering from a broad class of transient software failures without having to modify applications. There have been various types of benchmarks for recovering from a failure, but in this paper we intend to create a benchmark framework, called the warning benchmarks, to measure and evaluate recovery oriented systems. It consists of the known and the unknown failures and a few benchmark techniques, which the warning benchmarks handle with the help of various other techniques in software fault analysis.

Texas ScholarWorks

An Experimental Evaluation of the REE SIFT Environment for Spaceborne Applications

Author: Iyer R.K.
Jones P.
Kalbarczyk Z.
Whisnant K.
Publication venue: Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Publication date: 01/04/2002

Coordinated Science Laboratory was formerly known as Control Systems Laboratory

Illinois Digital Environment for Access to Learning and Scholarship Repository

Fault Tolerance Against Design Faults

Author: Strigini L.
Publication venue: 'Wiley'
Publication date: 01/01/2005

City Research Online

An Adaptive Resilience Testing Framework for Microservice Systems

Author: Lee Cheryl
Lyu Michael R.
Shen Jiacheng
Su Yuxin
Yang Tianyi
Yang Yongqiang
Publication date: 24/12/2022

Resilience testing, which measures the ability to minimize service degradation caused by unexpected failures, is crucial for microservice systems. The current practice for resilience testing relies on manually defining rules for different microservice systems. Due to the diverse business logic of microservices, there are no one-size-fits-all microservice resilience testing rules. As the number and dynamics of microservices and failures increase, manual configuration runs into scalability and adaptivity issues. To overcome these two issues, we empirically compare the impacts of common failures in the resilient and unresilient deployments of a benchmark microservice system. Our study demonstrates that the resilient deployment can block the propagation of degradation from system performance metrics (e.g., memory usage) to business metrics (e.g., response latency). In this paper, we propose AVERT, the first AdaptiVE Resilience Testing framework for microservice systems. AVERT first injects failures into microservices and collects available monitoring metrics. Then AVERT ranks all the monitoring metrics according to their contributions to the overall service degradation caused by the injected failures. Lastly, AVERT produces a resilience index based on how much the degradation in system performance metrics propagates to the degradation in business metrics. The higher the degradation propagation, the lower the resilience of the microservice system. We evaluate AVERT on two open-source benchmark microservice systems. The experimental results show that AVERT can accurately and efficiently test the resilience of microservice systems.
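The abstract does not give AVERT's formula, so the following is only a toy sketch of the stated idea that more degradation propagation means lower resilience. The functions `relative_degradation` and `toy_resilience_index`, the ratio used for propagation, and the metric values are all assumptions made for illustration, not the authors' method.

```python
def relative_degradation(baseline, under_failure):
    """Fractional worsening of a 'lower is better' metric, floored at 0."""
    return max(0.0, (under_failure - baseline) / baseline)


def toy_resilience_index(system_metric, business_metric):
    """Toy index in [0, 1]; each argument is (baseline, under_failure).

    Assumption: propagation is the ratio of business-metric degradation to
    system-metric degradation, capped at 1. This is NOT AVERT's definition.
    """
    sys_deg = relative_degradation(*system_metric)
    biz_deg = relative_degradation(*business_metric)
    if sys_deg == 0.0:
        return 1.0  # no system-level degradation, so nothing to propagate
    propagation = min(1.0, biz_deg / sys_deg)
    return 1.0 - propagation


# Hypothetical readings: memory usage (%) and response latency (ms).
resilient = toy_resilience_index((40.0, 80.0), (100.0, 110.0))
unresilient = toy_resilience_index((40.0, 80.0), (100.0, 200.0))
```

In this toy setting the deployment whose latency barely moves under the same memory pressure scores close to 1, while the one whose latency doubles scores 0.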

arXiv.org e-Print Archive