Assessing Asymmetric Fault-Tolerant Software
The most popular forms of fault tolerance against design faults use "asymmetric" architectures in which a "primary" part performs the computation and a "secondary" part is in charge of detecting errors and performing some kind of error processing and recovery. In contrast, the most studied forms of software fault tolerance are "symmetric" ones, e.g. N-version programming. The latter are often controversial; the former are not. We discuss how to assess the dependability gains achieved by these methods. Substantial difficulties have been shown to exist for symmetric schemes, but we show that the same difficulties affect asymmetric schemes. Indeed, the latter present somewhat subtler problems. In both cases, to predict the dependability of the fault-tolerant system it is not enough to know the dependability of the individual components. We extend to asymmetric architectures the style of probabilistic modeling that has been useful for describing the dependability of "symmetric" architectures, to highlight factors that complicate the assessment. In the light of these models, we finally discuss fault injection approaches to estimating coverage factors. We highlight the limits of what can be predicted and some useful research directions towards clarifying and extending the range of situations in which estimates of coverage of fault tolerance mechanisms can be trusted.
Correct and Control Complex IoT Systems: Evaluation of a Classification for System Anomalies
In practice, interteam communication about system anomalies is often too
imprecise to support troubleshooting and postmortem analysis across the
different teams operating complex IoT systems. We evaluate the quality in use
of an adaptation of IEEE Std. 1044-2009 with the objective of differentiating the
handling of fault detection and fault reaction from the handling of defects and
their correction options. We extended the scope of IEEE Std. 1044-2009
from anomalies related to software only to anomalies related to complex IoT
systems. To evaluate the quality in use of our classification, a study was
conducted at Robert Bosch GmbH. We applied our adaptation to a postmortem
analysis of an IoT solution and evaluated the quality in use by conducting
interviews with three stakeholders. Our adaptation was effectively applied and
interteam communications as well as iterative and inductive learning for
product improvement were enhanced. Further training and practice are required. Comment: Submitted to QRS 2020 (IEEE Conference on Software Quality,
Reliability and Security)
Principles of Antifragile Software
The goal of this paper is to study and define the concept of "antifragile
software". For this, I start from Taleb's statement that antifragile systems
love errors, and discuss whether traditional software dependability fits into
this class. The answer is somewhat negative, although adaptive fault tolerance
is antifragile: the system learns something when an error happens, and always
improves. Automatic runtime bug fixing means changing the code in response to
errors; fault injection in production means injecting errors into
business-critical software. I claim that both correspond to antifragility. Finally, I
hypothesize that antifragile development processes are better at producing
antifragile software systems. Comment: see https://refuses.github.io
Intrusion-aware Alert Validation Algorithm for Cooperative Distributed Intrusion Detection Schemes of Wireless Sensor Networks
Existing anomaly and intrusion detection schemes of wireless sensor networks
have mainly focused on the detection of intrusions. Once the intrusion is
detected, an alert or claim is generated. However, any unidentified
malicious nodes in the network could send faulty anomaly and intrusion claims
about the legitimate nodes to the other nodes. Verifying the validity of such
claims is a critical and challenging issue that is not considered in the
existing cooperative-based distributed anomaly and intrusion detection schemes
of wireless sensor networks. In this paper, we propose a validation algorithm
that addresses this problem. This algorithm utilizes the concept of
intrusion-aware reliability that helps to provide adequate reliability at a
modest communication cost. In this paper, we also provide a security resiliency
analysis of the proposed intrusion-aware alert validation algorithm. Comment: 19 pages, 7 figures
Design diversity: an update from research on reliability modelling
Diversity between redundant subsystems is, in various forms, a common design approach for improving system dependability. Its value in the case of software-based systems is still controversial. This paper gives an overview of reliability modelling work we carried out in recent projects on design diversity, presented in the context of previous knowledge and practice. These results provide additional insight for decisions in applying diversity and in assessing diverse-redundant systems. A general observation is that, just as diversity is a very general design approach, the models of diversity can help conceptual understanding of a range of different situations. We summarise results in the general modelling of common-mode failure, in inference from observed failure data, and in decision-making for diversity in development.
The problems of assessing software reliability ...When you really need to depend on it
This paper looks at the ways in which the reliability of software can be assessed and predicted. It shows that the levels of reliability that can be claimed with scientific justification are relatively modest.