Modeling software design diversity
Design diversity has been used for many years as a means of achieving a degree of fault tolerance in software-based systems. Whilst there is clear evidence that the approach can be expected to deliver some increase in reliability compared with a single version, there is no agreement about the extent of this increase. More importantly, it remains difficult to evaluate exactly how reliable a particular diverse fault-tolerant system is. This difficulty arises because assumptions of independence of failures between different versions have been shown not to be tenable: assessment of the actual level of dependence present is therefore needed, and this is hard. In this tutorial we survey the modelling issues here, with an emphasis upon the impact these have upon the problem of assessing the reliability of fault-tolerant systems. The intended audience comprises designers, assessors and project managers with only a basic knowledge of probability, as well as reliability experts without detailed knowledge of software, who seek an introduction to the probabilistic issues in decisions about design diversity.
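As a minimal numeric illustration of why the independence assumption matters (the failure probabilities and covariance below are invented for illustration, not taken from the tutorial), consider a 1-out-of-2 diverse pair:

```python
# Hypothetical numbers, chosen only to illustrate the effect of failure
# dependence on a 1-out-of-2 diverse pair; not taken from the tutorial.

p_a = 1e-3          # probability version A fails on a random demand
p_b = 1e-3          # probability version B fails on a random demand

# Under the (untenable) independence assumption:
p_both_indep = p_a * p_b

# With positive correlation, P(A and B fail) = p_a*p_b + cov(A, B).
# Even a small covariance dominates the independent term.
cov = 5e-5
p_both_corr = p_a * p_b + cov

print(f"independence assumption: {p_both_indep:.1e}")   # 1.0e-06
print(f"with correlation:        {p_both_corr:.1e}")    # 5.1e-05, ~50x worse
```

This is why the abstract insists that the actual level of dependence must be assessed: a small covariance term can dominate the optimistic independent estimate by orders of magnitude.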
Multiversion software reliability through fault-avoidance and fault-tolerance
In this project we have proposed to investigate a number of experimental and theoretical issues associated with the practical use of multi-version software in providing dependable software through fault-avoidance and fault-elimination, as well as run-time tolerance of software faults. In the period reported here we have been working on the following. We have continued collecting data on the relationships between software faults and reliability, and on the coverage provided by the testing process as measured by different metrics (including data-flow metrics). We have continued work on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and the code coverage provided through testing. We have continued studying back-to-back testing as an efficient mechanism for the removal of uncorrelated faults and of common-cause faults of variable span, and as a tool for improving the software change process, including regression testing. We have continued investigating existing fault-tolerance models and worked on formulating new ones. In particular, we have partly finished the evaluation of Consensus Voting in the presence of correlated failures, and are in the process of finishing the evaluation of the Consensus Recovery Block (CRB) under failure correlation. We find both approaches far superior to the commonly employed fixed-agreement-number voting (usually majority voting). We have also finished a cost analysis of the CRB approach.
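The abstract does not spell out the voting schemes it compares, but the essential difference between majority voting and consensus (plurality) voting can be sketched in a few lines; the version outputs below are invented for illustration:

```python
# A minimal sketch of the difference between majority voting and
# consensus (plurality) voting over N version outputs; the outputs
# below are invented for illustration.
from collections import Counter

def majority_vote(outputs):
    """Return the output agreed by a strict majority, else None (no decision)."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

def consensus_vote(outputs):
    """Return the output of the largest agreeing group (plurality)."""
    value, _ = Counter(outputs).most_common(1)[0]
    return value

# Five versions: two agree on the correct answer, three disagree in
# different ways (uncorrelated faults producing distinct wrong values).
outputs = ["42", "42", "17", "99", "3"]
print(majority_vote(outputs))   # None  -> majority voter cannot decide
print(consensus_vote(outputs))  # "42"  -> plurality still recovers the answer
```

A real consensus voter would also need an explicit tie-breaking policy between equally large groups; `Counter.most_common` breaks ties arbitrarily, which is acceptable only in a sketch.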
Empirical evaluation of accuracy of mathematical software used for availability assessment of fault-tolerant computer systems
Dependability assessment is typically based on complex probabilistic models. Markov and semi-Markov models are widely used to model the dependability of complex hardware/software architectures. Solving such models, especially when they are stiff, is not trivial and is usually done using sophisticated mathematical software packages. We report a practical experience of comparing the accuracy of the solutions of stiff Markov models obtained using well-known commercial and research software packages. The study is conducted on a contrived but realistic case study of a computer system with hardware redundancy and diverse software, under the realistic assumption that the failure rate of the software may vary over time. We observe that the disagreement between the solutions obtained with the different packages may be very significant. We discuss these findings and directions for future research.
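To make the stiffness issue concrete, here is a hedged sketch, not the paper's case study, of a tiny stiff availability model: a single repairable unit whose failure and repair rates differ by four orders of magnitude, solved with SciPy's implicit BDF integrator. All rates are invented.

```python
# A minimal sketch (not the paper's model): a single repairable unit with
# a fast repair rate, giving a stiff two-state CTMC. State 0 = up, 1 = down.
# Rates are invented; lam << mu is what makes the system stiff.
import numpy as np
from scipy.integrate import solve_ivp

lam = 1e-4   # failure rate (per hour)
mu  = 1.0    # repair rate (per hour) -- widely separated scales => stiffness

# Generator matrix Q: row sums are zero.
Q = np.array([[-lam,  lam],
              [  mu,  -mu]])

# Transient solution of dp/dt = p Q, p(0) = (1, 0), with a stiff solver (BDF).
sol = solve_ivp(lambda t, p: p @ Q, (0.0, 1000.0), [1.0, 0.0], method="BDF")
print("P(up) at t=1000h:", sol.y[0, -1])

# Steady-state availability for comparison: mu / (lam + mu).
print("steady-state availability:", mu / (lam + mu))
```

An explicit solver would need step sizes on the order of 1/mu even though the interesting dynamics unfold on the 1/lam scale; this gap between time scales is exactly what makes solver accuracy and agreement between packages worth checking.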
Ultra-Reliable Low Latency Communication (URLLC) using Interface Diversity
An important ingredient of the future 5G systems will be Ultra-Reliable Low-Latency Communication (URLLC). A way to offer URLLC without intervention in the baseband/PHY layer design is to use interface diversity and integrate multiple communication interfaces, each interface based on a different technology. In this work, we propose to use coding to seamlessly distribute coded payload and redundancy data across multiple available communication interfaces. We formulate an optimization problem to find the payload allocation weights that maximize the reliability at specific target latency values. In order to estimate the performance in terms of latency and reliability of such an integrated communication system, we propose an analysis framework that combines traditional reliability models with technology-specific latency probability distributions. Our model is capable of accounting for failure correlation among interfaces/technologies. By considering different scenarios, we find that optimized strategies can in some cases significantly outperform strategies based on k-out-of-n erasure codes, where the latter do not account for the characteristics of the different interfaces. The model has been validated through simulation and is supported by experimental results. (Accepted for IEEE Transactions on Communications.)
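The k-out-of-n baseline the abstract mentions can be sketched as follows. Each interface delivers its coded fragment within the latency budget with some probability, and an (n, k) erasure code succeeds if at least k fragments arrive in time. The interface names and probabilities below are assumptions for illustration, and unlike the paper's model this sketch assumes the interfaces fail independently.

```python
# A hedged sketch of the k-out-of-n evaluation idea: each interface i
# delivers its coded fragment within the latency budget with probability
# p[i] (taken here from invented latency-CDF values, not measured data).
# With an (n, k) erasure code, the transmission succeeds if at least k
# of the n fragments arrive in time. Independence is assumed for brevity;
# the paper's framework additionally models failure correlation.
from itertools import combinations
import math

def prob_at_least_k(p, k):
    """P(at least k of n independent fragments arrive), heterogeneous p[i]."""
    n = len(p)
    total = 0.0
    for m in range(k, n + 1):
        for idx in combinations(range(n), m):
            total += math.prod(p[i] if i in idx else 1 - p[i]
                               for i in range(n))
    return total

# Example: three interfaces (say cellular, Wi-Fi, wired) with assumed
# probabilities of meeting a 50 ms deadline.
p = [0.990, 0.950, 0.999]
print(prob_at_least_k(p, k=2))  # reliability if any 2 of 3 fragments suffice
```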
Beam Loss Monitors at LHC
One of the main functions of the LHC beam loss measurement system is the protection of equipment against damage caused by impacting particles creating secondary showers and their energy dissipation in matter. Reliability requirements are scaled according to the acceptable consequences and the frequency of particle impact events on equipment. Increasing reliability often leads to more complex systems. The downside of complexity is a reduction of availability; therefore, an optimum has to be found between these conflicting requirements. A detailed review of selected concepts and solutions for the LHC system is given to show approaches used in various parts of the system, from the sensors, signal processing, and software implementations to the requirements for operation and documentation. (16 pages; contribution to the 2014 Joint International Accelerator School: Beam Loss and Accelerator Protection, Newport Beach, CA, USA, 5-14 Nov 2014.)
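The reliability/availability tension the abstract describes can be illustrated with a toy voting calculation (all probabilities below are assumed, not LHC values): redundant monitor channels voted m-out-of-n catch more real losses when m is low, but also trip more often on spurious signals.

```python
# A toy illustration (numbers assumed, not LHC values) of the tension
# between protection and availability: n redundant monitor channels
# voted m-out-of-n. Lower m improves detection of real losses but
# raises the rate of false beam dumps.
from math import comb

def p_at_least(m, n, p):
    """P(at least m of n independent channels fire), each firing w.p. p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(m, n + 1))

n = 3
p_detect = 0.999   # per-channel probability of seeing a real loss (assumed)
p_false  = 1e-4    # per-channel spurious-trigger probability (assumed)

for m in (1, 2, 3):
    protection = p_at_least(m, n, p_detect)   # real loss triggers a dump
    false_dump = p_at_least(m, n, p_false)    # noise triggers a dump
    print(f"{m}-out-of-{n}: protection={protection:.6f}, "
          f"false dump={false_dump:.2e}")
```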
Study of a unified hardware and software fault-tolerant architecture
A unified architectural concept, called the Fault Tolerant Processor Attached Processor (FTP-AP), that can tolerate hardware as well as software faults is proposed for applications requiring ultrareliable computation capability. An emulation of the FTP-AP architecture, consisting of a breadboard Motorola 68010-based quadruply redundant Fault Tolerant Processor, four VAX 750s as attached processors, and four versions of a transport aircraft yaw damper control law, is used as a testbed in the AIRLAB to examine a number of critical issues. Solutions to several basic problems associated with N-Version software are proposed and implemented on the testbed, including a confidence voter to resolve coincident errors in N-Version software. A reliability model of N-Version software, based upon recent understanding of software failure mechanisms, is also developed. The basic FTP-AP architectural concept appears suitable for hosting N-Version application software while at the same time tolerating hardware failures. Architectural enhancements for greater efficiency, software reliability modeling, and N-Version issues that merit further research are identified.
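The abstract names a confidence voter but does not define it; the following is a hypothetical sketch of one plausible form, in which each version reports an output plus a self-assessed confidence and the voter picks the output whose agreeing group carries the largest total confidence. Function names, tolerances, and values are all invented.

```python
# Hypothetical sketch of a confidence voter (the abstract's voter is not
# specified): each version returns (output, confidence), and the voter
# selects the output whose agreeing group has the largest total confidence,
# which lets it resolve cases where plain majority voting would deadlock.
from collections import defaultdict

def confidence_vote(results, tol=1e-6):
    """results: list of (output, confidence) pairs from the N versions."""
    groups = defaultdict(float)
    for value, conf in results:
        # Bucket numerically-close outputs together (coincident values agree).
        key = round(value / tol) * tol
        groups[key] += conf
    # Return the representative value of the highest-confidence group.
    return max(groups.items(), key=lambda kv: kv[1])[0]

# Three versions of a yaw-damper output: two agree with high confidence.
print(confidence_vote([(0.52, 0.9), (0.52, 0.8), (0.31, 0.4)]))  # -> 0.52
```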
Impact of Equipment Failures and Wind Correlation on Generation Expansion Planning
Generation expansion planning has become a complex problem within a deregulated electricity market environment due to all the uncertainties affecting the profitability of a given investment. Current expansion models usually overlook some of these uncertainties in order to reduce the computational burden. In this paper, we highlight the importance of both equipment failures (units and lines) and wind power correlation for generation expansion decisions. For this purpose, we use a bilevel stochastic optimization problem, which models the sequential and noncooperative game between the generating company (GENCO) and the system operator. The upper-level problem maximizes the GENCO's expected profit, while the lower-level problem simulates an hourly market-clearing procedure, through which locational marginal prices (LMPs) are determined. The uncertainties pertaining to failures and wind power correlation are characterized by a scenario set, and their impact on generation expansion decisions is quantified and discussed for a 24-bus power system.
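To show what "characterized by a scenario set" means in practice, here is a deliberately simplified sketch, not the paper's bilevel model, of evaluating the expected operating profit of one candidate unit over scenarios that combine a wind realization with a possible outage. All prices, capacities, and probabilities are invented.

```python
# A minimal sketch of the scenario idea (not the paper's bilevel model):
# expected GENCO profit for a candidate new unit, averaged over scenarios
# that combine a wind realization with possible equipment outages. All
# numbers (prices, capacities, probabilities) are invented.
scenarios = [
    # (probability, clearing price $/MWh, available capacity MW)
    (0.50, 35.0, 100.0),   # normal operation, average wind
    (0.30, 55.0, 100.0),   # low wind pushes prices up
    (0.15, 60.0,   0.0),   # the new unit itself is on forced outage
    (0.05, 20.0, 100.0),   # high wind depresses prices below cost
]
marginal_cost = 30.0       # $/MWh, assumed
hours = 8760               # one year of identical hours, a gross simplification

expected_profit = sum(
    prob * max(price - marginal_cost, 0.0) * cap * hours
    for prob, price, cap in scenarios
)
print(f"expected annual operating profit: ${expected_profit:,.0f}")
```

The outage scenario illustrates the paper's point: ignoring equipment failures (the third scenario) would overstate the investment's profitability, which is precisely why the scenario set must represent them.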
Stochastic modelling of the effects of interdependencies between critical infrastructure
An approach to Quantitative Interdependency Analysis, in the context of Large Complex Critical Infrastructures, is presented in this paper. A discrete state-space, continuous-time stochastic process models the operation of critical infrastructure, taking interdependencies into account. Of primary interest are the implications of both model detail (that is, the level of model abstraction) and model parameterisation for the study of dependencies. Both of these factors are observed to affect the distribution of cascade sizes within and across infrastructures.
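A hedged sketch of this modelling style (not the paper's model) is a continuous-time Markov chain over the joint state of two interdependent infrastructures, simulated with Gillespie's algorithm; failure of one side raises the failure rate of the other, and we count cascade events. All rates are invented.

```python
# A hedged sketch of the modelling style described: a continuous-time
# Markov chain over the joint state of two interdependent infrastructures,
# simulated with Gillespie's algorithm. Rates are invented; failure of one
# infrastructure raises the failure rate of the other (the interdependency).
import random

def simulate(t_end=1000.0, lam=0.01, lam_dep=0.05, mu=0.5):
    state = [1, 1]            # 1 = up, 0 = down, for infrastructures A and B
    t, cascades = 0.0, 0
    while t < t_end:
        rates = []
        for i in (0, 1):
            other_down = state[1 - i] == 0
            if state[i] == 1:
                # Base failure rate, elevated while the other side is down.
                rates.append((i, lam_dep if other_down else lam))
            else:
                rates.append((i, mu))   # repair
        total = sum(r for _, r in rates)
        t += random.expovariate(total)
        # Pick which transition fires, proportionally to its rate.
        x = random.uniform(0, total)
        for i, r in rates:
            if x <= r:
                if state[i] == 1 and state[1 - i] == 0:
                    cascades += 1       # a failure that follows a failure
                state[i] ^= 1
                break
            x -= r
    return cascades

random.seed(0)
print("cascade events in one run:", simulate())
```

Varying the coupling rate (here `lam_dep`) and the level of state abstraction is the kind of parameterisation and model-detail choice the abstract reports as shaping the cascade-size distribution.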