The Art of Fault Injection
Classical Greek philosophers considered the foremost virtues to be temperance, justice, courage, and prudence. In this paper we relate these cardinal virtues to the correct methodological approaches that researchers should follow when setting up a fault injection experiment. With this work we try to understand where the "straightforward pathway" lies, in order to highlight the common methodological errors that deeply influence the coherency and the meaningfulness of fault injection experiments. Fault injection is like an art, where the success of the experiments depends on a very delicate balance between modeling, creativity, statistics, and patience.
Software reliability through fault-avoidance and fault-tolerance
The use of back-to-back, or comparison, testing for regression testing or porting is examined. The efficiency and the cost of the strategy are compared with manual and table-driven single version testing. Some of the key parameters that influence the efficiency and the cost of the approach are the failure identification effort during single version program testing, the extent of implemented changes, the nature of the regression test data (e.g., random), and the nature of the inter-version failure correlation and fault-masking. The advantages and disadvantages of the technique are discussed, together with some suggestions concerning its practical use.
Assessment team report on flight-critical systems research at NASA Langley Research Center
The quality, coverage, and distribution of effort of the flight-critical systems research program at NASA Langley Research Center were assessed. Within the scope of the Assessment Team's review, the research program was found to be very sound. All tasks under the current research program were at least partially addressing the industry needs. The general recommendations were to expand the program resources to provide additional coverage of high priority industry needs, including operations and maintenance, and to focus the program on an actual hardware and software system that is under development.
Uncertainty explicit assessment of off-the-shelf software: Selection of an optimal diverse pair
Assessment of software COTS components is an essential part of component-based software development. Sub-optimal selection of components may lead to solutions with low quality. The assessment is based on incomplete knowledge about the COTS components themselves and about other aspects that may affect the choice, such as the vendor's credentials. We argue in favor of assessment methods in which uncertainty is explicitly represented ('uncertainty explicit' methods) using probability distributions. We have adapted a model (developed elsewhere by Littlewood, B. et al. (2000)) for assessment of a pair of COTS components to take account of the fault (bug) logs that might be available for the COTS components being assessed. We also provide empirical data from a study we have conducted with off-the-shelf database servers, which illustrate the use of the method.
Modeling and measurement of fault-tolerant multiprocessors
The workload effects on computer performance are addressed first for a highly reliable unibus multiprocessor used in real-time control. As an approach to studying these effects, a modified Stochastic Petri Net (SPN) is used to describe the synchronous operation of the multiprocessor system. From this model the vital components affecting performance can be determined. However, because of the complexity in solving the modified SPN, a simpler model, i.e., a closed priority queuing network, is constructed that represents the same critical aspects. The use of this model for a specific application requires the partitioning of the workload into job classes. It is shown that the steady state solution of the queuing model directly produces useful results. The use of this model in evaluating an existing system, the Fault Tolerant Multiprocessor (FTMP) at the NASA AIRLAB, is outlined with some experimental results. Also addressed is the technique of measuring fault latency, an important microscopic system parameter. Most related works have assumed no or a negligible fault latency and then performed approximate analyses. To eliminate this deficiency, a new methodology for indirectly measuring fault latency is presented.
Introducing the STAMP method in road tunnel safety assessment
After the serious accidents in European road tunnels over the past decade, many risk assessment methods have been proposed worldwide, most of them based on Quantitative Risk Assessment (QRA). Although QRAs are helpful in addressing the physical aspects and facilities of tunnels, current approaches in the road tunnel field are limited in their ability to model organizational aspects, software behavior and the adaptation of the tunnel system over time. This paper reviews the aforementioned limitations and highlights the need to enhance the safety assessment process of these critical infrastructures with a complementary approach that links the organizational factors to the operational and technical issues, analyzes software behavior and models the dynamics of the tunnel system. To achieve this objective, this paper examines the scope for introducing a safety assessment method which is based on the systems thinking paradigm and draws upon the STAMP model. The proposed method is demonstrated through a case study of a tunnel ventilation system, and the results show that it has the potential to identify scenarios that encompass both the technical system and the organizational structure. However, since the method does not provide quantitative estimations of risk, it is recommended that it be used as a complement to traditional risk assessments rather than as an alternative. (C) 2012 Elsevier Ltd. All rights reserved.
Modeling software design diversity
Design diversity has been used for many years now as a means of achieving a degree of fault tolerance in software-based systems. Whilst there is clear evidence that the approach can be expected to deliver some increase in reliability compared with a single version, there is no agreement about the extent of this. More importantly, it remains difficult to evaluate exactly how reliable a particular diverse fault-tolerant system is. This difficulty arises because assumptions of independence of failures between different versions have been shown not to be tenable: assessment of the actual level of dependence present is therefore needed, and this is hard. In this tutorial we survey the modelling issues here, with an emphasis upon the impact these have upon the problem of assessing the reliability of fault tolerant systems. The intended audience is one of designers, assessors and project managers with only a basic knowledge of probabilities, as well as reliability experts without detailed knowledge of software, who seek an introduction to the probabilistic issues in decisions about design diversity.
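As an illustration of why the independence assumption mentioned in this abstract can fail, here is a minimal sketch. It assumes a simple "varying difficulty" model in which two versions fail independently on any given demand, but some demands are harder than others; the abstract does not name a specific model, and the `difficulty` and `weights` values are hypothetical numbers chosen only for illustration.

```python
# Hypothetical per-demand failure probabilities ("difficulty") that apply to
# any one independently developed version. Made-up values: three easy
# demands and one hard demand.
difficulty = [0.001, 0.001, 0.001, 0.2]

# Assumed demand profile: each demand is equally likely to occur.
weights = [0.25, 0.25, 0.25, 0.25]


def expected(values, probs):
    """Expectation of `values` under the discrete distribution `probs`."""
    return sum(v * p for v, p in zip(values, probs))


# Probability that a single version fails on a randomly chosen demand.
p_single = expected(difficulty, weights)

# Probability that both versions fail on the same randomly chosen demand,
# given that they fail independently *conditional on the demand*.
p_both = expected([d * d for d in difficulty], weights)

# What a naive independence assumption would predict instead.
p_both_if_independent = p_single ** 2

print("single version:", p_single)
print("both versions (varying difficulty):", p_both)
print("both versions (naive independence):", p_both_if_independent)
```

Under these assumptions the coincident failure probability is the mean of the squared difficulty, which is never smaller than the square of the mean, so the naive product underestimates it whenever difficulty varies across demands.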
Evaluation Applied to Reliability Analysis of Reconfigurable, Highly Reliable, Fault-Tolerant, Computing Systems for Avionics
Emulation techniques are proposed as a solution to a difficulty arising in the analysis of the reliability of highly reliable computer systems for future commercial aircraft. The difficulty, viz., the lack of credible precision in reliability estimates obtained by analytical modeling techniques, is established. The difficulty is shown to be an unavoidable consequence of: (1) a high reliability requirement so demanding as to make system evaluation by use testing infeasible, (2) a complex system design technique, fault tolerance, (3) system reliability dominated by errors due to flaws in the system definition, and (4) elaborate analytical modeling techniques whose precision outputs are quite sensitive to errors of approximation in their input data. The technique of emulation is described, indicating how its input is a simple description of the logical structure of a system and its output is the consequent behavior. The use of emulation techniques is discussed for pseudo-testing systems to evaluate bounds on the parameter values needed for the analytical techniques.