Search CORE

1,033 research outputs found

A fault-tolerant multiprocessor architecture for aircraft, volume 1

Author: Ausrotas R. A.
Hanley L. D.
Hopkins A. L.
Lala J. H.
Martin J. H.
Smith T. B.
Taylor W.
Publication venue
Publication date
Field of study

A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed

NASA Technical Reports Server

ATMP: An Adaptive Tolerance-based Mixed-criticality Protocol for Multi-core Systems

Author: Iacovelli Saverio
Kirner Raimund
Menon Catherine
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/06/2018
Field of study

© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted ncomponent of this work in other works.The challenge of mixed-criticality scheduling is to keep tasks of higher criticality running in case of resource shortages caused by faults. Traditionally, mixedcriticality scheduling has focused on methods to handle faults where tasks overrun their optimistic worst-case execution time (WCET) estimate. In this paper we present the Adaptive Tolerance based Mixed-criticality Protocol (ATMP), which generalises the concept of mixed-criticality scheduling to handle also faults of other nature, like failure of cores in a multi-core system. ATMP is an adaptation method triggered by resource shortage at runtime. The first step of ATMP is to re-partition the task to the available cores and the second step is to optimise the utility at each core using the tolerance-based real-time computing model (TRTCM). The evaluation shows that the utility optimisation of ATMP can achieve a smoother degradation of service compared to just abandoning tasks

Crossref

University of Hertfordshire Research Archive

Adaptive Fault Tolerance and Graceful Degradation Under Dynamic Hard Real-time Scheduling

Author: González Oscar
Ramamritham Krithi
Shrikumar H.
Stankovic John A.
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/1997
Field of study

Static redundancy allocation is inappropriate in hard realtime systems that operate in variable and dynamic environments, (e.g., radar tracking, avionics). Adaptive Fault Tolerance (AFT) can assure adequate reliability of critical modules, under temporal and resources constraints, by allocating just as much redundancy to less critical modules as can be afforded, thus gracefully reducing their resource requirement. In this paper, we propose a mechanism for supporting adaptive fault tolerance in a real-time system. Adaptation is achieved by choosing a suitable redundancy strategy for a dynamically arriving computation to assure required reliability and to maximize the potential for fault tolerance while ensuring that deadlines are met. The proposed approach is evaluated using a real-life workload simulating radar tracking software in AWACS early warning aircraft. The results demonstrate that our technique outperforms static fault tolerance strategies in terms of tasks meeting their timing constraints. Further, we show that the gain in this timing-centric performance metric does not reduce the fault tolerance of the executing tasks below a predefined minimum level. Overall, the evaluation indicates that the proposed ideas result in a system that dynamically provides QOS guarantees along the fault-tolerance dimension

CiteSeerX

ScholarWorks@UMass Amherst

A Survey of Research into Mixed Criticality Systems

Author: Burns Alan
Davis Robert Ian
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/11/2017
Field of study

This survey covers research into mixed criticality systems that has been published since Vestal’s seminal paper in 2007, up until the end of 2016. The survey is organised along the lines of the major research areas within this topic. These include single processor analysis (including fixed priority and EDF scheduling, shared resources and static and synchronous scheduling), multiprocessor analysis, realistic models, and systems issues. The survey also explores the relationship between research into mixed criticality systems and other topics such as hard and soft time constraints, fault tolerant scheduling, hierarchical scheduling, cyber physical systems, probabilistic real-time systems, and industrial safety standards

White Rose Research Online

Data-driven extraction and analysis of repairable fault trees from time series data

Author: Lazarova-Molnar Sanja
Niloofar Parisa
Publication venue: Elsevier
Publication date: 12/01/2023
Field of study

Fault tree analysis is a probability-based technique for estimating the risk of an undesired top event, typically a system failure. Traditionally, building a fault tree requires involvement of knowledgeable experts from different fields, relevant for the system under study. Nowadays’ systems, however, integrate numerous Internet of Things (IoT) devices and are able to generate large amounts of data that can be utilized to extract fault trees that reflect the true fault-related behavior of the corresponding systems. This is especially relevant as systems typically change their behaviors during their lifetimes, rendering initial fault trees obsolete. For this reason, we are interested in extracting fault trees from data that is generated from systems during their lifetimes. We present DDFTAnb algorithm for learning fault trees of systems using time series data from observed faults, enhanced with Naïve Bayes classifiers for estimating the future fault-related behavior of the system for unobserved combinations of basic events, where the state of the top event is unknown. Our proposed algorithm extracts repairable fault trees from multinomial time series data, classifies the top event for the unseen combinations of basic events, and then uses proxel-based simulation to estimate the system’s reliability. We, furthermore, assess the sensitivity of our algorithm to different percentages of data availabilities. Results indicate DDFTAnb’s high performance for low levels of data availability, however, when there are sufficient or high amounts of data, there is no need for classifying the top event

KITopen

Fault-free performance validation of fault-tolerant multiprocessors

Author: Czeck Edward W.
Feather Frank E.
Grizzaffi Ann Marie
Segall Zary Z.
Siewiorek Daniel P.
Publication venue
Publication date
Field of study

A validation methodology for testing the performance of fault-tolerant computer systems was developed and applied to the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. This methodology was claimed to be general enough to apply to any ultrareliable computer system. The goal of this research was to extend the validation methodology and to demonstrate the robustness of the validation methodology by its more extensive application to NASA's Fault-Tolerant Multiprocessor System (FTMP) and to the Software Implemented Fault-Tolerance (SIFT) Computer System. Furthermore, the performance of these two multiprocessors was compared by conducting similar experiments. An analysis of the results shows high level language instruction execution times for both SIFT and FTMP were consistent and predictable, with SIFT having greater throughput. At the operating system level, FTMP consumes 60% of the throughput for its real-time dispatcher and 5% on fault-handling tasks. In contrast, SIFT consumes 16% of its throughput for the dispatcher, but consumes 66% in fault-handling software overhead

NASA Technical Reports Server

Numerical aerodynamic simulation facility feasibility study, executive summary

Author
Publication venue
Publication date
Field of study

There were three major issues examined in the feasibility study. First, the ability of the proposed system architecture to support the anticipated workload was evaluated. Second, the throughput of the computational engine (the flow model processor) was studied using real application programs. Third, the availability, reliability, and maintainability of the system were modeled. The evaluations were based on the baseline systems. The results show that the implementation of the Numerical Aerodynamic Simulation Facility, in the form considered, would indeed be a feasible project with an acceptable level of risk. The technology required (both hardware and software) either already exists or, in the case of a few parts, is expected to be announced this year

NASA Technical Reports Server

Problems related to the integration of fault tolerant aircraft electronic systems

Author: Adlakha V.
Alspaugh T. A., Jr.
Bannister J. A.
Triyedi K.
Publication venue
Publication date
Field of study

Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of inegrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included

NASA Technical Reports Server

System-on-Chip design for reliability

Author: Choi Minsu
Publication venue
Publication date: 01/08/2002
Field of study

SHAREOK repository

Design of a fault tolerant airborne digital computer. Volume 1: Architecture

Author: Goldberg J.
Green M. W.
Levitt K. N.
Neumann P. G.
Wensley J. H.
Publication venue
Publication date
Field of study

This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive

NASA Technical Reports Server