    An optimal fixed-priority assignment algorithm for supporting fault-tolerant hard real-time systems

    The main contribution of this paper is twofold. First, we present an appropriate schedulability analysis, based on response time analysis, for supporting fault-tolerant hard real-time systems. We consider systems that make use of error-recovery techniques to carry out fault tolerance. Second, we propose a new priority assignment algorithm which can be used, together with the schedulability analysis, to improve system fault resilience. These achievements come from the observation that traditional priority assignment policies may no longer be appropriate when faults are being considered. The proposed schedulability analysis takes into account the fact that the recoveries of tasks may be executed at higher priority levels. This characteristic is very important since, after an error, a task certainly has a shorter period of time in which to meet its deadline. The proposed priority assignment algorithm, which exploits some properties of the analysis, is very efficient. We show that the method used to find an appropriate priority assignment reduces the search space from O(n!) to O(n²), where n is the number of task recovery procedures. Also, we show that the priority assignment algorithm is optimal in the sense that it maximizes the fault resilience of task sets with respect to the proposed analysis. The effectiveness of the proposed approach is evaluated by simulation.
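
    As a worked illustration of this kind of analysis, the sketch below iterates a fixed-priority response-time recurrence extended with a recovery term charged once per fault inter-arrival window. The task set, the parameter names (C, T, C_rec, t_fault), and the choice of charging the costliest recovery at or above the task's priority are illustrative assumptions, not the paper's exact analysis.

```python
import math

def response_time(tasks, i, t_fault):
    """Fixed-point response-time recurrence with a recovery term.

    tasks: list of (C, T, C_rec) tuples, highest priority first.
    t_fault: minimum inter-arrival time between faults (assumed).
    Returns the worst-case response time of task i, or None if the
    recurrence grows past the task's period (deadline miss).
    """
    C, T, _ = tasks[i]
    R = C
    while True:
        # Preemption by higher-priority tasks released in [0, R).
        interference = sum(math.ceil(R / Tj) * Cj
                           for Cj, Tj, _ in tasks[:i])
        # Each fault in the busy window may trigger the costliest
        # recovery among tasks at this priority level or above.
        worst_rec = max(rec for _, _, rec in tasks[:i + 1])
        recovery = math.ceil(R / t_fault) * worst_rec
        R_next = C + interference + recovery
        if R_next == R:
            return R          # fixed point reached
        if R_next > T:
            return None       # diverged past the deadline
        R = R_next

# Hypothetical task set: (C, T, recovery cost), highest priority first.
tasks = [(1, 5, 1), (2, 10, 2), (3, 20, 3)]
print([response_time(tasks, i, t_fault=15) for i in range(len(tasks))])
# -> [2, 5, 10]: all tasks meet their deadlines despite recoveries
```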

    Timing analysis for embedded systems using non-preemptive EDF scheduling under bounded error arrivals

    Embedded systems consist of one or more processing units which are completely encapsulated by the devices under their control, and they often have stringent timing constraints associated with their functional specification. Previous research has considered the performance of different types of task scheduling algorithms and developed associated timing analysis techniques for such systems. Although preemptive scheduling techniques have traditionally been favored, rapid increases in processor speeds combined with improved insights into the behavior of non-preemptive scheduling techniques have led to increased interest in their use for real-time applications such as multimedia, automation and control. However, when non-preemptive scheduling techniques are employed, there is a potential lack of error confinement should any timing errors occur in individual software tasks. In this paper, the focus is upon adding fault tolerance in systems using non-preemptive deadline-driven scheduling. Schedulability conditions are derived for fault-tolerant periodic and sporadic task sets experiencing bounded error arrivals under non-preemptive deadline scheduling. A timing analysis algorithm is presented based upon these conditions and its run-time properties are studied. Computational experiments show it to be highly efficient in terms of run-time complexity and competitive ratio when compared to previous approaches.
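
    A rough sketch of this style of schedulability test is below: a Jeffay-style demand condition for non-preemptive EDF with an extra demand term for bounded error arrivals. The parameters t_err (minimum error inter-arrival time) and c_rec (worst-case recovery cost), and the exact inequalities, are assumptions for illustration rather than the conditions derived in the paper.

```python
import math

def np_edf_schedulable(tasks, t_err, c_rec):
    """Jeffay-style non-preemptive EDF test with an added error-demand
    term (illustrative; the paper's derived conditions differ in detail).

    tasks: list of (C, T) with deadline == period.
    t_err: minimum inter-arrival time of errors (assumed).
    c_rec: worst-case recovery cost per error (assumed).
    """
    tasks = sorted(tasks, key=lambda ct: ct[1])   # by period
    # Condition 1: total utilization including error-recovery load.
    if sum(c / t for c, t in tasks) + c_rec / t_err > 1.0:
        return False
    # Condition 2: at every instant L before task i's deadline, its
    # non-preemptive block plus shorter-period demand plus error
    # recovery demand must fit within L.
    for i, (c_i, t_i) in enumerate(tasks):
        for L in range(tasks[0][1], t_i):
            demand = c_i + sum(((L - 1) // t_j) * c_j
                               for c_j, t_j in tasks[:i])
            demand += math.ceil(L / t_err) * c_rec
            if demand > L:
                return False
    return True

# Hypothetical task set: (C, T) pairs.
print(np_edf_schedulable([(1, 4), (1, 5), (2, 10)], t_err=20, c_rec=1))
# -> True under these assumed error parameters
```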

    Leveraging Weakly-hard Constraints for Improving System Fault Tolerance with Functional and Timing Guarantees

    Many safety-critical real-time systems operate in harsh environments and are subject to soft errors caused by transient or intermittent faults. It is critical, and yet often very challenging, to apply fault tolerance techniques in these systems, due to their resource limitations and stringent constraints on timing and functionality. In this work, we leverage the concept of weakly-hard constraints, which allows task deadline misses in a bounded manner, to improve the system's capability to accommodate fault tolerance techniques while ensuring timing and functional correctness. In particular, we 1) quantitatively measure control cost under different deadline hit/miss scenarios and identify weakly-hard constraints that guarantee control stability, 2) employ typical worst-case analysis (TWCA) to bound the number of deadline misses and approximate system control cost, 3) develop an event-based simulation method to check the task execution pattern and evaluate system control cost for any given solution, and 4) develop a meta-heuristic algorithm that consists of heuristic methods and a simulated annealing procedure to explore the design space. Our experiments on an industrial case study and a set of synthetic examples demonstrate the effectiveness of our approach.
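
    The weakly-hard notion itself is straightforward to check against a simulated execution trace. The sketch below tests an (m, K) constraint, at most m deadline misses in any K consecutive jobs, over a hit/miss trace such as one an event-based simulation step might produce; the trace and parameters are hypothetical.

```python
from collections import deque

def satisfies_weakly_hard(miss_trace, m, K):
    """Check an (m, K) weakly-hard constraint on a hit/miss trace:
    at most m deadline misses in any K consecutive jobs.
    miss_trace: iterable of booleans, True meaning a missed deadline.
    """
    window = deque()
    misses = 0
    for missed in miss_trace:
        if len(window) == K:
            misses -= window.popleft()   # slide the K-job window
        window.append(missed)
        misses += missed
        if misses > m:                   # some K-window is violated
            return False
    return True

# Traces from a hypothetical event-based simulation run.
trace = [False, True, False, False, False, True, False, False]
print(satisfies_weakly_hard(trace, m=1, K=4))                 # True
print(satisfies_weakly_hard([True, True, False], m=1, K=4))   # False
```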

    Computer Simulation of PMSM Motor with Five Phase Inverter Control using Signal Processing Techniques

    Signal processing techniques and computer simulation play an important role in the fault diagnosis and fault tolerance of all types of machines at the first design stage. A permanent magnet synchronous motor (PMSM) driven by a five-phase inverter with a sine-wave pulse width modulation (SPWM) strategy is developed, and the PMSM speed is controlled by vector control. In this work, a fault-tolerant control (FTC) system for the PMSM based on wavelet switching is introduced. Exploiting the feature-extraction property of wavelet analysis, the error between the measured signal and its wavelet de-noised version is fed to a decision unit that judges whether the system is healthy. The diagnosis algorithm, which relies on both wavelet analysis and vector control to generate the PWM current commands, handles any parameter variation. An open-end phase PMSM has a larger speed-regulation range than a normal PMSM. Simulation results confirm the validity and effectiveness of the switching strategy.
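
    To illustrate the de-noising error idea, the sketch below uses the PyWavelets package to soft-threshold the detail coefficients of a phase-current signal and treats the energy that survives thresholding as a fault indicator. The wavelet, decomposition level, threshold rule, and synthetic signal are all assumptions, not the paper's tuned design.

```python
import numpy as np
import pywt   # PyWavelets

def fault_indicator(signal, wavelet="db4", level=4):
    """Energy of the noise-thresholded wavelet detail coefficients.
    A smooth healthy current leaves almost nothing after soft
    thresholding, while sharp fault transients survive it."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Universal threshold estimated from the finest-detail band.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(signal)))
    details = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return sum(float(np.sum(d * d)) for d in details)

# Synthetic 50 Hz phase current sampled at 10 kHz, with and without
# an injected transient (both entirely hypothetical).
t = np.linspace(0, 0.2, 2000, endpoint=False)
rng = np.random.default_rng(1)
healthy = np.sin(2 * np.pi * 50 * t) + 0.05 * rng.standard_normal(t.size)
faulty = healthy.copy()
faulty[1000:1020] += 1.5          # hypothetical fault transient
print(fault_indicator(healthy) < fault_indicator(faulty))   # True
```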

    Improving Performance of Iterative Methods by Lossy Checkpointing

    Iterative methods are commonly used approaches for solving large, sparse linear systems, which are fundamental operations in many modern scientific simulations. When large-scale iterative methods run with a large number of ranks in parallel, they have to checkpoint their dynamic variables periodically in case of unavoidable fail-stop errors, requiring fast I/O systems and large storage space. To this end, significantly reducing the checkpointing overhead is critical to improving the overall performance of iterative methods. Our contribution is fourfold. (1) We propose a novel lossy checkpointing scheme that can significantly improve the checkpointing performance of iterative methods by leveraging lossy compressors. (2) We formulate a lossy checkpointing performance model and theoretically derive an upper bound on the extra number of iterations caused by the distortion of data in lossy checkpoints, in order to guarantee the performance improvement under the lossy checkpointing scheme. (3) We analyze the impact of lossy checkpointing (i.e., the extra iterations caused by lossy checkpointing files) for multiple types of iterative methods. (4) We evaluate the lossy checkpointing scheme with optimal checkpointing intervals in a high-performance computing environment with 2,048 cores, using the well-known scientific computation package PETSc and a state-of-the-art checkpoint/restart toolkit. Experiments show that our optimized lossy checkpointing scheme can significantly reduce the fault tolerance overhead of iterative methods by 23% to 70% compared with traditional checkpointing and by 20% to 58% compared with lossless-compressed checkpointing, in the presence of system failures.
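
    The core mechanism, checkpointing a solver's dynamic variables through an error-bounded lossy compressor and absorbing the bounded distortion as extra iterations, can be sketched as follows. The uniform quantizer stands in for a real compressor such as SZ, and the Jacobi solver, failure point, and checkpoint interval are illustrative.

```python
import numpy as np

def compress(x, tol=1e-4):
    """Stand-in for an error-bounded lossy compressor (e.g. SZ):
    uniform quantization with point-wise error at most tol."""
    return np.round(x / (2 * tol)).astype(np.int64), tol

def decompress(q, tol):
    return q * (2 * tol)

def jacobi_lossy_ckpt(A, b, iters=200, ckpt_every=20, fail_at=95):
    """Jacobi iteration that checkpoints its iterate lossily; on a
    simulated fail-stop error it restarts from the last checkpoint,
    paying only some extra iterations for the bounded distortion."""
    D = np.diag(A)
    R = A - np.diag(D)
    x = np.zeros_like(b)
    ckpt = compress(x)
    for k in range(iters):
        if k == fail_at:
            x = decompress(*ckpt)        # recover the lossy state
        x = (b - R @ x) / D
        if k % ckpt_every == 0:
            ckpt = compress(x)
    return x

# Hypothetical diagonally dominant system.
rng = np.random.default_rng(0)
A = rng.random((50, 50)) + 50 * np.eye(50)
b = rng.random(50)
x = jacobi_lossy_ckpt(A, b)
print(np.linalg.norm(A @ x - b) < 1e-8)  # converged despite the failure
```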

    Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs

    Previous work has shown that there are two major complexity barriers in the synthesis of fault-tolerant distributed programs: (1) generation of the fault-span, the set of states reachable in the presence of faults, and (2) resolving deadlock states, from which the program has no outgoing transitions. Of these, the former closely resembles model checking and, hence, techniques for efficient verification are directly applicable to it. We therefore focus on expediting the latter with the use of multi-core technology. We present two approaches to parallelization based on different design choices. The first approach is based on the computation of equivalence classes of program transitions (called group computation) that are needed due to the issue of distribution (i.e., the inability of processes to atomically read and write all program variables). We show that in most cases the speedup of this approach is close to the ideal speedup, and in some cases it is superlinear. The second approach uses the traditional technique of partitioning deadlock states among multiple threads. However, our experiments show that the speedup for this approach is small. Consequently, our analysis demonstrates that the simple approach of parallelizing the group computation is likely to be the more effective method for using multi-core computing in the context of deadlock resolution.
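
    A toy explicit-state rendering of the group computation shows why it parallelizes so well: the group of each candidate transition can be expanded independently of all others. The state encoding, variable domains, and thread pool below are illustrative; the actual synthesis works symbolically on BDDs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def group_of(transition, unreadable, domains):
    """Group computation: a process that cannot read some variables
    must add, with each chosen recovery transition, every variant of
    it over the unreadable variables (which it also cannot change).
    States are dicts mapping variable name -> value."""
    src, dst = transition
    group = []
    for values in product(*(domains[v] for v in unreadable)):
        s, d = dict(src), dict(dst)
        for var, val in zip(unreadable, values):
            s[var] = d[var] = val
        group.append((s, d))
    return group

def parallel_groups(transitions, unreadable, domains, workers=4):
    """Groups of distinct transitions are independent, which is why
    parallelizing this step scales so well (processes or a partitioned
    BDD would replace threads in a real implementation)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda tr: group_of(tr, unreadable, domains), transitions))

# Toy example: variable 'x' is unreadable by the acting process.
domains = {"x": [0, 1], "y": [0, 1]}
deadlock_fix = ({"y": 0}, {"y": 1})      # recovery transition on y
print(parallel_groups([deadlock_fix], ["x"], domains))
# -> [[({'y': 0, 'x': 0}, {'y': 1, 'x': 0}),
#      ({'y': 0, 'x': 1}, {'y': 1, 'x': 1})]]
```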

    Algorithmic Based Fault Tolerance Applied to High Performance Computing

    We present a new approach to fault tolerance for High Performance Computing systems. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance technique (Huang and Abraham, 1984) to the needs of parallel distributed computation. We obtain a strongly scalable mechanism for fault tolerance that can also detect and correct errors (bit flips) on the fly during a computation. To assess the viability of our approach, we have developed a fault-tolerant matrix-matrix multiplication subroutine, and we propose models to predict its running time. Our parallel fault-tolerant matrix-matrix multiplication achieves 1.4 TFLOPS on 484 processors (cluster jacquard.nersc.gov) and returns a correct result even when a process failure occurs. This represents 65% of the machine's peak efficiency and less than 12% overhead with respect to the fastest failure-free implementation. We predict, and have observed, that as we increase the processor count, the overhead of the fault tolerance drops significantly.
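
    The underlying checksum idea fits in a few lines: multiply a column-checksum-augmented A by a row-checksum-augmented B, and a single corrupted entry of the product is located by the unique inconsistent row/column pair and repaired from its row checksum. This dense NumPy sketch ignores the blocked, distributed setting of the paper.

```python
import numpy as np

def abft_matmul(A, B, tol=1e-6):
    """Huang-Abraham checksum multiply: compute the full-checksum
    product, then locate and fix a single corrupted entry of C via
    the inconsistent row/column checksum pair."""
    n = A.shape[0]
    Ac = np.vstack([A, A.sum(axis=0)])                  # column checksums
    Br = np.hstack([B, B.sum(axis=1, keepdims=True)])   # row checksums
    Cf = Ac @ Br                                        # full-checksum C

    Cf[1, 2] += 42.0    # simulate a bit-flip corrupting one entry

    bad_rows = np.where(np.abs(Cf[:n, :n].sum(axis=1) - Cf[:n, n]) > tol)[0]
    bad_cols = np.where(np.abs(Cf[:n, :n].sum(axis=0) - Cf[n, :n]) > tol)[0]
    if bad_rows.size == 1 and bad_cols.size == 1:
        i, j = bad_rows[0], bad_cols[0]
        # Recompute the entry from its (uncorrupted) row checksum.
        Cf[i, j] = Cf[i, n] - (Cf[i, :n].sum() - Cf[i, j])
    return Cf[:n, :n]

A = np.arange(9.0).reshape(3, 3)
B = np.ones((3, 3))
print(np.allclose(abft_matmul(A, B), A @ B))   # True: error corrected
```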

    Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

    Improvements and advances in the development of computer architecture now provide innovative technology for recasting traditional sequential solutions into high-performance, low-cost, parallel systems to increase system performance. Research conducted on the development of a specialized computer architecture for the algorithmic execution of an avionics guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on critical path analysis. The final stage is the design and development of hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, task definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and experimental results are presented. The design of the data-driven computer architecture, customized for the execution of this particular algorithm, is discussed.
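
    A common concrete form of critical-path-based allocation is list scheduling by bottom level: rank each task by its longest path to a sink, then repeatedly place the most critical ready task on the earliest-free processing element. The sketch below, including the guidance-loop task names, is illustrative rather than the paper's allocation algorithm.

```python
def critical_path_allocate(tasks, deps, num_pes):
    """List scheduling by critical path: rank each task by its longest
    path to a sink (bottom level), then assign the most critical ready
    task to the earliest-free processing element.

    tasks: dict name -> execution cost; deps: list of (pred, succ)."""
    succs = {t: [] for t in tasks}
    preds = {t: [] for t in tasks}
    for a, b in deps:
        succs[a].append(b)
        preds[b].append(a)

    rank = {}
    def bottom_level(t):        # memoized longest path to an exit task
        if t not in rank:
            rank[t] = tasks[t] + max(
                (bottom_level(s) for s in succs[t]), default=0)
        return rank[t]
    for t in tasks:
        bottom_level(t)

    pe_free = [0.0] * num_pes   # time at which each PE becomes free
    finish = {}
    ready = [t for t in tasks if not preds[t]]
    schedule = []
    while ready:
        t = max(ready, key=rank.get)          # most critical first
        ready.remove(t)
        pe = pe_free.index(min(pe_free))      # earliest-free PE
        start = max([pe_free[pe]] + [finish[p] for p in preds[t]])
        finish[t] = start + tasks[t]
        pe_free[pe] = finish[t]
        schedule.append((t, pe, start, finish[t]))
        for s in succs[t]:                    # newly released tasks
            if all(p in finish for p in preds[s]):
                ready.append(s)
    return schedule

# Hypothetical guidance-loop task graph.
tasks = {"sense": 2, "estimate": 4, "guide": 3, "actuate": 1}
deps = [("sense", "estimate"), ("estimate", "guide"),
        ("guide", "actuate")]
for row in critical_path_allocate(tasks, deps, num_pes=2):
    print(row)   # (task, PE, start, finish)
```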