Quantifying the Resiliency of Fail-Operational Real-Time Networked Control Systems

Gujarati, Arpan; Nasri, Mitra

Authors: Arpan Gujarati, Mitra Nasri
Publication date: 1 January 2018
Publisher: LIPIcs - Leibniz International Proceedings in Informatics. 30th Euromicro Conference on Real-Time Systems (ECRTS 2018)

Abstract

In time-sensitive, safety-critical systems that must be fail-operational, active replication is commonly used to mitigate transient faults that arise due to electromagnetic interference (EMI). However, designing an effective and well-performing active replication scheme is challenging since replication conflicts with the size, weight, power, and cost constraints of embedded applications. To enable a systematic and rigorous exploration of the resulting tradeoffs, we present an analysis to quantify the resiliency of fail-operational networked control systems against EMI-induced memory corruption, host crashes, and retransmission delays. Since control systems are typically robust to a few failed iterations, e.g., one missed actuation does not crash an inverted pendulum, traditional solutions based on hard real-time assumptions are often too pessimistic. Our analysis reduces this pessimism by modeling a control system's inherent robustness as an (m,k)-firm specification. A case study with an active suspension workload indicates that the analytical bounds closely predict the failure rate estimates obtained through simulation, thereby enabling a meaningful design-space exploration, and also demonstrates the utility of the analysis in identifying non-trivial and non-obvious reliability tradeoffs.
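
As an illustration of the (m,k)-firm idea mentioned in the abstract, the sketch below checks a recorded sequence of control-loop outcomes against the usual textbook reading of the constraint: at least m successful iterations in every window of k consecutive iterations. This is only a minimal sketch under that assumption; the function name `satisfies_mk_firm` and the boolean outcome encoding are hypothetical and are not taken from the paper, whose exact formulation and probabilistic analysis may differ.

```python
from collections import deque


def satisfies_mk_firm(outcomes, m, k):
    """Return True if every window of k consecutive outcomes has >= m successes.

    `outcomes` is an iterable of truthy (successful iteration) or falsy
    (failed iteration) values. Sequences shorter than k contain no complete
    window, so they satisfy the constraint vacuously.
    """
    if k <= 0 or not 0 <= m <= k:
        raise ValueError("require k > 0 and 0 <= m <= k")

    window = deque(maxlen=k)  # holds the most recent k outcomes
    successes = 0             # number of successes currently in the window

    for outcome in outcomes:
        ok = bool(outcome)
        if len(window) == k:
            # The oldest outcome is about to be evicted by append().
            successes -= window[0]
        window.append(ok)
        successes += ok
        if len(window) == k and successes < m:
            return False
    return True
```

For example, under a (2,3)-firm specification a run that misses every third actuation is still acceptable, while two consecutive misses are not: `satisfies_mk_firm([1, 1, 0, 1, 1, 0], 2, 3)` holds, whereas `satisfies_mk_firm([1, 0, 0, 1], 2, 3)` does not.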

Available Versions

Dagstuhl Research Online Publication Server

oai:drops-oai.dagstuhl.de:8988

Last time updated on 09/07/2018