4,544 research outputs found
Rapid Recovery for Systems with Scarce Faults
Our goal is to achieve a high degree of fault tolerance through the control
of a safety critical systems. This reduces to solving a game between a
malicious environment that injects failures and a controller who tries to
establish a correct behavior. We suggest a new control objective for such
systems that offers a better balance between complexity and precision: we seek
systems that are k-resilient. In order to be k-resilient, a system needs to be
able to rapidly recover from a small number, up to k, of local faults
infinitely many times, provided that blocks of up to k faults are separated by
short recovery periods in which no fault occurs. k-resilience is a simple but
powerful abstraction from the precise distribution of local faults, but much
more refined than the traditional objective to maximize the number of local
faults. We argue why we believe this to be the right level of abstraction for
safety critical systems when local faults are few and far between. We show that
the computational complexity of constructing optimal control with respect to
resilience is low and demonstrate the feasibility through an implementation and
experimental results.Comment: In Proceedings GandALF 2012, arXiv:1210.202
Redundancy management for efficient fault recovery in NASA's distributed computing system
The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance
Clustering Algorithms for Scale-free Networks and Applications to Cloud Resource Management
In this paper we introduce algorithms for the construction of scale-free
networks and for clustering around the nerve centers, nodes with a high
connectivity in a scale-free networks. We argue that such overlay networks
could support self-organization in a complex system like a cloud computing
infrastructure and allow the implementation of optimal resource management
policies.Comment: 14 pages, 8 Figurs, Journa
Generating Property-Directed Potential Invariants By Backward Analysis
This paper addresses the issue of lemma generation in a k-induction-based
formal analysis of transition systems, in the linear real/integer arithmetic
fragment. A backward analysis, powered by quantifier elimination, is used to
output preimages of the negation of the proof objective, viewed as unauthorized
states, or gray states. Two heuristics are proposed to take advantage of this
source of information. First, a thorough exploration of the possible
partitionings of the gray state space discovers new relations between state
variables, representing potential invariants. Second, an inexact exploration
regroups and over-approximates disjoint areas of the gray state space, also to
discover new relations between state variables. k-induction is used to isolate
the invariants and check if they strengthen the proof objective. These
heuristics can be used on the first preimage of the backward exploration, and
each time a new one is output, refining the information on the gray states. In
our context of critical avionics embedded systems, we show that our approach is
able to outperform other academic or commercial tools on examples of interest
in our application field. The method is introduced and motivated through two
main examples, one of which was provided by Rockwell Collins, in a
collaborative formal verification framework.Comment: In Proceedings FTSCS 2012, arXiv:1212.657
- …