20,566 research outputs found

    Synthesis of fault-tolerant distributed systems

    Get PDF
    A distributed system is fault-tolerant if it continues to perform correctly even when a subset of the processes becomes faulty. Fault-tolerance is highly desirable but often difficult to implement. In this paper, we investigate fault-tolerant synthesis, i.e., the problem of determining whether a given temporal specification can be implemented as a fault-tolerant distributed system. As in standard distributed synthesis, we assume that the specification of the correct behaviors is given as a temporal formula over the externally visible variables. Additionally, we introduce the fault-tolerance specification, a CTL* formula describing the effects and the duration of faults. If, at some point in time, a process becomes faulty, it becomes part of the external environment and its further behavior is only restricted by the fault-tolerance specification. This allows us to model a large variety of fault types. Our method accounts for the effect of faults on the values communicated by the processes, and, hence, on the information available to the non-faulty processes. We prove that for fully connected system architectures, i.e., for systems where each pair of processes is connected by a communication link, the fault-tolerant synthesis problem from CTL* specifications is 2EXPTIME-complete.
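    For illustration only (the atomic proposition fault_i and the exact formula are assumptions of this sketch, not taken from the abstract), a fault-tolerance specification in this style could require that faults at process i are transient, i.e., that on every path a faulty process eventually ceases to be faulty:

        \varphi_{\mathit{ft}} \;=\; \mathbf{AG}\,\bigl(\mathit{fault}_i \;\rightarrow\; \mathbf{AF}\,\neg\mathit{fault}_i\bigr)

    Omitting the recovery obligation \mathbf{AF}\,\neg\mathit{fault}_i would instead permit permanent faults; while fault_i holds, the process's behavior is constrained only by the fault-tolerance specification, exactly as described above.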

    Automated Synthesis of Timed and Distributed Fault-Tolerant Systems

    Get PDF
    This dissertation concentrates on the problem of automated synthesis and repair of fault-tolerant systems. In particular, given the required specification of the system, our goal is to synthesize a fault-tolerant system or to repair an existing one. We study this problem for two classes of systems: timed and distributed.

    In the context of timed systems, we focus on efficient synthesis of fault-tolerant timed models from their fault-intolerant versions. Although the complexity of the synthesis problem is known to be polynomial in the size of the time-abstract bisimulation of the input model, the state of the art lacked synthesis algorithms that can be implemented efficiently. This is in part because synthesis is a challenging problem in general, and its complexity is significantly magnified in the context of timed systems. We propose an algorithm that takes a timed automaton, a set of fault actions, and a set of safety and bounded-time response properties as input, and utilizes a space-efficient symbolic representation of the timed automaton (the zone graph) to synthesize a fault-tolerant timed automaton as output. The output automaton satisfies strict phased recovery: it is guaranteed to behave like the input model in the absence of faults, and, in the presence of faults, recovery is achieved in two phases, each satisfying certain safety and timing constraints.

    In the context of distributed systems, we study the problem of synthesizing fault-tolerant systems from their intolerant versions when the number of processes is unknown. To synthesize a distributed fault-tolerant protocol that works for systems with any number of processes, we use counter abstraction, which lets us perform the synthesis on a finite-state abstract model. Applying our proposed algorithm, we successfully synthesized a fault-tolerant distributed agreement protocol in the presence of Byzantine faults. Although the synthesis problem is known to be NP-complete in the state space of the input protocol (due to partial observability of processes) in the non-parameterized setting, our parameterized algorithm manages to synthesize a solution for a problem as complex as Byzantine agreement in less than two minutes.

    A system may reach a bad state due to wrong initialization or the occurrence of a fault. One well-known class of distributed fault-tolerant systems is that of self-stabilizing systems: systems that converge to their legitimate states starting from any state and, if no fault occurs, stay in legitimate states thereafter. We propose an automated, sound, and complete method to synthesize self-stabilizing systems starting from the desired topology and type of the system. Our method is based on SMT solving, where the desired specification of the system is formulated as SMT constraints. We used the Alloy solver to implement our method and successfully synthesized some well-known self-stabilizing algorithms. We extend our method to support a type of stabilizing algorithm called ideal stabilization, as well as the case where the set of legitimate states is not explicitly known.

    Quantitative metrics such as recovery time are crucial when self-stabilizing systems are used in practice (for instance, in networking applications); one such metric is the average recovery time. Our automated method for synthesizing self-stabilizing systems generates a solution that respects the desired system specification, but it does not take any quantitative metrics into account. We therefore study the problem of repairing self-stabilizing systems (where only removal of transitions is allowed) to satisfy quantitative limitations. The metric under study is the average recovery time, which characterizes the performance of stabilizing programs. We show that the repair problem is NP-complete in the state space of the given system.
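    The dissertation formulates the desired system as SMT constraints and solves them with Alloy; the snippet below is only a minimal sketch of that idea, recast with the Z3 Python API over a hypothetical explicit four-state space (the state space, the legitimate set, and the convergence bound are placeholders, not taken from the dissertation). The solver is asked for a deterministic successor function that is closed on the legitimate states and converges to them within a bounded number of steps:

        # A minimal sketch, not the dissertation's Alloy encoding: self-stabilization
        # synthesis cast as constraint solving with the Z3 SMT solver over a tiny,
        # hypothetical explicit state space.
        from z3 import Solver, IntVal, Function, IntSort, And, Or, sat

        N_STATES = 4          # hypothetical explicit global state space 0..N_STATES-1
        LEGIT = {0, 1}        # hypothetical set of legitimate states
        BOUND = 3             # convergence must happen within BOUND steps

        # succ(s) is the deterministic program transition the solver must choose.
        succ = Function('succ', IntSort(), IntSort())
        solver = Solver()

        # Every chosen successor must be an actual state.
        for st in range(N_STATES):
            solver.add(And(succ(IntVal(st)) >= 0, succ(IntVal(st)) < N_STATES))

        # Closure: legitimate states only move to legitimate states.
        for st in LEGIT:
            solver.add(Or([succ(IntVal(st)) == IntVal(l) for l in LEGIT]))

        # Convergence: from every state, iterating succ at most BOUND times
        # visits a legitimate state.
        for st in range(N_STATES):
            cur = IntVal(st)
            hits_legit = [Or([cur == IntVal(l) for l in LEGIT])]
            for _ in range(BOUND):
                cur = succ(cur)
                hits_legit.append(Or([cur == IntVal(l) for l in LEGIT]))
            solver.add(Or(hits_legit))

        if solver.check() == sat:
            model = solver.model()
            print({st: model.evaluate(succ(IntVal(st))) for st in range(N_STATES)})

    A faithful encoding would range over process-local guarded commands and the given topology rather than an explicit global state space; the sketch only shows the shape of the closure and convergence constraints.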

    A formal approach for the synthesis and implementation of fault-tolerant industrial embedded systems

    No full text
    We demonstrate the feasibility of a complete workflow to synthesize and implement correct-by-construction fault-tolerant distributed embedded systems consisting of real-time periodic tasks. Correctness by construction is provided by discrete controller synthesis (DCS), a formal method with which we can guarantee that the synthesized controlled system preserves the functionality of its tasks even in the presence of processor failures. For this step, our workflow uses the Heptagon domain-specific language and the Sigali DCS tool. Correctly implementing the resulting distributed system is a challenge, all the more so since the controller itself must be tolerant to processor failures. We achieve this step with the libDGALS real-time library, used (1) to generate the glue code that migrates tasks upon processor failures while preserving their internal state, and (2) to make the synthesized controller itself fault-tolerant.
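    A deliberately generic sketch of the migration pattern described above follows; all names are hypothetical, and this is neither the libDGALS API nor code generated by Heptagon/Sigali. It only illustrates the idea that, on a processor failure, a controller re-places the affected tasks and their internal state moves with them:

        class Task:
            def __init__(self, name):
                self.name = name
                self.state = {'counter': 0}   # internal state preserved across migration

            def step(self):
                self.state['counter'] += 1

        class Controller:
            """Decides, after each failure, on which processor every task runs."""
            def __init__(self, processors):
                self.alive = set(processors)
                self.placement = {}           # task name -> processor

            def place(self, task):
                # pick the least-loaded surviving processor (deterministic tie-break)
                load = {p: 0 for p in self.alive}
                for p in self.placement.values():
                    if p in load:
                        load[p] += 1
                self.placement[task.name] = min(sorted(load), key=load.get)

            def on_failure(self, proc, tasks):
                self.alive.discard(proc)
                for t in tasks:
                    if self.placement[t.name] == proc:
                        self.place(t)         # "glue code": migrate, state travels with the task
                        print(f'{t.name} migrated to {self.placement[t.name]} '
                              f'with state {t.state}')

        tasks = [Task('t1'), Task('t2')]
        ctrl = Controller(['p1', 'p2', 'p3'])
        for t in tasks:
            ctrl.place(t)
        tasks[0].step()                       # t1 accumulates some internal state
        ctrl.on_failure('p1', tasks)          # tasks placed on p1 migrate, state intact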

    Multicriteria optimal reconfiguration of fault-tolerant real-time tasks

    Get PDF
    We propose a technique for discrete controller synthesis, with optimal synthesis on bounded paths, in order to model, design, and optimize fault-tolerant distributed systems, taking into account several criteria (e.g., the execution costs of the tasks and their quality of service). Different combinations of these criteria are explored for multi-criteria optimization.
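    As an illustration only (the toy transition system, the weights, and the weighted-sum aggregation of the two criteria are invented for this sketch; this is not the Heptagon/Sigali encoding of the paper), the kernel of optimal choice over bounded paths can be sketched as follows: from each state, the controller picks the controllable action minimizing an aggregate of execution cost and quality-of-service penalty over a bounded horizon.

        # Toy system and weights are hypothetical; not the paper's Sigali encoding.
        # hypothetical controllable transitions:
        # state -> [(action, next_state, execution_cost, qos_penalty)]
        TRANS = {
            'init': [('run_on_p1', 'p1', 2, 0), ('run_on_p2', 'p2', 1, 3)],
            'p1':   [('finish', 'done', 1, 0)],
            'p2':   [('finish', 'done', 1, 1)],
            'done': [],
        }
        W_COST, W_QOS = 1.0, 2.0          # hypothetical weights for the two criteria

        def best(state, horizon):
            """Return (aggregate cost, action list) of the best bounded path."""
            if horizon == 0 or not TRANS[state]:
                return 0.0, []
            options = []
            for action, nxt, cost, qos in TRANS[state]:
                sub_cost, sub_plan = best(nxt, horizon - 1)
                options.append((W_COST * cost + W_QOS * qos + sub_cost,
                                [action] + sub_plan))
            return min(options, key=lambda o: o[0])

        print(best('init', horizon=3))    # -> (3.0, ['run_on_p1', 'finish'])

    In a DCS setting, such choices would be computed symbolically over the controllable transitions of the synthesized controller rather than by explicit recursion.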

    Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs

    Full text link
    Previous work has shown that there are two major complexity barriers in the synthesis of fault-tolerant distributed programs: (1) generation of the fault-span, the set of states reachable in the presence of faults, and (2) resolving deadlock states, from which the program has no outgoing transitions. Of these, the former closely resembles model checking and, hence, techniques for efficient verification are directly applicable to it. We therefore focus on expediting the latter with multi-core technology. We present two approaches for parallelization based on different design choices. The first approach is based on the computation of equivalence classes of program transitions (called group computation) that are needed due to the issue of distribution (i.e., the inability of processes to atomically read and write all program variables). We show that in most cases the speedup of this approach is close to the ideal speedup, and in some cases it is superlinear. The second approach uses the traditional technique of partitioning deadlock states among multiple threads. However, our experiments show that the speedup of this approach is small. Consequently, our analysis demonstrates that the simple approach of parallelizing the group computation is likely to be the more effective way of using multi-core computing for deadlock resolution.
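    A minimal sketch of the group computation underlying the first approach follows (the toy variables, the read restriction, and the explicit-state encoding are assumptions; the paper's implementation is symbolic). Because the process adding a recovery transition cannot read every variable, one candidate transition expands into the whole class of transitions that process cannot distinguish, and groups for different candidates are independent, so they parallelize naturally:

        from concurrent.futures import ProcessPoolExecutor
        from itertools import product

        VARS = ('x', 'y', 'z')        # hypothetical program variables, each in {0, 1}
        READABLE = ('x', 'y')         # the recovering process can read only x and y

        def group_of(transition):
            """All transitions indistinguishable from `transition` on READABLE vars."""
            src, dst = transition
            unreadable = [v for v in VARS if v not in READABLE]
            group = []
            for bits in product((0, 1), repeat=len(unreadable)):
                hidden = dict(zip(unreadable, bits))
                # unreadable variables take every possible value and stay unchanged
                g_src = {**src, **hidden}
                g_dst = {**dst, **hidden}
                group.append((tuple(sorted(g_src.items())),
                              tuple(sorted(g_dst.items()))))
            return group

        if __name__ == '__main__':
            # hypothetical candidate recovery transitions out of deadlock states
            candidates = [
                ({'x': 0, 'y': 0, 'z': 0}, {'x': 1, 'y': 0, 'z': 0}),
                ({'x': 1, 'y': 1, 'z': 1}, {'x': 1, 'y': 0, 'z': 1}),
            ]
            # one worker per candidate: group computations are independent
            with ProcessPoolExecutor() as pool:
                for grp in pool.map(group_of, candidates):
                    print(grp)

    The second approach in the paper would instead partition the deadlock states themselves among threads and let each worker search for recovery transitions for its share.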