7,952 research outputs found
A Short Counterexample Property for Safety and Liveness Verification of Fault-tolerant Distributed Algorithms
Distributed algorithms have many mission-critical applications ranging from
embedded systems and replicated databases to cloud computing. Due to
asynchronous communication, process faults, or network failures, these
algorithms are difficult to design and verify. Many algorithms achieve fault
tolerance by using threshold guards that, for instance, ensure that a process
waits until it has received an acknowledgment from a majority of its peers.
Consequently, domain-specific languages for fault-tolerant distributed systems
offer language support for threshold guards.
We introduce an automated method for model checking of safety and liveness of
threshold-guarded distributed algorithms in systems where the number of
processes and the fraction of faulty processes are parameters. Our method is
based on a short counterexample property: if a distributed algorithm violates a
temporal specification (in a fragment of LTL), then there is a counterexample
whose length is bounded and independent of the parameters. We prove this
property by (i) characterizing executions depending on the structure of the
temporal formula, and (ii) using commutativity of transitions to accelerate and
shorten executions. We extended the ByMC toolset (Byzantine Model Checker) with
our technique, and verified liveness and safety of 10 prominent fault-tolerant
distributed algorithms, most of which were out of reach for existing
techniques.Comment: 16 pages, 11 pages appendi
Pinwheel Scheduling for Fault-tolerant Broadcast Disks in Real-time Database Systems
The design of programs for broadcast disks which incorporate real-time and fault-tolerance requirements is considered. A generalized model for real-time fault-tolerant broadcast disks is defined. It is shown that designing programs for broadcast disks specified in this model is closely related to the scheduling of pinwheel task systems. Some new results in pinwheel scheduling theory are derived, which facilitate the efficient generation of real-time fault-tolerant broadcast disk programs.National Science Foundation (CCR-9308344, CCR-9596282
A bibliography on formal methods for system specification, design and validation
Literature on the specification, design, verification, testing, and evaluation of avionics systems was surveyed, providing 655 citations. Journal papers, conference papers, and technical reports are included. Manual and computer-based methods were employed. Keywords used in the online search are listed
Recommended from our members
Enhancing Fault / Intrusion Tolerance through Design and Configuration Diversity
Fault/intrusion tolerance is usually the only viable way of improving the system dependability and security in the presence of continuously evolving threats. Many of the solutions in the literature concern a specific snapshot in the production or deployment of a fault-tolerant system and no immediate considerations are made about how the system should evolve to deal with novel threats. In this paper we outline and evaluate a set of operating systems’ and applications’ reconfiguration rules which can be used to modify the state of a system replica prior to deployment or in between recoveries, and hence increase the replicas chance of a longer intrusion-free operation
Assessment team report on flight-critical systems research at NASA Langley Research Center
The quality, coverage, and distribution of effort of the flight-critical systems research program at NASA Langley Research Center was assessed. Within the scope of the Assessment Team's review, the research program was found to be very sound. All tasks under the current research program were at least partially addressing the industry needs. General recommendations made were to expand the program resources to provide additional coverage of high priority industry needs, including operations and maintenance, and to focus the program on an actual hardware and software system that is under development
What is a quantum computer, and how do we build one?
The DiVincenzo criteria for implementing a quantum computer have been seminal
in focussing both experimental and theoretical research in quantum information
processing. These criteria were formulated specifically for the circuit model
of quantum computing. However, several new models for quantum computing
(paradigms) have been proposed that do not seem to fit the criteria well. The
question is therefore what are the general criteria for implementing quantum
computers. To this end, a formal operational definition of a quantum computer
is introduced. It is then shown that according to this definition a device is a
quantum computer if it obeys the following four criteria: Any quantum computer
must (1) have a quantum memory; (2) facilitate a controlled quantum evolution
of the quantum memory; (3) include a method for cooling the quantum memory; and
(4) provide a readout mechanism for subsets of the quantum memory. The criteria
are met when the device is scalable and operates fault-tolerantly. We discuss
various existing quantum computing paradigms, and how they fit within this
framework. Finally, we lay out a roadmap for selecting an avenue towards
building a quantum computer. This is summarized in a decision tree intended to
help experimentalists determine the most natural paradigm given a particular
physical implementation
- …