290 research outputs found

    Formal verification of a fault tolerant clock synchronization algorithm

    Get PDF
    A formal specification and mechanically assisted verification of the interactive convergence clock synchronization algorithm of Lamport and Melliar-Smith is described. Several technical flaws in the analysis given by Lamport and Melliar-Smith were discovered, even though their presentation is unusally precise and detailed. It seems that these flaws were not detected by informal peer scrutiny. The flaws are discussed and a revised presentation of the analysis is given that not only corrects the flaws but is also more precise and easier to follow. Some of the corrections to the flaws require slight modifications to the original assumptions underlying the algorithm and to the constraints on its parameters, and thus change the external specifications of the algorithm. The formal analysis of the interactive convergence clock synchronization algorithm was performed using the Enhanced Hierarchical Development Methodology (EHDM) formal specification and verification environment. This application of EHDM provides a demonstration of some of the capabilities of the system

    Structuring fault-tolerant object-oriented systems using inheritance and delegation

    Get PDF
    PhD ThesisMany entities in the real world that a software system has to interact with, e.g., for controlling or monitoring purposes, exhibit different behaviour phases in their lifetime, in particular depending on whether or not they are functioning correctly. That is, these entities exhibit not only a normal behaviour phase but also one or more abnormal behaviour phases associated with the various faults which occur in the environment. These faults are referred to as environmental faults. In the object-oriented software, real-world entities are modeled as objects. In a classbased object-oriented language, such as C++, all objects of a given class must follow the same external behaviour, i.e., they have the same interface and associated implementation. However this requires that each object permanently belong to a particular class, imposing constraints on the mutability of the behaviour for an individual object. This thesis proposes solutions to the problem of finding means whereby objects representing real-world entities which exhibit various behaviour phases can make corresponding changes in their own behaviour in a clear and explicit way, rather than through status-checking code which is normally embedded in the implementation of various methods. Our proposed solution is (i) to define a hierarchy of different subclasses related to an object which corresponds to an external entity, each subclass implementing a different behaviour phase that the external entity can exhibit, and (ii) to arrange that each object forward the execution of its operations to the currently appropriate instance of this hierarchy of subclasses. We thus propose an object-oriented approach for the provision of environmental fault tolerance, which encapsulates the abnormal behaviour of "faulty" entities as objects (instances of the above mentioned subclasses). These abnormal behaviour variants are defined statically, and runtime access to them is implemented through a delegation mechanism which depends on the current phase of behaviour. Thus specific reconfiguration changes at the level of objects can be easily incorporated to a software system for tolerating environmental faults

    Version Control Is for Your Data Too

    Get PDF
    Programmers regularly use distributed version control systems (DVCS) such as Git to facilitate collaborative software development. The primary purpose of a DVCS is to maintain integrity of source code in the presence of concurrent, possibly conflicting edits from collaborators. In addition to safely merging concurrent non-conflicting edits, a DVCS extensively tracks source code provenance to help programmers contextualize and resolve conflicts. Provenance also facilitates debugging by letting programmers see diffs between versions and quickly find those edits that introduced the offending conflict (e.g., via git blame). In this paper, we posit that analogous workflows to collaborative software development also arise in distributed software execution; we argue that the characteristics that make a DVCS an ideal fit for the former also make it an ideal fit for the latter. Building on this observation, we propose a distributed programming model, called carmot that views distributed shared state as an entity evolving in time, manifested as a sequence of persistent versions, and relies on an explicitly defined merge semantics to reconcile concurrent conflicting versions. We show examples demonstrating how carmot simplifies distributed programming, while also enabling novel workflows integral to modern applications such as blockchains. We also describe a prototype implementation of carmot that we use to evaluate its practicality

    Experimental analysis of computer system dependability

    Get PDF
    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance

    Design, Implementation, and Verification of the Reliable Multicast Protocol

    Get PDF
    This document describes the Reliable Multicast Protocol (RMP) design, first implementation, and formal verification. RMP provides a totally ordered, reliable, atomic multicast service on top of an unreliable multicast datagram service. RMP is fully and symmetrically distributed so that no site bears an undue portion of the communications load. RMP provides a wide range of guarantees, from unreliable delivery to totally ordered delivery, to K-resilient, majority resilient, and totally resilient atomic delivery. These guarantees are selectable on a per message basis. RMP provides many communication options, including virtual synchrony, a publisher/subscriber model of message delivery, a client/server model of delivery, mutually exclusive handlers for messages, and mutually exclusive locks. It has been commonly believed that total ordering of messages can only be achieved at great performance expense. RMP discounts this. The first implementation of RMP has been shown to provide high throughput performance on Local Area Networks (LAN). For two or more destinations a single LAN, RMP provides higher throughput than any other protocol that does not use multicast or broadcast technology. The design, implementation, and verification activities of RMP have occurred concurrently. This has allowed the verification to maintain a high fidelity between design model, implementation model, and the verification model. The restrictions of implementation have influenced the design earlier than in normal sequential approaches. The protocol as a whole has matured smoother by the inclusion of several different perspectives into the product development

    Virtual Runtime Application Partitions for Resource Management in Massively Parallel Architectures

    Get PDF
    This thesis presents a novel design paradigm, called Virtual Runtime Application Partitions (VRAP), to judiciously utilize the on-chip resources. As the dark silicon era approaches, where the power considerations will allow only a fraction chip to be powered on, judicious resource management will become a key consideration in future designs. Most of the works on resource management treat only the physical components (i.e. computation, communication, and memory blocks) as resources and manipulate the component to application mapping to optimize various parameters (e.g. energy efficiency). To further enhance the optimization potential, in addition to the physical resources we propose to manipulate abstract resources (i.e. voltage/frequency operating point, the fault-tolerance strength, the degree of parallelism, and the configuration architecture). The proposed framework (i.e. VRAP) encapsulates methods, algorithms, and hardware blocks to provide each application with the abstract resources tailored to its needs. To test the efficacy of this concept, we have developed three distinct self adaptive environments: (i) Private Operating Environment (POE), (ii) Private Reliability Environment (PRE), and (iii) Private Configuration Environment (PCE) that collectively ensure that each application meets its deadlines using minimal platform resources. In this work several novel architectural enhancements, algorithms and policies are presented to realize the virtual runtime application partitions efficiently. Considering the future design trends, we have chosen Coarse Grained Reconfigurable Architectures (CGRAs) and Network on Chips (NoCs) to test the feasibility of our approach. Specifically, we have chosen Dynamically Reconfigurable Resource Array (DRRA) and McNoC as the representative CGRA and NoC platforms. The proposed techniques are compared and evaluated using a variety of quantitative experiments. Synthesis and simulation results demonstrate VRAP significantly enhances the energy and power efficiency compared to state of the art.Siirretty Doriast

    Reversible Computation: Extending Horizons of Computing

    Get PDF
    This open access State-of-the-Art Survey presents the main recent scientific outcomes in the area of reversible computation, focusing on those that have emerged during COST Action IC1405 "Reversible Computation - Extending Horizons of Computing", a European research network that operated from May 2015 to April 2019. Reversible computation is a new paradigm that extends the traditional forwards-only mode of computation with the ability to execute in reverse, so that computation can run backwards as easily and naturally as forwards. It aims to deliver novel computing devices and software, and to enhance existing systems by equipping them with reversibility. There are many potential applications of reversible computation, including languages and software tools for reliable and recovery-oriented distributed systems and revolutionary reversible logic gates and circuits, but they can only be realized and have lasting effect if conceptual and firm theoretical foundations are established first
    • …
    corecore