5 research outputs found

    Group Communication in Amoeba and its Applications

    Unlike many other operating systems, Amoeba is a distributed operating system that provides group communication (i.e., one-to-many communication). We wil

    Fault tolerant software technology for distributed computing system

    Issued as Monthly reports [nos. 1-23], Interim technical report, Technical guide books [nos. 1-2], and Final report, Project no. G-36-64

    Determining the Last Process to Fail

    A total failure occurs whenever all processes cooperatively executing a distributed task fail before the task's completion. A frequent prerequisite for recovery from a total failure is the identification of the last group (LAST) of processes concurrently failing. Herein, we derive necessary and sufficient conditions for computing LAST from the local failure data of recovered processes. These conditions are easily translated into decision procedures for LAST membership using either complete or incomplete failure data. The choice of failure data itself is dictated by two requirements: (1) it can be cheaply maintained, and (2) maximum fault-tolerance is afforded in the sense that the expected number of recoveries required for identifying LAST is minimized.

    Determining the last process to fail
