Sender-based message logging is a new low-overhead mechanism for providing transparent fault-tolerance in distributed systems. It differs from conventional message logging mechanisms in that each message is logged in volatile memory on the machine from which the message is sent. Keeping the message log in the sender's local memory allows us to recover from a single failure at a time without the expense of synchronously logging each message to stable storage. The message log is then asynchronously written to stable storage, without delaying the computation, as part of the sender's periodic checkpoint. Maintaining the sender-based message log requires at most one extra network packet over non-fault-tolerant reliable message communication and imposes little additional synchronization delay. It can be applied transparently to existing distributed applications and does not require specialized hardware. It is currently being implemented on a network of SUN workstations
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.