    Supporting Customized Failure Models for Distributed Software

    The cost of employing software fault-tolerance techniques in distributed systems is strongly related to the type of failures to be tolerated. For example, in terms of the amount of redundancy required and execution time, tolerating a processor crash is much cheaper than tolerating arbitrary (or Byzantine) failures. The tradeoff, of course, is that making stronger assumptions about failures lessens the degree of fault coverage provided by the system. This paper describes an approach to constructing configurable services for distributed systems that allows easy customization of the type of failures to tolerate. For example, using our approach, it is possible to configure custom services across a spectrum of possibilities, from a very efficient but unreliable server group that does not tolerate any failures, to a less efficient but reliable group that tolerates crash, omission, timing, or arbitrary failures. The approach is based on building configurable services as collections o..