research

Recovery within long running transactions

Abstract

As computer systems continue to grow in complexity, the possibilities of failure increase. At the same time, the increase in computer system pervasiveness in day-to-day activities brought along increased expectations on their reliability. This has led to the need for effective and automatic error recovery techniques to resolve failures. Transactions enable the handling of failure propagation over concurrent systems due to dependencies, restoring the system to the point before the failure occurred. However, in various settings, especially when interacting with the real world, reversal is not possible. The notion of compensations has been long advocated as a way of addressing this issue, through the specification of activities which can be executed to undo partial transactions. Still, there is no accepted standard theory; the literature offers a plethora of distinct formalisms and approaches. In this survey, we review the compensations from a theoretical point of view by: (i) giving a historic account of the evolution of compensating transactions; (ii) delineating and describing a number of design options involved; (iii) presenting a number of formalisms found in the literature, exposing similarities and differences; (iv) comparing formal notions of compensation correctness; (v) giving insights regarding the application of compensations in practice; and (vi) discussing current and future research trends in the area.peer-reviewe

    Similar works