9 research outputs found

    Recovery within long running transactions

    As computer systems continue to grow in complexity, the possibilities of failure increase. At the same time, the increasing pervasiveness of computer systems in day-to-day activities has raised expectations of their reliability. This has led to the need for effective and automatic error recovery techniques to resolve failures. Transactions enable the handling of failure propagation over concurrent systems due to dependencies, restoring the system to the point before the failure occurred. However, in various settings, especially when interacting with the real world, reversal is not possible. The notion of compensations has long been advocated as a way of addressing this issue, through the specification of activities which can be executed to undo partial transactions. Still, there is no accepted standard theory; the literature offers a plethora of distinct formalisms and approaches. In this survey, we review compensations from a theoretical point of view by: (i) giving a historic account of the evolution of compensating transactions; (ii) delineating and describing a number of design options involved; (iii) presenting a number of formalisms found in the literature, exposing similarities and differences; (iv) comparing formal notions of compensation correctness; (v) giving insights regarding the application of compensations in practice; and (vi) discussing current and future research trends in the area.
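    The core idea the abstract describes (specifying activities that undo partial transactions) can be sketched as a compensation stack in the style of sagas. This is an illustrative sketch only; the names and structure are not taken from any formalism in the survey.

    ```python
    # Minimal sketch of a compensable long-running transaction (illustrative
    # names only). Each forward action registers a compensation; on failure,
    # registered compensations run in reverse order to logically undo the
    # partial transaction.

    class CompensableTransaction:
        def __init__(self):
            self._compensations = []  # stack: most recent action is undone first

        def run(self, action, compensation):
            result = action()
            self._compensations.append(compensation)
            return result

        def fail(self):
            # Reverse-order execution mirrors the usual saga-style semantics.
            while self._compensations:
                self._compensations.pop()()

    log = []
    tx = CompensableTransaction()
    tx.run(lambda: log.append("book flight"), lambda: log.append("cancel flight"))
    tx.run(lambda: log.append("book hotel"), lambda: log.append("cancel hotel"))
    tx.fail()  # the hotel booking is compensated before the flight booking
    ```

    Reverse order matters because later actions may depend on the effects of earlier ones, so their undo must come first.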

    Comprehensive Monitor-Oriented Compensation Programming

    Compensation programming is typically used in the programming of web service compositions whose correct implementation is crucial due to their handling of security-critical activities such as financial transactions. While traditional exception handling depends on the state of the system at the moment of failure, compensation programming is significantly more challenging and dynamic because it depends on the runtime execution flow: the history of the system's behaviour at the moment of failure affects how compensation is applied. To address this dynamic element, we propose the use of runtime monitors to facilitate compensation programming, with monitors enabling the modeller to reason implicitly in terms of the runtime control flow, thus separating the concerns of system building and compensation modelling. Our approach is instantiated into an architecture and shown to be applicable to a case study.
    Comment: In Proceedings FESCA 2014, arXiv:1404.043

    Programming compensations for system-monitor synchronisation

    In security-critical systems such as online establishments, runtime analysis is crucial to detect and handle any unexpected behaviour. Due to resource-intensive operations carried out by such systems, particularly during peak times, synchronous monitoring is not always an option. Asynchronous monitoring, on the other hand, would not compete for system resources but might detect anomalies only once the system has progressed further and it is already too late to apply a remedy. A conciliatory approach is to apply asynchronous monitoring but synchronise when there is a high risk of a problem arising. Although this does not solve the issue of problems arising while in asynchronous mode, compensations have been shown to be useful in restoring the system to a sane state when this occurs. In this paper we propose a novel notation, compensating automata, which enables the user to program the compensation logic within the monitor, extending our earlier results by allowing for richer compensation structures. This approach moves the compensation closer to the violation information while simultaneously relieving the system of the additional burden.
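    The conciliatory mode-switching the abstract describes can be sketched as a monitor that buffers events and only drains the buffer synchronously when a high-risk event arrives. All names and the risk predicate below are invented for illustration, not taken from the paper.

    ```python
    # Illustrative sketch: a monitor that normally consumes events
    # asynchronously from a buffer, but synchronises with the system when an
    # event is flagged as high-risk, so violations near risky actions are
    # caught before the system progresses further.

    from collections import deque

    class RiskAwareMonitor:
        def __init__(self, check, high_risk):
            self.check = check          # event -> bool (True means OK)
            self.high_risk = high_risk  # event -> bool
            self.buffer = deque()
            self.violations = []

        def emit(self, event):
            self.buffer.append(event)
            if self.high_risk(event):   # synchronise: drain the buffer now
                self.drain()

        def drain(self):
            # Asynchronous catching-up: check buffered events in order.
            while self.buffer:
                e = self.buffer.popleft()
                if not self.check(e):
                    self.violations.append(e)

    m = RiskAwareMonitor(check=lambda e: e != "overdraw",
                         high_risk=lambda e: e.startswith("withdraw"))
    m.emit("login")
    m.emit("overdraw")      # buffered, not yet checked
    m.emit("withdraw:500")  # high-risk: forces a synchronous drain
    ```

    In a real deployment `drain` would also run periodically in the background; here the buffered violation is only detected when the high-risk event forces synchronisation, which is exactly the "too late to apply a remedy" window that compensations are meant to cover.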

    Monitor-Oriented Compensation Programming Through Compensating Automata

    Compensations have been used for decades in areas such as flow management systems, long-lived transactions and, more recently, the service-oriented architecture. Since compensations enable the logical reversal of past actions, by their nature they crosscut other programming concerns. Thus, intertwining compensations with the rest of the system not only makes programs less well-structured, but also limits the expressivity of compensations due to the tight coupling with the system's behaviour.

    To separate compensation concerns from the normal system behaviour, we propose compensating automata, a graphical specification language dedicated to compensation programming. Compensating automata are subsequently employed in a monitor-oriented fashion to program compensations without cluttering the actual system implementation. This approach is shown to be applicable to a complex case study which existing compensation approaches have difficulty handling.
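    Although the paper's language is graphical, the underlying idea can be approximated as a transition table in which transitions carry compensations that are recorded along the executed path and replayed in reverse on failure. This rendering is hypothetical; states, events and compensations below are invented examples.

    ```python
    # Hypothetical rendering of a compensating automaton as a transition
    # table: each transition maps (state, event) to a next state plus an
    # optional compensation, recorded as the path is executed.

    class CompensatingAutomaton:
        def __init__(self, transitions, start):
            # transitions: (state, event) -> (next_state, compensation or None)
            self.transitions = transitions
            self.state = start
            self.recorded = []

        def step(self, event):
            self.state, comp = self.transitions[(self.state, event)]
            if comp is not None:
                self.recorded.append(comp)

        def compensate(self):
            # Replay compensations of the executed path in reverse order.
            plan = list(reversed(self.recorded))
            self.recorded.clear()
            return plan

    aut = CompensatingAutomaton(
        {("idle", "pay"): ("paid", "refund"),
         ("paid", "ship"): ("shipped", "recall"),
         ("shipped", "fail"): ("failed", None)},
        start="idle")
    for e in ("pay", "ship", "fail"):
        aut.step(e)
    plan = aut.compensate()  # recall the shipment, then refund the payment
    ```

    Keeping the compensation plan on the automaton, rather than in the system code, is the separation of concerns the abstract argues for: the system emits events, and the monitor alone decides what must be undone.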

    ContextWorkflow: A Monadic DSL for Compensable and Interruptible Executions

    Context-aware applications, whose behavior reactively depends on the time-varying status of the surrounding environment - such as network connection, battery level, and sensors - are becoming increasingly pervasive and important. The term "context-awareness" usually suggests prompt reactions to context changes: when a context change signals that the current execution cannot be continued, the application should immediately abort its execution, possibly perform some clean-up tasks, and suspend until the context allows it to restart. Interruptions, or asynchronous exceptions, are useful for achieving context-awareness. It is, however, difficult to program with interruptions in a compositional way in most programming languages because their support is too primitive, relying on synchronous exception-handling mechanisms such as try-catch. We propose a new domain-specific language, ContextWorkflow, for interruptible programs as a solution to the problem. A basic unit of an interruptible program is a workflow, i.e., a sequence of atomic computations accompanied by compensation actions. The uniqueness of ContextWorkflow is that, during its execution, a workflow keeps watching the context between atomic actions and decides whether the computation should be continued, aborted, or suspended. The contributions of this paper are as follows: (1) the design of a workflow-like language with asynchronous interruption, checkpointing, sub-workflows and suspension; (2) a formal semantics of the core language; (3) a monadic interpreter corresponding to the semantics; and (4) its concrete implementation as an embedded domain-specific language in Scala.
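    The distinguishing behaviour the abstract describes (consulting the context between atomic steps, and aborting by compensating completed steps) can be sketched in Python rather than the paper's Scala embedding. Function names and the string-based context protocol are assumptions made for the sketch.

    ```python
    # Sketch of the ContextWorkflow idea: a workflow is a sequence of
    # (action, compensation) pairs; the context is polled between atomic
    # steps, and an "abort" signal triggers the compensations of the steps
    # completed so far, in reverse order.

    def run_workflow(steps, context):
        done = []  # compensations of completed steps
        for action, compensation in steps:
            if context() == "abort":
                for comp in reversed(done):
                    comp()
                return "aborted"
            action()
            done.append(compensation)
        return "completed"

    log = []
    signals = iter(["continue", "continue", "abort"])
    status = run_workflow(
        [(lambda: log.append("a"), lambda: log.append("undo-a")),
         (lambda: log.append("b"), lambda: log.append("undo-b")),
         (lambda: log.append("c"), lambda: log.append("undo-c"))],
        context=lambda: next(signals))
    ```

    The sketch omits suspension, checkpointing and sub-workflows, and polls the context rather than reacting to asynchronous interrupts; the paper's monadic formulation is what lets those concerns compose.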

    A Survey of Challenges for Runtime Verification from Advanced Application Domains (Beyond Software)

    Runtime verification is an area of formal methods that studies the dynamic analysis of execution traces against formal specifications. Typically, the two main activities in runtime verification efforts are the process of creating monitors from specifications, and the algorithms for the evaluation of traces against the generated monitors. Other activities involve the instrumentation of the system to generate the trace and the communication between the system under analysis and the monitor. Most applications of runtime verification have focused on the dynamic analysis of software, even though there are many more potential applications to other computational devices and target systems. In this paper we present a collection of challenges for runtime verification extracted from concrete application domains, focusing on the difficulties that must be overcome to tackle these specific challenges. The computational models that characterize these domains call for new techniques beyond the current state of the art in runtime verification.
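    The two activities named above (deriving a monitor from a specification, then evaluating a trace against it) can be illustrated with a hand-coded monitor for a toy property. The property and event names are invented for illustration.

    ```python
    # Minimal illustration of trace evaluation: a monitor for the invented
    # property "every 'grant' must be preceded by a pending 'request'",
    # producing a verdict per event as the trace streams in.

    def make_monitor():
        pending = 0  # requests seen but not yet granted

        def step(event):
            nonlocal pending
            if event == "request":
                pending += 1
            elif event == "grant":
                if pending == 0:
                    return "violation"
                pending -= 1
            return "ok"

        return step

    monitor = make_monitor()
    verdicts = [monitor(e) for e in ["request", "grant", "grant"]]
    ```

    In practice the monitor would be synthesised automatically from a specification formalism (e.g. a temporal logic or automaton) rather than written by hand, and instrumentation would feed it the trace; this sketch covers only the evaluation side.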

    Recovery for sporadic operations on cloud applications

    Cloud-based systems change more frequently than traditional systems. These frequent changes involve sporadic operations such as installation and upgrade. Sporadic operations on cloud manipulate cloud resources, and they are prone to unpredictable and inevitable failures, largely due to cloud uncertainty. To recover from failures in sporadic operations on cloud, we need cloud operational recovery strategies. Existing operational recovery methods on cloud have several drawbacks, such as the poor generalizability of exception handling mechanisms and the coarse-grained recovery manner of rollback mechanisms. Hence, this thesis proposes a novel recovery approach, called POD-Recovery, for sporadic operations on cloud. One novelty of POD-Recovery is that it is based on eight cloud operational recovery requirements formulated by us (e.g. recovery time objective satisfaction and recovery generalizability). Another novelty of POD-Recovery is that it is non-intrusive and does not modify the code which implements the sporadic operation. POD-Recovery works as follows: it first treats a sporadic operation as a process which provides the workflow of the operation and the contextual information for each operational step. Then, it identifies the recovery points (where failure detection and recovery should be performed) inside the sporadic operation, determines the unified resource space (the resource types required and manipulated by the sporadic operation), and generates the expected resource state templates (the abstraction level of resource states) for all operational steps. For a given recovery point inside the sporadic operation, POD-Recovery first filters the applicable recovery patterns from the eight recovery patterns it supports and then automatically generates the recovery actions for the applicable patterns. Next, it evaluates the generated recovery actions based on the metrics of Recovery Time, Recovery Cost and Recovery Impact. This quantitative evaluation leads to the selection of an acceptable recovery action for execution at a given recovery point. We implement POD-Recovery and evaluate it by recovering from faults injected into five representative types of sporadic operations on cloud. The experimental results show that POD-Recovery is able to perform operational recovery while satisfying all the recovery requirements, and that it improves on the existing recovery methods for cloud operations.
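    The final selection step described above can be sketched as a weighted scoring over candidate recovery actions. The weights, metric values and action names below are invented for illustration; POD-Recovery's actual evaluation procedure may differ.

    ```python
    # Hedged sketch of selecting a recovery action by scoring candidates on
    # Recovery Time, Recovery Cost and Recovery Impact (lower is better).
    # Weights and metric values are illustrative assumptions.

    def select_recovery_action(candidates, weights=(1.0, 1.0, 1.0)):
        wt, wc, wi = weights

        def score(c):
            return wt * c["time"] + wc * c["cost"] + wi * c["impact"]

        return min(candidates, key=score)["name"]

    candidates = [
        {"name": "retry step",           "time": 2, "cost": 1, "impact": 1},
        {"name": "rollback to start",    "time": 8, "cost": 3, "impact": 5},
        {"name": "redo from checkpoint", "time": 4, "cost": 2, "impact": 2},
    ]
    best = select_recovery_action(candidates)  # lowest combined score wins
    ```

    A linear weighted sum is only one plausible way to combine the three metrics; thresholding each metric against a recovery time objective would be an equally reasonable reading of "selection of an acceptable recovery action".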

    Long Running Transactions Within Enterprise Resource Planning Systems

    One of the major problems organisations face today is managing their own complexity while coping with an increasingly competitive marketplace. This problem can be addressed using Enterprise Resource Planning (ERP) systems, which offer an integrated, real-time view of the whole business process within an organisation. However, such systems have complicated workflows and are costly to analyse for the purpose of managing the whole business process. Thus, Long Running Transaction (LRT) models have been proposed as solutions which simplify the analysis of ERP system workflows, manage the whole organisational process, and ensure that transactions completed in one business process are not processed in any other. In practice, LRT models face various problems, such as rollback and check-pointing activities. This has led to the use of Communication Closed Layers (CCLs) to decompose processes into layers that can be analysed easily as sequential programs. Therefore, the purpose of this work is to develop an approach to implementing and analysing the workflow of an organisation in order to deal with failures in Long Running Transactions (LRTs) within Enterprise Resource Planning (ERP) systems using Communication Closed Layers (CCLs). Furthermore, it aims to examine possible enhancements to the available methodology for ERP systems by studying the suitability and applicability of LRTs for modelling ERP workflows, offering simple and elegant constructs for implementing these complex and expensive ERP workflow systems. The model implemented in this thesis offers a solution for two main challenges: incompatibilities that result from applying traditional transaction-processing concepts to the ERP context, and the complexity of ERP workflows.
    The first challenge is addressed by offering new semantics that allow the modelling of concepts such as rollbacks and check-points through various constraints, while the second is addressed through the use of the Communication Closed Layer (CCL) approach. The implemented computational, reconfigurable model of an ERP workflow system is able to simulate real ERP workflow systems and yields a better understanding of the use of ERP systems in enterprise environments. Moreover, a case study is introduced to evaluate the application of the implemented model using three scenarios. The evaluation explores the effectiveness of executable ERP computational models and offers a simple methodology for building such systems using novel approaches. A comparison with two previous models shows that the new model outperforms them, benefiting from their features while solving the limitations that make them inappropriate for use in the context of ERP workflow models.
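    The CCL idea described above (decomposing a process into layers analysable as sequential programs, with rollback and check-pointing) can be given a small illustrative-only sketch; the layer contents, state shape and failure model are invented, not the thesis's constructs.

    ```python
    # Illustrative sketch of an LRT decomposed into communication-closed
    # layers, executed as sequential blocks: a checkpoint is taken after each
    # committed layer, so a failure rolls back only to the last completed
    # layer rather than to the start of the whole transaction.

    def run_layers(layers, state):
        checkpoint = dict(state)
        for layer in layers:
            try:
                for step in layer:
                    step(state)
                checkpoint = dict(state)   # layer committed: advance checkpoint
            except RuntimeError:
                state.clear()
                state.update(checkpoint)   # roll back to last checkpoint
                return "rolled back"
        return "committed"

    def fail(_):
        raise RuntimeError("step failed")

    state = {"orders": 0}
    outcome = run_layers(
        [[lambda s: s.update(orders=s["orders"] + 1)],          # layer 1 succeeds
         [lambda s: s.update(orders=s["orders"] + 1), fail]],   # layer 2 fails
        state)
    ```

    The point of the layered decomposition is visible in the failure case: the effects of the committed first layer survive, and only the partially executed second layer is undone.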