2 research outputs found

    Automated planning for cloud service configurations

    The declarative approach has been widely accepted as an appropriate way to manage configurations of large-scale systems – the administrators describe the specification of the “desired” configuration state of the system, and the tool computes and executes the necessary actions to bring the system from its current state into this desired state. However, none of the state-of-the-art declarative configuration tools makes any guarantees about the order of the changes across the system involved in implementing configuration changes. This thesis presents a technique that addresses this issue – it uses the SFP language to allow administrators to specify the desired configuration state and the global constraints of the system, compiles the specified reconfiguration task into a classical planning problem, and then uses an automated planning technique to automatically generate the workflow. The execution of the workflow can bring the system into the desired state, while preserving the global constraints during configuration changes. This thesis also presents an alternative approach to deploying the configurations – the workflow is used to automatically choreograph a set of reactive agents that are capable of autonomously reconfiguring a computing system into a specified desired state. The agent interactions are guaranteed to be deadlock/livelock free, can preserve pre-specified global constraints during their execution, and automatically maintain the desired state once it has been achieved (self-healing). We present the formal semantics of the SFP language, the technique that compiles SFP reconfiguration tasks to classical planning problems, and the algorithms for automatic generation and execution of the reactive agent models. In addition, we present the formal semantics of the core subset of the SmartFrog language, which is the foundation of SFP. Moreover, we present a domain-independent technique to compile a planning problem with extended goals into a classical planning problem.
As a proof of concept, the techniques have been implemented in a prototype configuration tool called Nuri, which has been used to configure typical use cases in a cloud environment. The experimental results demonstrate that Nuri is capable of planning and deploying the configurations in a reasonable time, with guaranteed constraints on the system throughout the reconfiguration process.
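
The idea of generating a workflow that keeps a global constraint true at every intermediate step can be illustrated with a minimal sketch. This is not Nuri, SFP, or the compilation technique from the thesis: it is a plain breadth-first search over a hypothetical three-variable state (two services, `a` and `b`, and a client reference), and every name in it is an assumption made for illustration. The constraint used here is that the client must always point at a running service.

```python
from collections import deque

# Hypothetical state: (a_running, b_running, client_target).
INITIAL = (True, False, "a")   # service a runs, client uses a
GOAL = (False, True, "b")      # service b runs, client uses b


def satisfies_constraint(state):
    """Global constraint: the client must always point at a running service."""
    a_running, b_running, target = state
    return a_running if target == "a" else b_running


def successors(state):
    """Yield (action_name, next_state) for each of the three possible actions."""
    a_running, b_running, target = state
    yield ("stop a" if a_running else "start a",
           (not a_running, b_running, target))
    yield ("stop b" if b_running else "start b",
           (a_running, not b_running, target))
    other = "b" if target == "a" else "a"
    yield (f"redirect client to {other}", (a_running, b_running, other))


def plan(initial, goal):
    """Breadth-first search that never enters a constraint-violating state."""
    if not satisfies_constraint(initial):
        return None
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for name, nxt in successors(state):
            if nxt not in visited and satisfies_constraint(nxt):
                visited.add(nxt)
                frontier.append((nxt, actions + [name]))
    return None  # no constraint-preserving workflow exists


workflow = plan(INITIAL, GOAL)
print(workflow)  # ['start b', 'redirect client to b', 'stop a']
```

In this toy model the ordering is forced by the constraint: stopping `a` or redirecting the client before `b` is running would leave the client pointing at a stopped service, so the search only returns the start, redirect, stop sequence.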

    Supporting IT Service Fault Recovery with an Automated Planning Method

    Despite advances in software and hardware technologies, faults are still inevitable in a highly dependent, human-engineered and administrated IT environment. Given the critical role of IT services today, it is imperative that faults, once they have occurred, be dealt with efficiently and effectively to avoid or reduce the actual losses. Nevertheless, the complexities of current IT services, e.g., with regard to their scale, heterogeneity and highly dynamic infrastructures, make the recovery operation a challenging task for operators. Such complexities will eventually outgrow the human capability to manage them. This difficulty is compounded by the fact that there are few well-devised methods available to support fault recovery. To tackle this issue, this thesis aims at providing a computer-aided approach to assist operators with fault recovery planning and, consequently, to increase the efficiency of recovery activities. We propose a generic framework based on automated planning theory to generate plans for recoveries of IT services. At the heart of the framework is a planning component. Assisted by the other participants in the framework, the planning component aggregates the relevant information and computes recovery steps accordingly. The main idea behind the planning component is to sustain the planning operations with automated planning techniques, one of the research fields of artificial intelligence. Provided with a general planning model, we show theoretically that the service fault recovery problem can indeed be solved by automated planning techniques. The relationship between a planning problem and a fault recovery problem is shown by means of reduction between these problems. After an extensive investigation, we choose a planning paradigm based on Hierarchical Task Networks (HTN) as the guideline for the design of our main planning algorithm, called H2MAP.
To sustain the operation of the planner, a set of components revolving around the planning component is provided. These components are responsible for tasks such as translation between different knowledge formats, persistent storage of planning knowledge and communication with external systems. To ensure extendibility in our design, we apply different design patterns for the components. We sketch and discuss the technical aspects of the implementations of the core components. Finally, as a proof of concept, the framework is instantiated for two distinct application scenarios.
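
For readers unfamiliar with the HTN paradigm mentioned in the abstract, the following is a minimal sketch of total-order HTN decomposition. It is not H2MAP and makes no claim about how that algorithm works; the recovery domain, the task names, and the state facts are all hypothetical and chosen only to show how a compound task is refined by methods into primitive recovery steps.

```python
# Hypothetical recovery domain. A state is a dict of boolean facts.

def reboot_host(state):
    """Primitive task: always applicable, brings the host up."""
    return {**state, "host_up": True}


def restart_service(state):
    """Primitive task: applicable only when the host is up."""
    if not state["host_up"]:
        return None
    return {**state, "service_up": True}


OPERATORS = {"reboot_host": reboot_host, "restart_service": restart_service}


def recover_when_host_up(state):
    """Method: if the host is up, restarting the service is enough."""
    return ["restart_service"] if state["host_up"] else None


def recover_when_host_down(state):
    """Method: if the host is down, reboot it first."""
    return ["reboot_host", "restart_service"] if not state["host_up"] else None


# Each compound task maps to the methods that may decompose it, tried in order.
METHODS = {"recover_service": [recover_when_host_up, recover_when_host_down]}


def htn_plan(state, tasks):
    """Return a list of primitive task names, or None if decomposition fails."""
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in OPERATORS:
        next_state = OPERATORS[head](state)
        if next_state is None:
            return None  # precondition not met
        tail = htn_plan(next_state, rest)
        return None if tail is None else [head] + tail
    for method in METHODS.get(head, []):
        subtasks = method(state)
        if subtasks is None:
            continue  # method not applicable in this state
        result = htn_plan(state, subtasks + rest)
        if result is not None:
            return result
    return None


recovery = htn_plan({"host_up": False, "service_up": False}, ["recover_service"])
print(recovery)  # ['reboot_host', 'restart_service']
```

The planner works top-down: it never searches over raw actions, only over the decompositions that the methods allow, which is what lets operator knowledge about recovery procedures be encoded directly in the domain.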