1,874 research outputs found

    Self-management for large-scale distributed systems

    Get PDF
    Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time consuming and error prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables the tradeoff between high availability and data consistency. Using majority allows avoiding potential drawbacks of a master-based consistency control, namely, a single-point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems with the focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a State-Space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using state-space model that enables to trade-off performance for cost. We describe the steps in designing an elasticity controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control

    Collaborative Policy-Based Autonomic Management in IaaS Clouds

    Get PDF
    With the increasing number of machines (either virtual or physical) in a computing environment, it is becoming harder to monitor and manage these resources. Relying on human administrators, even with tools, is expensive and the growing complexity makes management even harder. The alternative is to look for automated approaches that can monitor and manage computing resources in real time with no human intervention. One of the approaches to this problem is policy-based autonomic management. However, in large systems having one single autonomic manager to manage everything is almost impossible. Therefore, multiple autonomic managers will be needed and these will need to cooperate in the overall management. We propose a management model using multiple autonomic managers organized in a hierarchical fashion to monitor and manage the resources in a computing environment based on provided policies. We develop a communication protocol to facilitate collaboration between different autonomic managers, define the core operations of these managers and introduce algorithms to deal with their deployment and operation. We also introduce an approach for the inference of the communication messages from policies and develop several algorithms for joining and maintaining the management hierarchy. We propose a deployment system that can discover relevant resources in a computing environment automatically to facilitate the deployment of autonomic managers at different levels of a physical system. We then test our approach by implementing it in a small private Infrastructure-as-a-Service (IaaS) cloud and show how this collaboration of autonomic managers in a hierarchical way can help to adopt to high stress situations automatically and reduce the SLA violation rate without adding any new resource to the environment

    Elastic Business Process Management: State of the Art and Open Challenges for BPM in the Cloud

    Full text link
    With the advent of cloud computing, organizations are nowadays able to react rapidly to changing demands for computational resources. Not only individual applications can be hosted on virtual cloud infrastructures, but also complete business processes. This allows the realization of so-called elastic processes, i.e., processes which are carried out using elastic cloud resources. Despite the manifold benefits of elastic processes, there is still a lack of solutions supporting them. In this paper, we identify the state of the art of elastic Business Process Management with a focus on infrastructural challenges. We conceptualize an architecture for an elastic Business Process Management System and discuss existing work on scheduling, resource allocation, monitoring, decentralized coordination, and state management for elastic processes. Furthermore, we present two representative elastic Business Process Management Systems which are intended to counter these challenges. Based on our findings, we identify open issues and outline possible research directions for the realization of elastic processes and elastic Business Process Management.Comment: Please cite as: S. Schulte, C. Janiesch, S. Venugopal, I. Weber, and P. Hoenisch (2015). Elastic Business Process Management: State of the Art and Open Challenges for BPM in the Cloud. Future Generation Computer Systems, Volume NN, Number N, NN-NN., http://dx.doi.org/10.1016/j.future.2014.09.00

    Towards an Autonomic and Distributed Device Management for the Internet of Things

    Get PDF
    Best paper award at the Doctoral Symposium of ICAC 2019International audienc

    Decentralized planning for self-adaptation in multi-cloud environment

    Get PDF
    The runtime management of Internet of Things (IoT) oriented applications deployed in multi-clouds is a complex issue due to the highly heterogeneous and dynamic execution environment. To effectively cope with such an environment, the cross-layer and multi-cloud effects should be taken into account and a decentralized self-adaptation is a promising solution to maintain and evolve the applications for quality assurance. An important issue to be tackled towards realizing this solution is the uncertainty effect of the adaptation, which may cause negative impact to the other layers or even clouds. In this paper, we tackle such an issue from the planning perspective, since an inappropriate planning strategy can fail the adaptation outcome. Therefore, we present an architectural model for decentralized self-adaptation to support the cross-layer and multi-cloud environment. We also propose a planning model and method to enable the decentralized decision making. The planning is formulated as a Reinforcement Learning problem and solved using the Q-learning algorithm. Through simulation experiments, we conduct a study to assess the effectiveness and sensitivity of the proposed planning approach. The results show that our approach can potentially reduce the negative impact on the cross-layer and multi-cloud environment

    Coordinating multi-site construction projects using federated clouds

    Get PDF
    The requirements imposed by AEC (Architecture/Engineering/Construction) projects with regard to data storage and execution, on-demand data sharing and complexity on building simulations have led to utilising novel computing techniques. In detail, these requirements refer to storing the large amounts of data that the AEC industry generates — from building schematics to associated data derived from different contractors that are involved at various stages of the building lifecycle; or running simulations on building models (such as energy efficiency, environmental impact & occupancy simulations). Creating such a computing infrastructure to support operations deriving from various AEC projects can be challenging due to the complexity of workflows, distributed nature of the data and diversity of roles, profiles and location of the users. Federated clouds have provided the means to create a distributed environment that can support multiple individuals and organisations to work collaboratively. In this study we present how multi-site construction projects can be coordinated by the use of federated clouds where the interacting parties are represented by AEC industry organisations. We show how coordination can support (a) data sharing and interoperability using a multi-vendor Cloud environment and (b) process interoperability based on various stakeholders involved in the AEC project lifecycle. We develop a framework that facilitates project coordination with associated “issue status” implications and validate our outcome in a real construction project

    Coordinated Autonomic Managers for Energy Efficient Date Centers

    Get PDF
    The complexity of today’s data centers has led researchers to investigate ways in which autonomic methods can be used for data center management. Autonomic managers try to monitor and manage resources to ensure that the components they manage are self-configuring, self-optimizing, self-healing and self-protecting (so called “self-*” properties). In this research, we consider autonomic management systems for data centers with a particular focus on making data centers more energy-aware. In particular, we consider a policy based, multi-manager autonomic management systems for energy aware data centers. Our focus is on defining the foundations – the core concepts, entities, relationships and algorithms - for autonomic management systems capable of supporting a range of management configurations. Central to our approach is the notion of a “topology” of autonomic managers that when instantiated can support a range of different configurations of autonomic managers and communication among them. The notion of “policy” is broadened to enable some autonomic managers to have more direct control over the behavior of other managers through changes in policies. The ultimate goal is to create a management framework that would allow the data center administrator to a) define managed objects, their corresponding managers, management system topology, and policies to meet their operation needs and b) rely on the management system to maintain itself automatically. A data center simulator that computes its energy consumption (computing and cooling) at any given time is implemented to evaluate the impact of different management scenarios. The management system is evaluated with different management scenarios in our simulated data center

    Towards the decentralized coordination of multiple self-adaptive systems

    Full text link
    When multiple self-adaptive systems share the same environment and have common goals, they may coordinate their adaptations at runtime to avoid conflicts and to satisfy their goals. There are two approaches to coordination. (1) Logically centralized, where a supervisor has complete control over the individual self-adaptive systems. Such approach is infeasible when the systems have different owners or administrative domains. (2) Logically decentralized, where coordination is achieved through direct interactions. Because the individual systems have control over the information they share, decentralized coordination accommodates multiple administrative domains. However, existing techniques do not account simultaneously for both local concerns, e.g., preferences, and shared concerns, e.g., conflicts, which may lead to goals not being achieved as expected. Our idea to address this shortcoming is to express both types of concerns within the same constraint optimization problem. We propose CoADAPT, a decentralized coordination technique introducing two types of constraints: preference constraints, expressing local concerns, and consistency constraints, expressing shared concerns. At runtime, the problem is solved in a decentralized way using distributed constraint optimization algorithms implemented by each self-adaptive system. As a first step in realizing CoADAPT, we focus in this work on the coordination of adaptation planning strategies, traditionally addressed only with centralized techniques. We show the feasibility of CoADAPT in an exemplar from cloud computing and analyze experimentally its scalability
    • …
    corecore