
    A control theoretical view of cloud elasticity: taxonomy, survey and challenges

    The lucrative features of cloud computing, such as the pay-as-you-go pricing model and dynamic resource provisioning (elasticity), attract clients to host their applications in the cloud to save up-front capital expenditure and to reduce the operational cost of the system. However, efficient management of the hired computational resources is a challenging task. Over the last decade, researchers and practitioners have used various techniques to propose new methods for addressing cloud elasticity. Among these techniques, control theory has emerged as one of the most popular ways to implement elasticity. A plethora of research has been undertaken on cloud elasticity, including several review papers that summarise various aspects of it. However, the existing review articles are broad in scope and focus mostly on a high-level view of the overall research rather than on the specifics of a particular implementation technique. Given the importance, suitability and abundance of control-theoretic approaches, this paper takes a step towards a stand-alone review of the control-theoretic aspects of cloud elasticity. The paper provides a detailed taxonomy comprising the relevant attributes of two perspectives: control theory as an implementation technique and cloud elasticity as the target application domain. We carry out an exhaustive review of the literature, classifying existing elasticity solutions using the attributes of the control-theoretic perspective. The summarized results are further clustered by type of control solution, which eases the comparison of related solutions. Finally, we discuss the pros and cons of each type of control solution and describe in detail several open research challenges in the field.
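    As an illustration of the kind of control loop surveyed above, the sketch below shows a minimal proportional-integral (PI) feedback controller that adjusts the number of VM replicas to keep a measured response time near a target. The setpoint, gains, and scaling bounds are hypothetical and not taken from any specific paper reviewed here.

```python
# Minimal sketch of a PI feedback controller for horizontal elasticity.
# All gains, setpoints, and the scaling interface are illustrative only.

class PIElasticityController:
    def __init__(self, target_latency_ms, kp=0.05, ki=0.01,
                 min_replicas=1, max_replicas=50):
        self.target = target_latency_ms   # desired response time (setpoint)
        self.kp = kp                      # proportional gain
        self.ki = ki                      # integral gain
        self.integral = 0.0               # accumulated error
        self.min_replicas = min_replicas
        self.max_replicas = max_replicas

    def step(self, measured_latency_ms, current_replicas):
        """One control period: return the new replica count."""
        error = measured_latency_ms - self.target   # positive => overloaded
        self.integral += error
        adjustment = self.kp * error + self.ki * self.integral
        desired = current_replicas + adjustment
        # Actuator saturation: clamp to the allowed scaling range.
        return max(self.min_replicas, min(self.max_replicas, round(desired)))

# Example control loop (monitoring and actuation calls are placeholders).
controller = PIElasticityController(target_latency_ms=200.0)
replicas = 4
for latency in [250.0, 310.0, 280.0, 220.0, 190.0]:   # sampled metrics
    replicas = controller.step(latency, replicas)
    print(f"latency={latency} ms -> scale to {replicas} replicas")
```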

    Self-management for large-scale distributed systems

    Autonomic computing aims at making computing systems self-managing by using autonomic managers, in order to reduce the obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems, motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing, component-based distributed applications. In our work on Niche, we have faced and addressed four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottlenecks, and scale. Niche implements the autonomic computing architecture proposed by IBM in a fully decentralized way. It supports a network-transparent view of the system architecture, simplifying the design of distributed self-management, and provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application, with design steps that include the partitioning of management functions and the orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, both of which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach replicates a management element using finite state machine replication with a reconfigurable replica set, and our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store, built on a peer-to-peer network, that supports multiple consistency levels and enables a trade-off between high availability and data consistency. Using majorities avoids the potential drawbacks of master-based consistency control, namely a single point of failure and a performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems, focusing on elasticity control using elements of control theory and machine learning. We have studied a number of elasticity controller designs, including a state-space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that enables trading off performance for cost, and we outline the steps in designing such a controller. We conclude by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.
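    To make the majority-based consistency idea concrete, the following sketch shows quorum reads and writes over N replicas, where any read quorum of size ⌊N/2⌋+1 intersects the latest write quorum. The in-memory Replica objects and timestamps are illustrative stand-ins for the peer-to-peer store described in the thesis, not its actual implementation.

```python
# Minimal sketch of majority-quorum reads and writes over N replicas.
# Replicas are plain in-memory dicts; a real system would use RPC to peers.

import random

class Replica:
    def __init__(self):
        self.store = {}   # key -> (timestamp, value)

    def write(self, key, ts, value):
        old_ts, _ = self.store.get(key, (-1, None))
        if ts > old_ts:
            self.store[key] = (ts, value)

    def read(self, key):
        return self.store.get(key, (-1, None))

class MajorityKVStore:
    def __init__(self, n_replicas=5):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.quorum = n_replicas // 2 + 1   # majority quorum size
        self.clock = 0                      # monotonically increasing write timestamp

    def put(self, key, value):
        self.clock += 1
        # Contact a random majority of replicas; any two majorities overlap,
        # so a later quorum read is guaranteed to see this write.
        for r in random.sample(self.replicas, self.quorum):
            r.write(key, self.clock, value)

    def get(self, key):
        replies = [r.read(key) for r in random.sample(self.replicas, self.quorum)]
        return max(replies, key=lambda reply: reply[0])[1]   # newest timestamp wins

store = MajorityKVStore()
store.put("config", "v1")
store.put("config", "v2")
print(store.get("config"))   # "v2": the read quorum intersects the latest write quorum
```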

    Toward sustainable data centers: a comprehensive energy management strategy

    Data centers are major contributors to the emission of carbon dioxide into the atmosphere, and this contribution is expected to increase in the coming years. This has encouraged the development of techniques to reduce the energy consumption and the environmental footprint of data centers. Whereas some of these techniques have succeeded in reducing the energy consumption of the hardware equipment of data centers (including IT, cooling, and power supply systems), we claim that sustainable data centers will only be possible if the problem is addressed through a holistic approach that includes not only the aforementioned techniques but also intelligent and unifying solutions that enable a synergistic and energy-aware management of data centers. In this paper, we propose a comprehensive strategy to reduce the carbon footprint of data centers that uses energy as the driver of their management procedures. In addition, we present a holistic management architecture for sustainable data centers that implements the aforementioned strategy, and we propose design guidelines to accomplish each step of the strategy, referring to related achievements and enumerating the main challenges that must still be solved.
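    As a purely illustrative example of letting energy drive a management decision (and not the architecture proposed in the paper), the sketch below packs workloads onto already-loaded hosts so that lightly loaded ones can be suspended to save energy.

```python
# Illustrative energy-aware consolidation heuristic (not the paper's architecture):
# pack each workload onto the most utilized host that can still fit it,
# so that lightly loaded hosts can be suspended to save energy.

def place(workload_cpu, hosts):
    """hosts: list of dicts with 'name', 'used' and 'capacity' CPU shares."""
    candidates = [h for h in hosts if h["used"] + workload_cpu <= h["capacity"]]
    if not candidates:
        return None                       # no host has enough spare capacity
    target = max(candidates, key=lambda h: h["used"] / h["capacity"])
    target["used"] += workload_cpu
    return target

def idle_hosts(hosts):
    """Hosts with no load are candidates for suspension."""
    return [h["name"] for h in hosts if h["used"] == 0.0]

hosts = [{"name": "h1", "used": 0.0, "capacity": 1.0},
         {"name": "h2", "used": 0.6, "capacity": 1.0}]
place(0.3, hosts)          # lands on h2, the already-loaded host
print(idle_hosts(hosts))   # ['h1']: the empty host can be powered down
```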

    Information Exchange Management as a Service for Network Function Virtualization Environments

    The Internet landscape is gradually adopting new communication paradigms characterized by flexibility and adaptability to resource constraints and service requirements, including network function virtualization (NFV), software-defined networks, and various virtualization and network slicing technologies. These approaches need to be realized by multiple management and network entities exchanging information with each other. We propose a novel information exchange management-as-a-service facility as an extension to ETSI's NFV management and orchestration framework, namely the virtual infrastructure information service (VIS). VIS is characterized by the following properties: 1) it exhibits the dynamic characteristics of such network paradigms; 2) it supports information flow establishment, operation, and optimization; and 3) it provides logically centralized control of the established information flows with respect to the diverse demands of the entities exchanging information elements. Our proposal addresses the information exchange management requirements of NFV environments and is information-model agnostic. The paper includes an experimental analysis of VIS's main functional and non-functional characteristics.
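    As a rough, hypothetical illustration of information flow establishment and optimization between management entities, the sketch below models a broker that registers flows between producers and consumers of information elements and stretches the reporting periods of low-priority flows to respect a global rate budget. None of the class or method names come from the ETSI MANO specification or the VIS prototype.

```python
# Hypothetical sketch of centrally managed information flows between
# NFV management entities; names are not taken from ETSI MANO or VIS.

from dataclasses import dataclass

@dataclass
class InformationFlow:
    producer: str
    consumer: str
    element_type: str        # e.g. "vnf.cpu.utilization"
    period_s: float          # requested reporting period
    priority: int = 0

class FlowBroker:
    """Logically centralized control point for established flows."""
    def __init__(self):
        self.flows = []

    def establish(self, flow):
        self.flows.append(flow)
        return flow

    def optimize(self, max_total_rate):
        """Toy optimization: stretch reporting periods of low-priority flows
        until the aggregate reporting rate fits the given budget."""
        def total_rate():
            return sum(1.0 / f.period_s for f in self.flows)
        for f in sorted(self.flows, key=lambda f: f.priority):
            while total_rate() > max_total_rate:
                f.period_s *= 2          # doubling the period halves this flow's rate
                if 1.0 / f.period_s < 1e-3:
                    break                # give up on stretching this flow further

broker = FlowBroker()
broker.establish(InformationFlow("VNFM-1", "NFVO", "vnf.cpu.utilization", 1.0, priority=1))
broker.establish(InformationFlow("VIM-A", "NFVO", "host.energy", 1.0, priority=5))
broker.optimize(max_total_rate=1.5)
for f in broker.flows:
    print(f.producer, "->", f.consumer, "every", f.period_s, "s")
```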

    Techniques for improving the scalability of data center networks

    Data centers require highly scalable data and control planes to ensure good performance of distributed applications. Along the data plane, network throughput and latency directly impact application performance metrics. This has led researchers to propose high-bisection-bandwidth network topologies based on multi-rooted trees for data center networks. However, such topologies require efficient traffic-splitting algorithms to fully utilize all available bandwidth. Along the control plane, the centralized controller for software-defined networks presents new scalability challenges. The logically centralized controller needs to scale according to network demands, and since all services are implemented in the centralized controller, it should allow easy integration of different types of network services. In this dissertation, we propose techniques to address scalability challenges along the data and control planes of data center networks. Along the data plane, we propose a fine-grained traffic-splitting technique for data center networks organized as multi-rooted trees. Splitting individual flows can provide better load balance but is usually avoided because of potential packet reordering, which conventional wisdom suggests may interact negatively with TCP congestion control. We demonstrate that, due to the symmetry of the network topology, TCP is able to tolerate the induced packet reordering and maintain a single estimate of the RTT. Along the control plane, we design a scalable distributed SDN control plane architecture and propose algorithms to evenly distribute the load among the controller nodes by dynamically configuring the switch-to-controller mapping and adding or removing controller nodes in response to changing traffic patterns. Each SDN controller platform may have different performance characteristics, in which case it may be desirable to run different services on different controllers to match controller performance characteristics with service requirements. To address this problem, we propose FlowBricks, an architecture that allows network operators to compose an SDN control plane from services running on top of heterogeneous controller platforms.
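    To make the control-plane load-balancing idea concrete, the sketch below rebalances a switch-to-controller mapping with a simple greedy heuristic whenever a controller exceeds a load threshold. The data structures, loads, and threshold are invented for illustration and are not the dissertation's actual algorithms.

```python
# Illustrative greedy rebalancing of the switch-to-controller mapping.
# Loads are request rates (messages/s); thresholds and structures are made up.

def rebalance(switch_load, mapping, controllers, threshold):
    """Move switches away from overloaded controllers to lighter ones.

    switch_load: dict switch -> request rate
    mapping:     dict switch -> controller
    controllers: list of controller ids
    threshold:   per-controller load limit
    """
    def load(c):
        return sum(switch_load[s] for s, ctl in mapping.items() if ctl == c)

    for c in controllers:
        # Migrate the heaviest switches first until this controller fits.
        switches = sorted((s for s, ctl in mapping.items() if ctl == c),
                          key=lambda s: switch_load[s], reverse=True)
        for s in switches:
            if load(c) <= threshold:
                break
            target = min(controllers, key=load)
            if target != c and load(target) + switch_load[s] <= threshold:
                mapping[s] = target      # reassign switch s to the lighter controller
    return mapping

switch_load = {"s1": 400, "s2": 300, "s3": 100, "s4": 50}
mapping = {s: "c1" for s in switch_load}          # everything starts on c1
print(rebalance(switch_load, mapping, ["c1", "c2"], threshold=500))
```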

    Collaborative Policy-Based Autonomic Management in IaaS Clouds

    With the increasing number of machines (virtual or physical) in a computing environment, it is becoming harder to monitor and manage these resources. Relying on human administrators, even with tools, is expensive, and the growing complexity makes management even harder. The alternative is to look for automated approaches that can monitor and manage computing resources in real time with no human intervention. One such approach is policy-based autonomic management. However, in large systems, having a single autonomic manager manage everything is almost impossible; multiple autonomic managers are needed, and they must cooperate in the overall management. We propose a management model that uses multiple autonomic managers organized hierarchically to monitor and manage the resources in a computing environment based on provided policies. We develop a communication protocol to facilitate collaboration between different autonomic managers, define the core operations of these managers, and introduce algorithms for their deployment and operation. We also introduce an approach for inferring the communication messages from policies and develop several algorithms for joining and maintaining the management hierarchy. We propose a deployment system that can automatically discover relevant resources in a computing environment to facilitate the deployment of autonomic managers at different levels of a physical system. We then test our approach by implementing it in a small private Infrastructure-as-a-Service (IaaS) cloud and show how this hierarchical collaboration of autonomic managers can help the system adapt to high-stress situations automatically and reduce the SLA violation rate without adding any new resources to the environment.
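    A minimal sketch of the policy-based idea, assuming a simple event-condition-action policy format: each manager applies its own policies to incoming events and escalates to its parent in the hierarchy when no local policy matches. The interfaces below are hypothetical simplifications of the proposed model.

```python
# Hypothetical sketch of hierarchical, policy-based autonomic managers.
# A manager applies its own policies first and escalates unresolved events.

class Policy:
    def __init__(self, condition, action):
        self.condition = condition   # callable(event) -> bool
        self.action = action         # callable(event) -> str

class AutonomicManager:
    def __init__(self, name, policies, parent=None):
        self.name = name
        self.policies = policies
        self.parent = parent         # higher-level manager in the hierarchy

    def handle(self, event):
        for p in self.policies:
            if p.condition(event):
                return f"{self.name}: {p.action(event)}"
        if self.parent is not None:
            return self.parent.handle(event)      # escalate up the hierarchy
        return f"{self.name}: no policy matched, event logged"

# Cluster-level manager can migrate VMs; host-level manager only throttles.
cluster_mgr = AutonomicManager("cluster-mgr", [
    Policy(lambda e: e["cpu"] > 0.9, lambda e: f"migrate a VM off {e['host']}"),
])
host_mgr = AutonomicManager("host-mgr", [
    Policy(lambda e: 0.7 < e["cpu"] <= 0.9, lambda e: "throttle background jobs"),
], parent=cluster_mgr)

print(host_mgr.handle({"host": "h7", "cpu": 0.95}))   # escalated to cluster-mgr
print(host_mgr.handle({"host": "h7", "cpu": 0.80}))   # handled locally
```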

    An Application-aware SDN Controller for Hybrid Optical-electrical DC Networks

    The adoption of optical switching technologies in Data Centre Networks (DCNs) offers a solution for high-speed traffic and energy efficiency in Data Centre (DC) operational management, enabling easy scaling of DC infrastructures. Flexible, slotted allocation of optical resources is fundamental to efficiently supporting the dynamicity of DC traffic. In this context, the NEPHELE project proposes a Time Division Multiple Access approach to optical resource allocation, orchestrated through a Software-Defined Networking controller that coordinates the DCN configuration based on real-time cloud application requests.
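    For intuition about slotted optical resource allocation, the sketch below assigns TDMA slots to rack-to-rack traffic demands within a fixed-length frame, granting more slots to heavier demands. The frame length and demand format are invented for this illustration and do not reflect the actual NEPHELE data plane.

```python
# Illustrative TDMA slot allocation for rack-to-rack optical circuits.
# Frame length and demand units are invented for this sketch.

def allocate_slots(demands, slots_per_frame):
    """demands: dict (src_rack, dst_rack) -> offered load (arbitrary units).
    Returns dict (src, dst) -> number of slots in the next frame."""
    total = sum(demands.values())
    if total == 0:
        return {pair: 0 for pair in demands}
    # Proportional share, with at least one slot for any non-zero demand.
    alloc = {pair: max(1, int(slots_per_frame * load / total))
             for pair, load in demands.items() if load > 0}
    # Trim if rounding overshot the frame, dropping from the smallest demands.
    while sum(alloc.values()) > slots_per_frame:
        pair = min(alloc, key=lambda p: demands[p])
        alloc[pair] -= 1
        if alloc[pair] == 0:
            del alloc[pair]
    return alloc

demands = {("rack1", "rack3"): 80, ("rack2", "rack3"): 15, ("rack4", "rack1"): 5}
print(allocate_slots(demands, slots_per_frame=20))
```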

    Towards the decentralized coordination of multiple self-adaptive systems

    When multiple self-adaptive systems share the same environment and have common goals, they may coordinate their adaptations at runtime to avoid conflicts and to satisfy their goals. There are two approaches to coordination. (1) Logically centralized, where a supervisor has complete control over the individual self-adaptive systems. Such an approach is infeasible when the systems have different owners or administrative domains. (2) Logically decentralized, where coordination is achieved through direct interactions. Because the individual systems have control over the information they share, decentralized coordination accommodates multiple administrative domains. However, existing techniques do not account simultaneously for both local concerns, e.g., preferences, and shared concerns, e.g., conflicts, which may lead to goals not being achieved as expected. Our idea to address this shortcoming is to express both types of concerns within the same constraint optimization problem. We propose CoADAPT, a decentralized coordination technique introducing two types of constraints: preference constraints, expressing local concerns, and consistency constraints, expressing shared concerns. At runtime, the problem is solved in a decentralized way using distributed constraint optimization algorithms implemented by each self-adaptive system. As a first step in realizing CoADAPT, we focus in this work on the coordination of adaptation planning strategies, traditionally addressed only with centralized techniques. We show the feasibility of CoADAPT in an exemplar from cloud computing and experimentally analyze its scalability.
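    To ground the formulation, the toy sketch below encodes the planning choices of two self-adaptive systems as a constraint optimization problem with local preference costs and a shared consistency constraint. It is solved here by centralized brute force purely for readability; CoADAPT instead uses distributed constraint optimization algorithms, and all variable names and costs below are illustrative.

```python
# Toy (centralized, brute-force) version of the constraint optimization idea:
# each system has preference costs over its own plans (local concerns), and a
# shared consistency constraint rules out conflicting combinations.
# A real deployment would solve this with a distributed constraint
# optimization algorithm run by the systems themselves.

from itertools import product

plans_a = ["scale_out", "scale_up"]
plans_b = ["scale_out", "do_nothing"]

preference_a = {"scale_out": 1, "scale_up": 3}       # local concern of system A
preference_b = {"scale_out": 2, "do_nothing": 1}     # local concern of system B

def consistent(plan_a, plan_b):
    # Shared concern: both systems scaling out at once would exceed the
    # resource pool they share (illustrative conflict rule).
    return not (plan_a == "scale_out" and plan_b == "scale_out")

best = min(
    (pair for pair in product(plans_a, plans_b) if consistent(*pair)),
    key=lambda pair: preference_a[pair[0]] + preference_b[pair[1]],
)
print(best)   # ('scale_out', 'do_nothing'): cheapest conflict-free combination
```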