
    Cloud-scale VM Deflation for Running Interactive Applications On Transient Servers

    Transient computing has become popular in public cloud environments for running delay-insensitive batch and data processing applications at low cost. Since transient cloud servers can be revoked at any time by the cloud provider, they are considered unsuitable for running interactive applications such as web services. In this paper, we present VM deflation as an alternative mechanism to server preemption for reclaiming resources from transient cloud servers under resource pressure. Using real traces from top-tier cloud providers, we show the feasibility of using VM deflation as a resource reclamation mechanism for interactive applications in public clouds. We show how current hypervisor mechanisms can be used to implement VM deflation and present cluster deflation policies for resource management of transient and on-demand cloud VMs. Experimental evaluation of our deflation system on a Linux cluster shows that microservice-based applications can be deflated by up to 50% with negligible performance overhead. Our cluster-level deflation policies allow overcommitment levels as high as 50%, with less than a 1% decrease in application throughput, and can enable cloud platforms to increase revenue by 30%. Comment: To appear at ACM HPDC 202
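
    To make the cluster-level idea concrete, here is a minimal sketch of a proportional deflation policy, assuming a uniform squeeze across transient VMs and the roughly 50% deflation floor reported above; the function names and VM sizes are invented for illustration, not taken from the paper's system.

```python
# Hypothetical sketch of a proportional VM-deflation policy (not the
# paper's implementation): under resource pressure, shrink transient
# VMs proportionally instead of preempting them, down to a floor.

def deflate(vms, host_capacity, deflation_floor=0.5):
    """Return per-VM memory allocations that fit host_capacity.

    vms: dict of vm_name -> requested memory (GB); all transient.
    deflation_floor: fraction of the request below which a VM is
    never deflated (the paper reports ~50% with little overhead).
    """
    requested = sum(vms.values())
    if requested <= host_capacity:
        return dict(vms)                      # no pressure, no deflation
    ratio = max(host_capacity / requested, deflation_floor)
    # If the floor binds, the residual pressure would have to be
    # handled by migration or preemption; deflation alone cannot help.
    return {name: req * ratio for name, req in vms.items()}

print(deflate({"web1": 8, "web2": 8, "batch": 16}, host_capacity=24))
# {'web1': 6.0, 'web2': 6.0, 'batch': 12.0} -- each VM squeezed to 75%
```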

    Resource management in a containerized cloud : status and challenges

    Cloud computing heavily relies on virtualization: virtual resources, for example virtual machines, are typically leased to the consumer. Efficient management of these virtual resources is of great importance, as it has a direct impact on both the scalability and the operational costs of the cloud environment. Recently, containers have been gaining popularity as a virtualization technology, due to their minimal overhead compared to traditional virtual machines and the portability they offer. Traditional resource management strategies, however, are typically designed for the allocation and migration of virtual machines, so the question arises how these strategies can be adapted for the management of a containerized cloud. Apart from this, the cloud is also no longer limited to centrally hosted data center infrastructure. New deployment models have matured, such as fog and mobile edge computing, bringing the cloud closer to the end user. These models could also benefit from container technology, as the newly introduced devices often have limited hardware resources. In this survey, we provide an overview of the current state of the art regarding resource management within the broad sense of cloud computing, complementary to existing surveys in the literature. We investigate how research is adapting to the recent evolutions within the cloud, namely the adoption of container technology and the introduction of the fog computing conceptual model. Furthermore, we identify several challenges and possible opportunities for future research.
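
    As a toy example of the kind of VM placement heuristic the survey considers adapting to containers, the sketch below runs first-fit-decreasing bin packing over container CPU demands; the container names and node capacity are made up for the example.

```python
# Illustrative first-fit-decreasing placement of containers onto nodes,
# a classic VM bin-packing heuristic reused for container scheduling.

def first_fit_decreasing(containers, node_capacity):
    """containers: dict name -> CPU demand; returns node -> [names]."""
    nodes = []   # each node is [remaining_capacity, [container names]]
    for name, demand in sorted(containers.items(),
                               key=lambda kv: kv[1], reverse=True):
        for node in nodes:
            if node[0] >= demand:             # fits on an existing node
                node[0] -= demand
                node[1].append(name)
                break
        else:                                 # open a new node
            nodes.append([node_capacity - demand, [name]])
    return {f"node{i}": names for i, (_, names) in enumerate(nodes)}

print(first_fit_decreasing({"db": 4, "api": 2, "cache": 1, "web": 2},
                           node_capacity=6))
# {'node0': ['db', 'api'], 'node1': ['web', 'cache']}
```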

    Virtual Organization Clusters: Self-Provisioned Clouds on the Grid

    Virtual Organization Clusters (VOCs) provide a novel architecture for overlaying dedicated cluster systems on existing grid infrastructures. VOCs provide customized, homogeneous execution environments on a per-Virtual Organization basis, without the cost of physical cluster construction or the overhead of per-job containers. Administrative access and overlay network capabilities are granted to Virtual Organizations (VOs) that choose to implement VOC technology, while the system remains completely transparent to end users and non-participating VOs. Unlike alternative systems that require explicit leases, VOCs are autonomically self-provisioned according to configurable usage policies. As a grid computing architecture, VOCs are designed to be technology agnostic and are implementable by any combination of software and services that follows the Virtual Organization Cluster Model. As demonstrated through simulation testing and evaluation of an implemented prototype, VOCs are a viable mechanism for increasing end-user job compatibility on grid sites. On existing production grids, where jobs are frequently submitted to a small subset of sites and thus experience high queuing delays relative to average job length, the grid-wide addition of VOCs does not adversely affect mean job sojourn time. By load-balancing jobs among grid sites, VOCs can reduce the total amount of queuing on a grid to a level sufficient to counteract the performance overhead introduced by virtualization.
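
    A minimal sketch of the autonomic self-provisioning policy described above might look as follows; the scale-up step, the gentle scale-down, and all names are assumptions for illustration rather than the VOC prototype's actual rules.

```python
# Toy policy for sizing a Virtual Organization's overlay cluster from
# its job queue, within a configurable cap (the "usage policy").

def target_vm_count(pending_jobs, running_vms, policy_max, scale_step=2):
    """Grow while jobs queue, shrink when idle, never exceed the cap."""
    if pending_jobs > 0:
        return min(running_vms + scale_step, policy_max)
    if running_vms > 0:
        return running_vms - 1                # gentle scale-down
    return running_vms

# Watchdog tick: 5 queued jobs, 3 VMs running, VO capped at 8 VMs.
print(target_vm_count(pending_jobs=5, running_vms=3, policy_max=8))  # 5
```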

    COSCO: container orchestration using co-simulation and gradient based optimization for fog computing environments

    Intelligent placement and management of tasks in large-scale fog platforms is challenging due to the highly volatile nature of modern workload applications and sensitive user requirements of low energy consumption and response time. Container orchestration platforms have emerged to alleviate this problem, with prior art either using heuristics to reach scheduling decisions quickly, or AI-driven methods such as reinforcement learning and evolutionary approaches to adapt to dynamic scenarios. The former often fail to adapt quickly in highly dynamic environments, whereas the latter have run-times slow enough to negatively impact response time. Therefore, there is a need for scheduling policies that both react quickly enough to work efficiently in volatile environments and have low scheduling overheads. To achieve this, we propose a Gradient Based Optimization Strategy using Back-propagation of gradients with respect to Input (GOBI). Further, we leverage the accuracy of predictive digital-twin models and simulation capabilities by developing a Coupled Simulation and Container Orchestration Framework (COSCO). Using this, we create a hybrid simulation-driven decision approach, GOBI*, to optimize Quality of Service (QoS) parameters. Co-simulation and the back-propagation approach allow these methods to adapt quickly in volatile environments. Experiments conducted using real-world data on fog applications with the GOBI and GOBI* methods show significant improvements in energy consumption, response time, Service Level Objective and scheduling time of up to 15, 40, 4, and 82 percent, respectively, when compared to state-of-the-art algorithms.
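
    The core GOBI move, descending the gradient of a differentiable QoS surrogate with respect to the scheduling input itself, can be sketched as below; a hand-set quadratic stands in for the learned neural model, and the step size and placement encoding are assumptions.

```python
import numpy as np

# Sketch of gradient-based optimization w.r.t. the *input*: treat the
# placement vector x as the decision variable and descend the gradient
# of a differentiable QoS cost model (here a fixed quadratic surrogate).

A = np.array([[2.0, 0.5], [0.5, 1.0]])   # assumed interference matrix
b = np.array([-2.0, -1.0])               # assumed per-host affinity

def qos_cost(x):                          # lower is better
    return x @ A @ x + b @ x

def grad(x):                              # analytic d(cost)/d(input)
    return (A + A.T) @ x + b

x = np.array([0.5, 0.5])                  # initial placement fractions
for _ in range(100):
    x = x - 0.05 * grad(x)                # backprop-to-input step
    x = np.clip(x, 0.0, 1.0)              # keep a valid placement

print(np.round(x, 3), round(qos_cost(x), 4))
```

    With a trained neural surrogate the gradient would come from automatic differentiation rather than a closed form, but the descent loop is the same.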

    Utility-based Allocation of Resources to Virtual Machines in Cloud Computing

    In recent years, cloud computing has gained widespread use as a new computing model that offers elastic resources on demand, in a pay-as-you-go fashion. One important goal of a cloud provider is the dynamic allocation of Virtual Machines (VMs) according to workload changes, in order to keep application performance at Service Level Agreement (SLA) levels while reducing resource costs. The problem is to find an adequate trade-off between the two conflicting objectives of application performance and resource costs. In this dissertation, resource allocation solutions for this trade-off are proposed by expressing application performance and resource costs in a utility function. The proposed solutions allocate VM resources at the global data center level and at the local physical machine level by optimizing the utility function. The utility function, given as the difference between performance and costs, represents the profit of the cloud provider and offers the possibility to capture the performance-cost trade-off in a flexible and natural way. For global-level resource allocation, a two-tier resource management solution is developed. In the first tier, local node controllers dynamically allocate resource shares to VMs so as to maximize a local node utility function. In the second tier, a global controller makes VM live-migration decisions in order to maximize a global utility function. Experimental results show that optimizing the global utility function by changing the number of physical nodes according to workload maintains performance at acceptable levels while reducing costs. To allocate multiple resources at the local physical machine level, a solution based on feedback control theory and utility function optimization is proposed, which dynamically allocates shares of multiple VM resources such as CPU, memory, disk and network I/O bandwidth. To address the complex non-linearities that exist in shared virtualized infrastructures between VM performance and resource allocations, a solution is proposed that allocates VM resources to optimize a utility function based on application performance and power modelling. An Artificial Neural Network (ANN) is used to build an online model of the relationships between VM resource allocations and application performance, and another between VM resource allocations and physical machine power. To cope with long utility optimization times as the number of VMs grows, a distributed resource manager is proposed. It consists of several ANNs, each responsible for modelling and resource allocation of one VM, while exchanging information with other ANNs to coordinate resource allocations. Experiments, in simulated and realistic environments, show that the distributed ANN resource manager achieves better performance-power trade-offs than a centralized version and a distributed non-coordinated resource manager. To deal with the difficulty of building an accurate online application model and the long model adaptation time, a model-free resource management solution based on fuzzy control is proposed. It optimizes a utility function through a hill-climbing search heuristic implemented as fuzzy rules. To cope with long utility optimization times as the number of VMs grows, a multi-agent fuzzy controller is developed in which each agent, in parallel with the others, optimizes its own local utility function.
    The fuzzy control approach eliminates the need to build a model beforehand and provides a robust solution even with noisy measurements. Experimental results show that the multi-agent fuzzy controller performs better in terms of utility value than a centralized fuzzy control version and a state-of-the-art adaptive optimal control approach, especially as the number of VMs grows. Finally, to address some of the problems of reactive VM resource allocation approaches, a proactive resource allocation solution is proposed. This approach decides on VM resource allocations based on resource demand prediction, using a machine learning technique called Support Vector Machine (SVM). To deal with interdependencies between VMs of the same multi-tier application, cross-correlation demand prediction over the resource usage time series of all the application's VMs is applied. As experiments show, this results in improved prediction accuracy and application performance.
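
    As a toy rendering of the recurring utility trade-off in this work, the sketch below hill-climbs a single VM's CPU share to maximize utility = performance - cost; the response curve, price, and step size are invented for the demo, echoing the hill-climbing heuristic the fuzzy rules implement.

```python
# Toy utility maximization: performance has diminishing returns in the
# CPU share, cost is a linear resource price; greedy hill climbing
# stops at a local optimum of utility = performance - cost.

def performance(cpu_share):               # assumed response curve, in $
    return 100 * (1 - 0.5 ** (cpu_share / 0.2))

def cost(cpu_share, price_per_share=40):  # assumed linear price, in $
    return price_per_share * cpu_share

def utility(cpu_share):
    return performance(cpu_share) - cost(cpu_share)

share, step = 0.5, 0.05
while True:                               # greedy hill climbing
    up, down = utility(share + step), utility(share - step)
    best = max(up, down, utility(share))
    if best == up:
        share += step
    elif best == down:
        share -= step
    else:
        break                             # local optimum reached
print(round(share, 2), round(utility(share), 1))   # e.g. 0.6  63.5
```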

    Defining Service Level Agreements in Serverless Computing

    The emergence of serverless computing has brought significant advancements to the delivery of computing resources to cloud users. With the abstraction of infrastructure, ecosystem, and execution environments, users can focus on their code while relying on the cloud provider to manage the abstracted layers. In addition, desirable features such as autoscaling and high availability become a provider's responsibility and can be adopted by the user's application at no extra overhead. Despite such advancements, significant challenges must be overcome as applications transition from monolithic stand-alone deployments to the ephemeral and stateless microservice model of serverless computing. These challenges pertain to the uniqueness of the conceptual and implementation models of serverless computing. One of the notable challenges is the complexity of defining Service Level Agreements (SLAs) for serverless functions. As the serverless model shifts the administration of resources, ecosystem, and execution layers to the provider, users become mere consumers of the provider's abstracted platform with no insight into its performance. Suboptimal conditions of the abstracted layers are not visible to the end user, who has no means to assess their performance. Thus, SLAs in serverless computing must take into consideration the unique abstraction of the model. This work investigates SLA modeling of serverless functions' and serverless chains' executions. We highlight how serverless SLAs fundamentally differ from those of earlier cloud delivery models. We then propose an approach to define SLAs for serverless functions by utilizing resource utilization fingerprints of functions' executions, together with a method to assess whether executions adhere to that SLA. We evaluate the approach's accuracy in detecting SLA violations for a broad range of serverless application categories. Our validation results illustrate high accuracy in detecting SLA violations resulting from resource contention and degradations of the provider's ecosystem. We conclude by presenting the empirical validation of our proposed approach, which could detect Execution-SLA violations with accuracy of up to 99%.
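
    One plausible reading of the fingerprint idea is sketched below: summarize healthy executions as per-metric mean/stdev and flag runs that drift too far; the metric names, the 3-sigma rule, and all numbers are assumptions, not the paper's method.

```python
import statistics

# Sketch: build a resource-utilization "fingerprint" from healthy runs
# of a serverless function, then flag executions that deviate from it.

def fingerprint(runs):
    """runs: list of dicts like {'cpu': ..., 'mem': ..., 'io': ...}."""
    fp = {}
    for metric in runs[0]:
        vals = [r[metric] for r in runs]
        fp[metric] = (statistics.mean(vals), statistics.stdev(vals))
    return fp

def violates_sla(execution, fp, k=3.0):
    """True if any metric is more than k stdevs from the fingerprint."""
    return any(abs(execution[m] - mu) > k * max(sigma, 1e-9)
               for m, (mu, sigma) in fp.items())

healthy = [{"cpu": 0.42, "mem": 120, "io": 5.0},
           {"cpu": 0.45, "mem": 118, "io": 5.2},
           {"cpu": 0.40, "mem": 122, "io": 4.9}]
fp = fingerprint(healthy)
# A contended run with abnormal CPU time is flagged:
print(violates_sla({"cpu": 0.95, "mem": 121, "io": 5.1}, fp))  # True
```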

    Control over the Cloud : Offloading, Elastic Computing, and Predictive Control

    The thesis studies the use of cloud-native software and platforms to implement critical closed-loop control. It considers technologies that provide low-latency and reliable wireless communication, in terms of edge clouds and massive MIMO, but also approaches industrial IoT and the services of a distributed cloud as an extension of commercial off-the-shelf software and systems. First, the thesis defines the cloud control challenge, as control over the cloud and controller offloading. This is followed by a demonstration of closed-loop control, using MPC, running on a testbed representing the distributed cloud. The testbed is implemented using an IoT device, clouds, next-generation wireless technology, and a distributed execution platform. Platform details are provided and the feasibility of the approach is shown. The evaluation includes relocating an online MPC to various locations in the distributed cloud. Offloaded control is examined next, through further evaluation of cloud-native software and frameworks. This is followed by three controller designs tailored for use with the cloud. The first controller solves MPC problems in parallel to implement a variable-horizon controller. The second is a hierarchical design, in which rate switching is used to implement constrained control, with a local and a remote mode. The third design focuses on reliability. Here, the MPC problem is extended to include recovery paths that represent a fallback mode, used by a control client if it experiences connectivity issues. An implementation is detailed and examined. In the final part of the thesis, the focus is on latency and congestion. A cloud control client can experience long and variable delays, from the network and from computations, and the services used can become overloaded. These problems are approached by using predicted control inputs, dynamically adjusting the control frequency, and horizontally scaling the cloud service. Several examples are shown through simulation and on real clouds, including admitting control clients into a cluster that becomes temporarily overloaded.
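
    The predicted-control-inputs idea in the final part can be illustrated as follows: the remote solver returns a whole horizon of inputs, and the client keeps consuming the buffered plan when a reply is late. The double-integrator model and the least-squares stand-in for MPC are simplifications, not the thesis code.

```python
import numpy as np

# Masking cloud latency with predicted inputs: the "cloud" plans a
# whole input horizon; the client falls back on the buffered plan
# whenever the next reply is delayed.

A = np.array([[1.0, 0.1], [0.0, 1.0]])    # discrete double integrator
B = np.array([[0.005], [0.1]])

def plan(x0, horizon=10):
    """Least-squares input sequence driving the state toward zero
    (a crude stand-in for a proper constrained MPC solve)."""
    # x_H = A^H x0 + sum_i A^(H-1-i) B u_i; solve for u minimizing |x_H|.
    M = np.hstack([np.linalg.matrix_power(A, horizon - 1 - i) @ B
                   for i in range(horizon)])
    target = -np.linalg.matrix_power(A, horizon) @ x0
    u, *_ = np.linalg.lstsq(M, target, rcond=None)
    return list(u)                          # buffered plan of inputs

x, buffer = np.array([1.0, 0.0]), []
for step in range(30):
    reply_on_time = (step % 3 == 0)         # pretend 2 of 3 replies lag
    if reply_on_time or not buffer:
        buffer = plan(x)                    # fresh plan from the cloud
    u = buffer.pop(0)                       # else consume predicted input
    x = A @ x + B.flatten() * u
print(np.round(x, 3))                       # state regulated near zero
```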

    Energy and performance-aware scheduling and shut-down models for efficient cloud-computing data centers.

    This Doctoral Dissertation, presented as a set of research contributions, focuses on resource efficiency in data centers. The topic has been addressed mainly through the development of several energy-efficiency, resource-management and scheduling policies, as well as the simulation tools required to test them in realistic cloud computing environments. Several models have been implemented in order to minimize energy consumption in Cloud Computing environments, among them: a) fifteen probabilistic and deterministic energy policies that shut down idle machines; b) five energy-aware scheduling algorithms, including several genetic-algorithm models; c) a Stackelberg game-based strategy that models the competition between opposing requirements of Cloud Computing systems in order to dynamically apply the most suitable scheduling algorithms and energy-efficiency policies depending on the environment; and d) a productive analysis of the resource efficiency of several realistic cloud-computing environments. A novel simulation tool called SCORE, able to simulate several data-center sizes, machine heterogeneity, security levels, workload composition and patterns, scheduling strategies and energy-efficiency strategies, was developed in order to test these strategies in large-scale cloud-computing clusters. SCORE is open source and also simulates three centralized resource managers: Monolithic, Two-level and Shared-state. As results, more than fifty Key Performance Indicators (KPIs), covering overall performance, task scheduling and energy, show that more than 20% of energy consumption can be reduced in realistic high-utilization environments when proper policies are employed.
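
    As a flavor of the probabilistic shut-down policies counted above, the sketch below powers off machines that have been idle past a timeout with some probability, keeping the rest as warm spares; the timeout, probability, and machine model are illustrative assumptions, not one of the dissertation's fifteen policies.

```python
import random

# Toy probabilistic shut-down policy: after an idle timeout, power a
# machine off with probability p, hedging against imminent arrivals.

def tick(machines, idle_timeout=300, p_shutdown=0.6, now=0):
    """machines: list of dicts {'id', 'busy', 'idle_since', 'state'},
    with times in seconds; mutates and returns the fleet."""
    for m in machines:
        if not m["busy"] and (now - m["idle_since"]) > idle_timeout:
            if random.random() < p_shutdown:
                m["state"] = "off"            # reclaim idle energy
            else:
                m["state"] = "on"             # keep as a warm spare
    return machines

random.seed(1)
fleet = [{"id": i, "busy": False, "idle_since": 0, "state": "on"}
         for i in range(5)]
print([m["state"] for m in tick(fleet, now=400)])
```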

    Modeling and characterizing service interference in dynamic infrastructures

    Performance interference can occur when various services are executed over the same physical infrastructure in a cloud system, leading to performance degradation compared to executing the services in isolation. This work proposes a Confirmatory Factor Analysis (CFA)-based model to estimate performance interference across containers, caused by the use of CPU, memory and IO across a number of co-hosted applications. The approach characterizes resources through human-comprehensible indices expressed as time series, so that interference can be analyzed over the entire execution lifetime of a service. Our experiments, based on combinations of real services with different profiles executed in Docker containers, suggest that the model can accurately predict overall execution time for different service combinations. The approach can be used by a service designer to identify phases, during the execution life-cycle of a service, that are likely to lead to a greater degree of interference, and to ensure that only complementary services are hosted on the same physical machine. Interference-awareness of this kind will enable more intelligent resource management and scheduling for cloud systems, and may be used to dynamically modify scheduling decisions.
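
    A rough sketch of the index idea follows: per-resource utilization time series are collapsed into one interference index per time step via factor loadings, which the CFA model would estimate from data; here the loadings and threshold are set by hand purely for illustration.

```python
# Sketch: collapse CPU/memory/IO utilization time series into a single
# interference index per time step, then flag risky execution phases.

LOADINGS = {"cpu": 0.5, "mem": 0.2, "io": 0.3}   # assumed, not fitted

def interference_index(series):
    """series: dict metric -> list of utilizations per time step."""
    steps = len(next(iter(series.values())))
    return [sum(LOADINGS[m] * series[m][t] for m in LOADINGS)
            for t in range(steps)]

svc_a = {"cpu": [0.9, 0.9, 0.2], "mem": [0.3, 0.4, 0.3],
         "io":  [0.1, 0.8, 0.1]}
idx = interference_index(svc_a)
risky_phases = [t for t, v in enumerate(idx) if v > 0.6]
print([round(v, 2) for v in idx], risky_phases)  # [0.54, 0.77, 0.19] [1]
```

    A scheduler could then avoid co-hosting two services whose risky phases overlap in time, which is the complementarity the abstract describes.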

    Energy Saving in QoS Fog-supported Data Centers

    One of the most important challenges cloud providers face amid the explosive growth of data is reducing the energy consumption of their modern data centers. The majority of current research focuses on energy-efficient resource management in the Infrastructure-as-a-Service (IaaS) model through resource virtualization, i.e., the consolidation of virtual machines onto physical machines. However, today's virtualized data centers do not support communication- and computing-intensive real-time applications or big data stream computing (info-mobility applications, real-time video co-decoding). Indeed, imposing hard limits on the overall per-job computing-plus-communication delay forces the networked computing infrastructure to quickly adapt its resource utilization to the (possibly unpredictable and abrupt) time fluctuations of the offered workload. Recently, Fog computing centers have emerged as promising commodities in the Internet virtual computing platform, but they raise energy consumption and create critical issues for such platforms. It is therefore desirable to devise green solutions (i.e., energy-aware provisioning) that cover fog-supported delay-sensitive web applications, together with traffic-engineering methods that dynamically keep the number of active servers matched to the current workload. Hence, a flexible, reliable technological paradigm and resource allocation algorithm are needed that explicitly account for the consumed energy. Such algorithms should automatically adapt to time-varying workloads, and support joint reconfiguration and orchestration of the virtualized computing-plus-communication resources available at the computing nodes, while enabling IoT devices to operate under real-time constraints on the allowed computing-plus-communication delay and service latency. The purpose of this thesis is: i) to propose a novel technological paradigm, the Fog of Everything (FoE) paradigm, detailing the main building blocks and services of the corresponding technological platform and protocol stack; ii) to propose a dynamic and adaptive energy-aware algorithm that models and manages the Fog Nodes (FNs) of virtualized networked data centers, so as to minimize the resulting networking-plus-computing average energy consumption; and iii) to propose a novel Software-as-a-Service (SaaS) Fog computing platform to integrate user applications over the FoE. The emerging use of SaaS Fog computing centers as an Internet virtual computing commodity is meant to support delay-sensitive applications. The virtualized Fog node operates at the Middleware layer of the underlying protocol stack and comprises the following main blocks: i) admission control of the offered input traffic; ii) balanced control and dispatching of the admitted workload; iii) dynamic reconfiguration and consolidation of the Dynamic Voltage and Frequency Scaling (DVFS)-enabled Virtual Machines (VMs) instantiated on the parallel computing platform; and iv) rate control of the traffic injected into the TCP/IP connection.
    The salient features of this algorithm are that: i) it is adaptive and admits a distributed, scalable implementation; ii) it is capable of providing hard QoS guarantees, in terms of the minimum/maximum instantaneous rate of the traffic delivered to the client, instantaneous goodput, and total processing delay; and iii) it explicitly accounts for the dynamic interaction between computing and networking resources in order to maximize the resulting energy efficiency. The actual performance of the proposed scheduler, in the presence of: i) client mobility; ii) wireless fading; iii) reconfiguration and two-threshold consolidation costs of the underlying networked computing platform; and iv) abrupt changes in the transport quality of the available TCP/IP mobile connection, is numerically tested and compared against corresponding state-of-the-art static schedulers, under both synthetically generated and measured real-world workload traces.
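
    The DVFS-centered energy/delay trade-off behind block iii) and the hard delay limits can be illustrated with a back-of-envelope rule: since dynamic power grows superlinearly with frequency, pick the lowest frequency that still meets the per-job computing-plus-communication deadline. All numbers below are invented for the example.

```python
# Back-of-envelope DVFS choice: dynamic power scales roughly with f^3
# (energy per cycle with f^2), so the greenest feasible setting is the
# lowest frequency whose compute time plus network delay fits the
# hard per-job deadline.

def pick_frequency(cycles, deadline_s, network_delay_s, freqs_ghz):
    """Return the lowest frequency (GHz) meeting the hard delay limit."""
    for f in sorted(freqs_ghz):
        compute_s = cycles / (f * 1e9)
        if compute_s + network_delay_s <= deadline_s:
            return f
    return None                       # infeasible: reject at admission

# 2e9 cycles, 1.6 s end-to-end budget, 0.4 s of network delay:
print(pick_frequency(2e9, 1.6, 0.4, [1.0, 1.4, 1.8, 2.2]))  # 1.8
```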