333 research outputs found

    Prediction-based VM provisioning and admission control for multi-tier web applications

    Get PDF

    Performance-oriented Cloud Provisioning: Taxonomy and Survey

    Full text link
    Cloud computing is being viewed as the technology of today and the future. Through this paradigm, the customers gain access to shared computing resources located in remote data centers that are hosted by cloud providers (CP). This technology allows for provisioning of various resources such as virtual machines (VM), physical machines, processors, memory, network, storage and software as per the needs of customers. Application providers (AP), who are customers of the CP, deploy applications on the cloud infrastructure and then these applications are used by the end-users. To meet the fluctuating application workload demands, dynamic provisioning is essential and this article provides a detailed literature survey of dynamic provisioning within cloud systems with focus on application performance. The well-known types of provisioning and the associated problems are clearly and pictorially explained and the provisioning terminology is clarified. A very detailed and general cloud provisioning classification is presented, which views provisioning from different perspectives, aiding in understanding the process inside-out. Cloud dynamic provisioning is explained by considering resources, stakeholders, techniques, technologies, algorithms, problems, goals and more.Comment: 14 pages, 3 figures, 3 table

    Autonomic management of virtualized resources in cloud computing

    Get PDF
    The last five years have witnessed a rapid growth of cloud computing in business, governmental and educational IT deployment. The success of cloud services depends critically on the effective management of virtualized resources. A key requirement of cloud management is the ability to dynamically match resource allocations to actual demands, To this end, we aim to design and implement a cloud resource management mechanism that manages underlying complexity, automates resource provisioning and controls client-perceived quality of service (QoS) while still achieving resource efficiency. The design of an automatic resource management centers on two questions: when to adjust resource allocations and how much to adjust. In a cloud, applications have different definitions on capacity and cloud dynamics makes it difficult to determine a static resource to performance relationship. In this dissertation, we have proposed a generic metric that measures application capacity, designed model-independent and adaptive approaches to manage resources and built a cloud management system scalable to a cluster of machines. To understand web system capacity, we propose to use a metric of productivity index (PI), which is defined as the ratio of yield to cost, to measure the system processing capability online. PI is a generic concept that can be applied to different levels to monitor system progress in order to identify if more capacity is needed. We applied the concept of PI to the problem of overload prevention in multi-tier websites. The overload predictor built on the PI metric shows more accurate and responsive overload prevention compared to conventional approaches. To address the issue of the lack of accurate server model, we propose a model-independent fuzzy control based approach for CPU allocation. For adaptive and stable control performance, we embed the controller with self-tuning output amplification and flexible rule selection. Finally, we build a QoS provisioning framework that supports multi-objective QoS control and service differentiation. Experiments on a virtual cluster with two service classes show the effectiveness of our approach in both performance and power control. To address the problems of complex interplay between resources and process delays in fine-grained multi-resource allocation, we consider capacity management as a decision-making problem and employ reinforcement learning (RL) to optimize the process. The optimization depends on the trial-and-error interactions with the cloud system. In order to improve the initial management performance, we propose a model-based RL algorithm. The neural network based environment model, which is learned from previous management history, generates simulated resource allocations for the RL agent. Experiment results on heterogeneous applications show that our approach makes efficient use of limited interactions and find near optimal resource configurations within 7 steps. Finally, we present a distributed reinforcement learning approach to the cluster-wide cloud resource management. We decompose the cluster-wide resource allocation problem into sub-problems concerning individual VM resource configurations. The cluster-wide allocation is optimized if individual VMs meet their SLA with a high resource utilization. For scalability, we develop an efficient reinforcement learning approach with continuous state space. For adaptability, we use VM low-level runtime statistics to accommodate workload dynamics. Prototyped in a iBalloon system, the distributed learning approach successfully manages 128 VMs on a 16-node close correlated cluster

    DEPAS: A Decentralized Probabilistic Algorithm for Auto-Scaling

    Full text link
    The dynamic provisioning of virtualized resources offered by cloud computing infrastructures allows applications deployed in a cloud environment to automatically increase and decrease the amount of used resources. This capability is called auto-scaling and its main purpose is to automatically adjust the scale of the system that is running the application to satisfy the varying workload with minimum resource utilization. The need for auto-scaling is particularly important during workload peaks, in which applications may need to scale up to extremely large-scale systems. Both the research community and the main cloud providers have already developed auto-scaling solutions. However, most research solutions are centralized and not suitable for managing large-scale systems, moreover cloud providers' solutions are bound to the limitations of a specific provider in terms of resource prices, availability, reliability, and connectivity. In this paper we propose DEPAS, a decentralized probabilistic auto-scaling algorithm integrated into a P2P architecture that is cloud provider independent, thus allowing the auto-scaling of services over multiple cloud infrastructures at the same time. Our simulations, which are based on real service traces, show that our approach is capable of: (i) keeping the overall utilization of all the instantiated cloud resources in a target range, (ii) maintaining service response times close to the ones obtained using optimal centralized auto-scaling approaches.Comment: Submitted to Springer Computin

    Software-Defined Cloud Computing: Architectural Elements and Open Challenges

    Full text link
    The variety of existing cloud services creates a challenge for service providers to enforce reasonable Software Level Agreements (SLA) stating the Quality of Service (QoS) and penalties in case QoS is not achieved. To avoid such penalties at the same time that the infrastructure operates with minimum energy and resource wastage, constant monitoring and adaptation of the infrastructure is needed. We refer to Software-Defined Cloud Computing, or simply Software-Defined Clouds (SDC), as an approach for automating the process of optimal cloud configuration by extending virtualization concept to all resources in a data center. An SDC enables easy reconfiguration and adaptation of physical resources in a cloud infrastructure, to better accommodate the demand on QoS through a software that can describe and manage various aspects comprising the cloud environment. In this paper, we present an architecture for SDCs on data centers with emphasis on mobile cloud applications. We present an evaluation, showcasing the potential of SDC in two use cases-QoS-aware bandwidth allocation and bandwidth-aware, energy-efficient VM placement-and discuss the research challenges and opportunities in this emerging area.Comment: Keynote Paper, 3rd International Conference on Advances in Computing, Communications and Informatics (ICACCI 2014), September 24-27, 2014, Delhi, Indi

    Effective Resource and Workload Management in Data Centers

    Get PDF
    The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of datacenter applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the possibility to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, for achieving high application performance, and for reducing the cost of data center management.;For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm AWAIT, for handling overloading conditions in multi-tier web services. AWAIT places on hold requests of accepted sessions and refuses to admit new sessions when the system is in a sudden workload surge. to meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is dynamically determined according to the workload burstiness.;Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools cannot always lead to accurate estimates. A directed factor graph (DFG) model is defined to model the dependencies among multiple types of resources across physical and virtual layers.;Virtualized data centers always enable sharing of resources among hosted applications for achieving high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workloads patterns change over time. AppRM, an automated management system not only allocates right amount of resources to applications for their performance target but also adjusts to dynamic workloads using an adaptive model.;Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme for multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement

    A Reliable and Cost-Efficient Auto-Scaling System for Web Applications Using Heterogeneous Spot Instances

    Full text link
    Cloud providers sell their idle capacity on markets through an auction-like mechanism to increase their return on investment. The instances sold in this way are called spot instances. In spite that spot instances are usually 90% cheaper than on-demand instances, they can be terminated by provider when their bidding prices are lower than market prices. Thus, they are largely used to provision fault-tolerant applications only. In this paper, we explore how to utilize spot instances to provision web applications, which are usually considered availability-critical. The idea is to take advantage of differences in price among various types of spot instances to reach both high availability and significant cost saving. We first propose a fault-tolerant model for web applications provisioned by spot instances. Based on that, we devise novel auto-scaling polices for hourly billed cloud markets. We implemented the proposed model and policies both on a simulation testbed for repeatable validation and Amazon EC2. The experiments on the simulation testbed and the real platform against the benchmarks show that the proposed approach can greatly reduce resource cost and still achieve satisfactory Quality of Service (QoS) in terms of response time and availability

    Snapshot Provisioning of Cloud Application Stacks to Face Traffic Surges

    Get PDF
    Traffic surges, like the Slashdot effect, occur when a web application is overloaded by a huge number of requests, potentially leading to unavailability. Unfortunately, such traffic variations are generally totally unplanned, of great amplitude, within a very short period, and a variable delay to return to a normal regime. In this report, we introduce PeakForecast as an elastic middleware solution to detect and absorb a traffic surge. In particular, PeakForecast can, from a trace of queries received in the last seconds, minutes or hours, to detect if the underlying system is facing a traffic surge or not, and then estimate the future traffic using a forecast model with an acceptable precision, thereby calculating the number of resources required to absorb the remaining traffic to come. We validate our solution by experimental results demonstrating that it can provide instantaneous elasticity of resources for traffic surges observed on the Japanese version of Wikipedia during the Fukushima Daiichi nuclear disaster in March 2011.Les pics de trafic, tels que l'effet Slashdot, apparaissent lorsqu'une application web doit faire face un nombre important de requêtes qui peut potentiellement entraîner une mise hors service de l'application. Malheureusement, de telles variations de traffic sont en général totalement imprévues et d'une grande amplitude, arrivent pendant une très courte période de temps et le retour à un régime normal prend un délai variable. Dans ce rapport, nous présentons PeakForecast qui est une solution intergicielle élastique pour détecter et absorber les pics de trafic. En particulier, PeakForecast peut, à partir des traces de requêtes reçues dans les dernières secondes, minutes ou heures, détecter si le système sous-jacent fait face ou non à un pic de trafic, estimer le trafic futur en utilisant un modèle de prédiction suffisamment précis, et calculer le nombre de ressources nécessaires à l'absorption du trafic restant à venir. Nous validons notre solution avec des résultats expérimentaux qui démontrent qu'elle fournit une élasticité instantanée des ressources pour des pics de trafic qui ont été observés sur la version japonaise de Wikipedia lors de l'accident nucléaire de Fukushima Daiichi en mars 2011

    Snapshot Provisioning of Cloud Application Stacks to Face Traffic Surges

    No full text
    Traffic surges, like the Slashdot effect, occur when a web application is overloaded by a huge number of requests, potentially leading to unavailability. Unfortunately, such traffic variations are generally totally unplanned, of great amplitude, within a very short period, and a variable delay to return to a normal regime. In this report, we introduce PeakForecast as an elastic middleware solution to detect and absorb a traffic surge. In particular, PeakForecast can, from a trace of queries received in the last seconds, minutes or hours, to detect if the underlying system is facing a traffic surge or not, and then estimate the future traffic using a forecast model with an acceptable precision, thereby calculating the number of resources required to absorb the remaining traffic to come. We validate our solution by experimental results demonstrating that it can provide instantaneous elasticity of resources for traffic surges observed on the Japanese version of Wikipedia during the Fukushima Daiichi nuclear disaster in March 2011.Les pics de trafic, tels que l'effet Slashdot, apparaissent lorsqu'une application web doit faire face un nombre important de requêtes qui peut potentiellement entraîner une mise hors service de l'application. Malheureusement, de telles variations de traffic sont en général totalement imprévues et d'une grande amplitude, arrivent pendant une très courte période de temps et le retour à un régime normal prend un délai variable. Dans ce rapport, nous présentons PeakForecast qui est une solution intergicielle élastique pour détecter et absorber les pics de trafic. En particulier, PeakForecast peut, à partir des traces de requêtes reçues dans les dernières secondes, minutes ou heures, détecter si le système sous-jacent fait face ou non à un pic de trafic, estimer le trafic futur en utilisant un modèle de prédiction suffisamment précis, et calculer le nombre de ressources nécessaires à l'absorption du trafic restant à venir. Nous validons notre solution avec des résultats expérimentaux qui démontrent qu'elle fournit une élasticité instantanée des ressources pour des pics de trafic qui ont été observés sur la version japonaise de Wikipedia lors de l'accident nucléaire de Fukushima Daiichi en mars 2011
    • …
    corecore