23 research outputs found

    Towards Autonomic Service Provisioning Systems

    This paper discusses our experience in building SPIRE, an autonomic system for service provision. The architecture consists of a set of hosted Web Services subject to QoS constraints, and a certain number of servers used to run session-based traffic. Customers pay for having their jobs run, but require in turn certain quality guarantees: there are different SLAs specifying charges for running jobs and penalties for failing to meet promised performance metrics. The system is driven by a utility function that aims to optimize the average revenue earned per unit time. Demand and performance statistics are collected, while traffic parameters are estimated in order to make dynamic decisions concerning server allocation and admission control. Different utility functions are introduced, and a number of experiments testing their performance are discussed. Results show that revenues can be dramatically improved by imposing suitable conditions for accepting incoming traffic; the proposed system performs well under different traffic settings, and it successfully adapts to changes in the operating environment. Comment: 11 pages, 9 figures, http://www.wipo.int/pctdb/en/wo.jsp?WO=201002636

    Quantifying the Benefits of Resource Multiplexing in On-Demand Data Centers

    On-demand data centers host multiple applications on server farms by dynamically provisioning resources in response to workload variations. The efficiency of such dynamic provisioning, in terms of the required server farm capacity, depends on several factors: the granularity and frequency of reallocation, the number of applications being hosted, the amount of resource overprovisioning, and the accuracy of workload prediction. In this paper, we quantify the effect of these factors on the multiplexing benefits achievable in an on-demand data center. Using traces of real e-commerce workloads, we demonstrate that the ability to allocate fractional server resources at fine time-scales of tens of seconds to a few minutes can increase the multiplexing benefits by 162-188% over coarse-grained reallocation. Our results also show that these benefits increase in the presence of a large number of hosted applications as a result of a high level of multiplexing. In addition, we demonstrate that such fine-grained multiplexing is achievable even in the presence of real-world (inaccurate) workload predictors and allows overprovisioning slack of nearly 35-70% over coarse-grained multiplexing.
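
The basic idea of multiplexing in the abstract above can be illustrated with a minimal sketch. It is not the paper's methodology: the demand traces are made up, and "saving" is defined here, as an assumption, as the fraction of capacity avoided when a shared pool is sized for the peak of the aggregate demand instead of the sum of the per-application peaks.

```python
def static_capacity(traces):
    """Capacity needed when each application is provisioned for its own peak."""
    return sum(max(trace) for trace in traces)


def multiplexed_capacity(traces):
    """Capacity needed when one shared pool is reallocated every interval."""
    # zip(*traces) groups the demands of all applications per time interval.
    return max(sum(interval) for interval in zip(*traces))


def multiplexing_saving(traces):
    """Fraction of static capacity avoided by sharing (assumed metric)."""
    static = static_capacity(traces)
    return (static - multiplexed_capacity(traces)) / static


# Hypothetical demand, in server units, for three applications over four intervals.
traces = [
    [2.0, 8.0, 3.0, 1.0],
    [7.0, 2.0, 2.0, 6.0],
    [1.0, 3.0, 9.0, 2.0],
]

if __name__ == "__main__":
    print(static_capacity(traces), multiplexed_capacity(traces))
    print(round(multiplexing_saving(traces), 3))
```

Because the demands are fractional, the sketch implicitly assumes that fractional server resources can be allocated; with whole-server granularity each per-interval demand would have to be rounded up before summing.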

    A Rapid Testing Framework for a Mobile Cloud Infrastructure

    Mobile clouds such as network-connected vehicles and satellite clusters are an emerging class of systems that extend traditional real-time embedded systems: they provide long-term mission platforms made up of dynamic clusters of heterogeneous hardware nodes communicating over ad hoc wireless networks. Besides the inherent complexities entailed by a distributed architecture, developing and testing software for these systems is difficult for a number of other reasons, including the mobile nature of such systems, which can require a model of the physical dynamics of the system for accurate simulation and testing. This paper describes a rapid development and testing framework for a distributed satellite system. Our solutions include a modeling language for configuring and specifying an application’s interaction with the middleware layer, a physics simulator integrated with hardware in the loop to provide the system’s physical dynamics, and the integration of a network traffic tool to dynamically vary the network bandwidth based on the physical dynamics.

    PseudoApp: Performance Prediction for Application Migration to Cloud

    To migrate an existing application to the cloud, a user needs to estimate and compare the performance and resource consumption of the application running in different clouds, in order to select the best service provider and the right virtual machine size. However, it is prohibitively expensive to install a complex application in multiple new environments solely for the purpose of performance benchmarking. Performance modeling is more practical, but its accuracy is limited by system factors that are hard to model. We propose a new technique called PseudoApp to address these challenges. Our solution creates a pseudo-application to mimic the resource consumption of a real application. A pseudo-application runs the same set of distributed components and executes the same sequence of system calls as those of the real application. By benchmarking a simple and easy-to-install PseudoApp in different cloud environments, a user can accurately obtain the performance and resource consumption of the real application. We apply PseudoApp to Apache and TPC-W and find that PseudoApp accurately predicts their performance with 2-8% error in throughput.

    Provisioning multi-tier cloud applications using statistical bounds on sojourn time

    In this paper we present a simple and effective approach for resource provisioning to achieve a percentile bound on the end-to-end response time of a multi-tier application. We first model the multi-tier application as an open tandem network of M/G/1-PS queues and develop a method that produces a near-optimal application configuration, i.e., the number of servers at each tier, to meet the percentile bound in a homogeneous server environment – using a single type of server. We then extend our solution to a K-server case, and our technique demonstrates good accuracy, independent of the variability of service times. Our approach demonstrates a provisioning error of no more than 3% compared to a 140% worst-case provisioning error obtained by techniques based on an M/M/1-FCFS queue model. In addition, we extend our approach to handle a heterogeneous server environment, i.e., with multiple types of servers. We find that fewer high-capacity servers are preferable for high-percentile provisioning. Finally, we extend our approach to account for the rental cost of each server type and compute a cost-efficient application configuration with savings of over 80%. We demonstrate the applicability of our approach in a real-world system by employing it to provision the two tiers of the Java implementation of TPC-W – a multi-tier transactional web benchmark that represents an e-commerce web application, i.e., an online bookstore.
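
For orientation, the M/M/1-FCFS baseline that the abstract above compares against can be sketched in a few lines. This is not the paper's M/G/1-PS method. It covers a single tier only, and it assumes that Poisson arrivals are split evenly and at random across identical servers, so that each server behaves as an independent M/M/1-FCFS queue whose sojourn time is exponentially distributed with rate (service rate minus arrival rate). The function names and the numbers are illustrative.

```python
import math


def mm1_sojourn_percentile(arrival_rate, service_rate, p):
    """p-th percentile (0 < p < 1) of sojourn time in a stable M/M/1-FCFS queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate >= service_rate")
    # Sojourn time is exponential with rate (service_rate - arrival_rate).
    return -math.log(1.0 - p) / (service_rate - arrival_rate)


def min_servers(total_arrival_rate, service_rate, p, target):
    """Smallest number of servers whose per-server p-th percentile meets target."""
    # Each server needs (service_rate - per_server_arrival_rate) >= required_gap.
    required_gap = -math.log(1.0 - p) / target
    headroom = service_rate - required_gap
    if headroom <= 0:
        raise ValueError("target is unreachable even for an idle server")
    return max(1, math.ceil(total_arrival_rate / headroom))


if __name__ == "__main__":
    # Hypothetical tier: 90 requests/s in total, 20 requests/s per server,
    # 95th percentile of response time at most 0.5 s.
    n = min_servers(90.0, 20.0, 0.95, 0.5)
    print(n, mm1_sojourn_percentile(90.0 / n, 20.0, 0.95))
```

The exponential assumption on service times is exactly what the abstract identifies as the source of large provisioning errors when service times are more variable, so the sketch is useful only as a point of comparison.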

    Dynamic Resource Provisioning for an Interactive System

    In a data centre, server clusters are typically used to supply the processing capacity needed to give interactive applications acceptable response time performance. The workload of each application may be time-varying. Static allocation to meet peak demand is not an efficient usage of resources. Dynamic resource allocation, on the other hand, can result in efficient resource utilization while meeting the performance goals of individual applications. In this thesis, we develop a new interactive system model where the number of logon users changes over time. Our objective is to obtain results that can be used to guide dynamic resource allocation decisions. We obtain approximate analytic results for the response time distribution at steady state for our model. Using numerical examples, we show that these results are acceptable in terms of estimating the steady state probabilities of the number of logon users. We also show by comparison with simulation that our results are acceptable in estimating the response time distribution under a variety of dynamic resource allocation scenarios. More importantly, we show that our results are accurate in terms of predicting the minimum number of processor nodes required to meet the performance goal of an interactive application. Such information is valuable to resource provisioning, and we discuss how our results can be used to guide dynamic resource allocation decisions.

    Enabling virtualization technologies for enhanced cloud computing

    Cloud Computing is a ubiquitous technology that offers various services for individual users, small businesses, as well as large-scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources and services that would otherwise be unavailable to them. At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs give cloud owners such essential features as live migration, which is the process of moving a VM from one PM to another while the VM is running. Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, parallel computing, etc. depend heavily on virtualization technologies. Improvements and breakthroughs in these technologies directly lead to the introduction of new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real-world benchmarks. Specifically, the issues of server load prediction, VM consolidation, live migration, and memory sharing are addressed. First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy.
    Experimental evaluations with a prototype on real-world data-center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as little as 3%. Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers required time by up to 40% for live migration on various real-world loads. Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Along with predictions on network and memory, this approach allows VMs to run applications with memory requirements much higher than physically available locally. It is experimentally shown that ACE-M reduces the memory performance degradation by about 75% and achieves a 40% lower network response time for memory-intensive VMs. A combination of these innovations to the virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.

    Model-Based Dynamic Resource Management for Service Oriented Clouds

    Get PDF
    Cloud computing is a flexible platform for software as a service, as more and more applications are deployed on the cloud. Major challenges in the cloud include how to characterize the workload of the applications and how to manage the cloud resources efficiently by sharing them among many applications. The current state of the art considers a simplified model of the system, either ignoring the software components altogether or ignoring the relationship between individual software services. This thesis considers the following resource management problems for cloud-based service providers: (i) how to estimate the parameters of the current workload, (ii) how to meet Quality of Service (QoS) targets while minimizing infrastructure cost, (iii) how to allocate resources considering performance costs of virtual machine reconfigurations. To address the above problems, we propose a model-based feedback loop approach. The cloud infrastructure, the services, and the applications are modelled using Layered Queuing Models (LQM). These models are then optimized. Mathematical techniques are used to reduce the complexity of the models and address the scalability issues. The main contributions of this thesis are: (i) Extended Kalman Filter (EKF) based techniques improved by dynamic clustering for scalable estimation of workload parameters, (ii) a combination of adaptive empirical models (tuned during runtime) and stepwise optimizations for improving the overall allocation performance, (iii) dynamic service placement algorithms that consider the cost of virtual machine reconfiguration.