
    Optimizing Cloud-Service Performance: Efficient Resource Provisioning Via Optimal Workload Allocation

    Cloud computing is being widely accepted and utilized in the business world. From the perspective of businesses utilizing the cloud, it is critical to meet their customers' requirements by achieving service-level objectives. Hence, the ability to accurately characterize and optimize cloud-service performance is of great importance. In this dissertation, a stochastic multi-tenant framework is proposed to model the service of customer requests in a cloud infrastructure composed of heterogeneous virtual machines (VMs). The proposed framework addresses the critical concepts and characteristics of the cloud, including virtualization, multi-tenancy, heterogeneity of VMs, VM isolation for security and/or performance guarantees, and the stochastic response time of a customer request. Two cloud-service performance metrics are mathematically characterized, namely the percentile and the mean of the stochastic response time of a customer request. Based upon the proposed multi-tenant framework, a workload-allocation algorithm, termed the max-min-cloud algorithm, is then devised to optimize the performance of the cloud service. A rigorous optimality proof of the max-min-cloud algorithm is given for the case in which the stochastic response time of a customer request is assumed to be exponentially distributed. Furthermore, extensive Monte-Carlo simulations are conducted to validate the optimality of the max-min-cloud algorithm by comparing it with two other workload-allocation algorithms under various scenarios. Next, the resource-provisioning problem in the cloud is studied in light of the max-min-cloud algorithm. In particular, an efficient resource-provisioning strategy, termed the MPC strategy, is proposed for serving dynamically arriving customer requests. The efficacy of the MPC strategy is verified through two practical cases in which the arrival of customer requests is predictable and unpredictable, respectively. As an extension of the max-min-cloud algorithm, we further devise the max-load-first algorithm to deal with the VM-placement problem in the cloud. Monte-Carlo simulation results show that the max-load-first VM-placement algorithm outperforms two other heuristic algorithms in terms of reducing the mean stochastic completion time of a group of arbitrary customers' requests. Simulation results also provide insight into how the initial loads of servers affect the performance of the cloud system. In summary, the findings of this dissertation can be of great benefit to both service providers (namely, business owners) and cloud providers. For business owners, the max-min-cloud workload-allocation algorithm and the MPC resource-provisioning strategy together can be used to help them build a better understanding of how many virtual resources they may need in the cloud to meet customers' expectations subject to cost constraints. For cloud providers, the max-load-first VM-placement algorithm can be used to optimize the computational performance of the service by appropriately utilizing the physical machines and efficiently placing the VMs in their cloud infrastructures.
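    The abstract does not spell out the allocation rule, so the following is only a minimal sketch of one plausible reading of a max-min workload-allocation policy: each unit of work goes to the VM with the largest remaining capacity, which keeps the minimum remaining capacity across heterogeneous VMs as large as possible. The names and the unit-based workload model are illustrative assumptions, not the dissertation's algorithm.

    ```python
    import heapq

    def max_min_allocate(workload_units, capacities):
        """Greedy max-min workload allocation (illustrative sketch).

        capacities: per-VM capacities (e.g. service rates of heterogeneous VMs).
        Each unit of workload goes to the VM with the largest remaining
        capacity, keeping the minimum remaining capacity as large as possible.
        """
        # Max-heap keyed on remaining capacity (negated for heapq's min-heap).
        heap = [(-c, vm) for vm, c in enumerate(capacities)]
        heapq.heapify(heap)
        allocation = [0] * len(capacities)
        for _ in range(workload_units):
            neg_rem, vm = heapq.heappop(heap)
            allocation[vm] += 1
            heapq.heappush(heap, (neg_rem + 1, vm))  # one unit less remaining
        return allocation

    # Example: 10 workload units over three heterogeneous VMs.
    print(max_min_allocate(10, [8.0, 4.0, 2.0]))  # -> [7, 3, 0]
    ```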

    Reducing Electricity Demand Charge for Data Centers with Partial Execution

    Data centers consume a large amount of energy and incur substantial electricity cost. In this paper, we study the familiar problem of reducing data center energy cost with two new perspectives. First, we find, through an empirical study of contracts from electric utilities powering Google data centers, that the demand charge per kW for the maximum power used is a major component of the total cost. Second, many services such as Web search tolerate partial execution of requests because the response quality is a concave function of processing time; data from the Microsoft Bing search engine confirms this observation. We propose a simple idea of using partial execution to reduce the peak power demand and energy cost of data centers. We systematically study the problem of scheduling partial execution with stringent SLAs on response quality. For a single data center, we derive an optimal algorithm to solve the workload scheduling problem. In the case of multiple geo-distributed data centers, the demand of each data center is controlled by the request-routing algorithm, which makes the problem much more involved. We decouple the two aspects and develop a distributed optimization algorithm to solve the large-scale request-routing problem. Trace-driven simulations show that partial execution reduces cost by 3%–10.5% for a single data center, and by 15.5% for geo-distributed data centers together with request routing.
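    The key lever here is that a concave quality curve lets a scheduler trade a small quality loss for a large cut in processing time, and hence in peak power. As a hedged illustration only, the sketch below assumes a hypothetical quality curve q(t) = 1 - exp(-a*t) (the paper's actual curve is measured from Bing data) and solves for the shortest execution time that meets a quality SLA.

    ```python
    import math

    def min_exec_time(quality_sla, a=1.0):
        """Shortest processing time t meeting a concave quality SLA.

        Assumes an illustrative concave quality curve q(t) = 1 - exp(-a*t);
        solving q(t) >= Q for the smallest t gives t = -ln(1 - Q) / a.
        """
        if not 0.0 <= quality_sla < 1.0:
            raise ValueError("quality SLA must lie in [0, 1)")
        return -math.log(1.0 - quality_sla) / a

    # Under this curve a 95% quality SLA needs about twice the processing
    # time of a 78% SLA, so modest quality tolerance caps peak power cheaply.
    print(min_exec_time(0.95))  # ~3.00
    print(min_exec_time(0.78))  # ~1.51
    ```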

    Allocation of Virtual Machines in Cloud Data Centers - A Survey of Problem Models and Optimization Algorithms

    Data centers in public, private, and hybrid cloud settings make it possible to provision virtual machines (VMs) with unprecedented flexibility. However, purchasing, operating, and maintaining the underlying physical resources incurs significant monetary cost and environmental impact. Therefore, cloud providers must optimize the usage of physical resources through careful allocation of VMs to hosts, continuously balancing the conflicting requirements of performance and operational cost. In recent years, several algorithms have been proposed for this important optimization problem. Unfortunately, the proposed approaches are hardly comparable because of subtle differences in the underlying problem models. This paper surveys the problem formulations and optimization algorithms in use, highlighting their strengths and limitations and pointing out areas that need further research.
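    Many of the surveyed problem models reduce to variants of bin packing: VMs with resource demands must be packed onto hosts of fixed capacity. As a point of reference (not any specific surveyed algorithm), here is a minimal one-dimensional first-fit-decreasing sketch, the classic baseline heuristic for this family of problems.

    ```python
    def first_fit_decreasing(vm_demands, host_capacity):
        """First-fit-decreasing VM placement (one-dimensional sketch).

        vm_demands: resource demand of each VM (single dimension, e.g. CPU).
        Returns a list of hosts, each a list of the demands placed on it.
        Real problem models are multi-dimensional and add migration, SLA,
        and energy terms; this is only the classic baseline.
        """
        hosts = []  # each entry: [remaining_capacity, [placed demands]]
        for demand in sorted(vm_demands, reverse=True):
            for host in hosts:
                if host[0] >= demand:          # first host that still fits
                    host[0] -= demand
                    host[1].append(demand)
                    break
            else:                              # no host fits: open a new one
                hosts.append([host_capacity - demand, [demand]])
        return [placed for _, placed in hosts]

    print(first_fit_decreasing([0.5, 0.7, 0.3, 0.2, 0.4], host_capacity=1.0))
    # -> [[0.7, 0.3], [0.5, 0.4], [0.2]]
    ```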

    Strategic and operational services for workload management in the cloud

    In hosting environments such as Infrastructure as a Service (IaaS) clouds, desirable application performance is typically guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated by a service provider for unencumbered use by customers to ensure proper operation of their workloads. Most IaaS offerings are presented to customers as fixed-size, fixed-price SLAs that do not match the needs of specific applications well. Furthermore, arbitrary colocation of applications with different SLAs may result in inefficient utilization of hosts' resources, leading to economically undesirable customer behavior. In this thesis, we propose the design and architecture of a Colocation as a Service (CaaS) framework: a set of strategic and operational services that allow the efficient colocation of customer workloads. CaaS strategic services provide customers the means to specify their application workloads using an SLA language that gives them the opportunity and incentive to take advantage of any tolerances they may have regarding the scheduling of their workloads. CaaS operational services provide the information necessary for, and carry out, the reconfigurations mandated by the strategic services. We recognize that there may be multiple, functionally equivalent ways to express an SLA. To that end, we present a service that allows the provably-safe transformation of SLAs from one form to another for the purpose of achieving more efficient colocation. Our CaaS framework could be incorporated into an IaaS offering by providers, or implemented as a value-added proposition by IaaS resellers. To establish the practicality of such offerings, we present a prototype implementation of our proposed CaaS framework.
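    To make the SLA-transformation idea concrete, the sketch below assumes a hypothetical SLA form (a fraction q of a resource's capacity granted every period p) and a deliberately conservative sufficient condition for safety: the new SLA is safe if it grants at least as large a fraction at least as often. The thesis's actual transformation rules are finer-grained; this only illustrates the shape of a provably-safe check.

    ```python
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class SLA:
        """Hypothetical SLA form: fraction q of a resource every period p."""
        q: float  # fraction of capacity, 0 < q <= 1
        p: float  # allocation period (e.g. milliseconds)

    def conservatively_safe(old: SLA, new: SLA) -> bool:
        """Sufficient (not necessary) condition for a safe SLA transformation.

        The new SLA grants at least as large a capacity fraction at least
        as often, so every workload feasible under `old` stays feasible.
        """
        return new.q >= old.q and new.p <= old.p

    print(conservatively_safe(SLA(q=0.25, p=100.0), SLA(q=0.25, p=50.0)))   # True
    print(conservatively_safe(SLA(q=0.25, p=100.0), SLA(q=0.20, p=100.0)))  # False
    ```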

    SLA-Driven Cloud Computing Domain Representation and Management

    The assurance of Quality of Service (QoS) to applications, although identified as a key feature long ago [1], remains one of the fundamental unsolved challenges. In the cloud-computing context, Quality of Service is defined as the measure of compliance with certain user requirements in the delivery of a cloud resource, such as CPU or memory load for a virtual machine, or more abstract, higher-level concepts such as response time or availability. Several research groups, from both academia and industry, have started working on describing the QoS levels that define the conditions under which a service needs to be delivered, as well as on developing the means to effectively manage and evaluate the state of these conditions. The authors of [2] propose Service Level Agreements (SLAs) as the vehicle for the definition of QoS guarantees and for the provision and management of resources. A Service Level Agreement (SLA) is a formal contract between providers and consumers that defines the quality of service, the obligations, and the guarantees in the delivery of a specific good. In the context of cloud computing, SLAs are considered to be machine-readable documents that are automatically managed by the provider's platform. SLAs need to be dynamically adapted to the variable conditions of resources and applications. In a multilayer architecture, different parts of an SLA may refer to different resources. SLAs may therefore express complex relationships between entities in a changing environment and be applied to resource selection to implement intelligent scheduling algorithms. SLAs are thus widely regarded as a key feature for the future development of cloud platforms. However, the application of SLAs to Grid and Cloud systems has many open research lines. One of these challenges, the modeling of the landscape, lies at the core of the objectives of this Ph.D. thesis. García García, A. (2014). SLA-Driven Cloud Computing Domain Representation and Management [unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/36579
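    As an illustration of what a machine-readable, automatically managed SLA might look like, the sketch below encodes QoS terms as a plain document and checks measured values against them. Field names and thresholds are hypothetical, not drawn from any particular SLA language.

    ```python
    # Hypothetical machine-readable SLA and a compliance check; field names
    # and thresholds are illustrative, not from any specific SLA standard.
    sla = {
        "service": "vm-hosting",
        "terms": {
            "response_time_ms": {"max": 200.0},
            "availability": {"min": 0.999},
            "cpu_load": {"max": 0.80},
        },
    }

    def violations(sla, measured):
        """Return the list of SLA terms violated by measured QoS values."""
        violated = []
        for term, bounds in sla["terms"].items():
            value = measured[term]
            if "max" in bounds and value > bounds["max"]:
                violated.append(term)
            if "min" in bounds and value < bounds["min"]:
                violated.append(term)
        return violated

    print(violations(sla, {"response_time_ms": 150.0,
                           "availability": 0.9995, "cpu_load": 0.85}))
    # -> ['cpu_load']
    ```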

    Towards effective dynamic resource allocation for enterprise applications

    The growing use of online services requires substantial supporting infrastructure. The efficient deployment of applications relies on the cost-effectiveness of commercial hosting providers, who deliver an agreed quality of service, as governed by a service-level agreement, for a fee. The priorities of the commercial hosting provider are to maximise revenue, by delivering agreed service levels, and to minimise costs, through high resource utilisation. In order to deliver high service levels and resource utilisation, it may be necessary to reorganise resources during periods of high demand. This reorganisation process may be manual, or it may be controlled by an autonomous process governed by a dynamic resource-allocation algorithm. Dynamic resource allocation has been shown to improve service levels and utilisation and, hence, profitability. In this thesis, several facets of dynamic resource allocation are examined to assess its suitability for the modern data centre. Firstly, three theoretically derived policies are implemented as middleware for a modern multi-tier Web application, and their performance is examined under a range of workloads in a real-world test bed. The scalability of state-of-the-art resource-allocation policies is explored in two dimensions, namely the number of applications and the number of servers under the control of the resource-allocation policy. The results show that current policies in the literature scale poorly in one or both of these dimensions. A new policy with significantly improved scalability characteristics is proposed and demonstrated at scale through simulation. The placement of applications across a data centre makes them susceptible to failures in shared infrastructure. To address this issue, an application-placement mechanism is developed to augment any dynamic resource-allocation policy. This placement mechanism demonstrates a significant improvement in the worst case when compared to a random allocation mechanism. A model for the reallocation of resources in a dynamic resource-allocation system is also devised. The model demonstrates that the assumption of a constant resource-reallocation cost is invalid under both physical reallocation and migration of virtualised resources.
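    The final point, that a constant reallocation cost is unrealistic, can be illustrated with a rough pre-copy live-migration model: each copying round must retransmit the memory dirtied during the previous round, so migration time grows with VM memory size and write activity. The model and numbers below are illustrative assumptions, not the thesis's model.

    ```python
    def migration_time(mem_gb, dirty_gb_per_s, bw_gb_per_s, max_rounds=30):
        """Rough pre-copy live-migration time model (illustrative only).

        Each round retransmits the memory dirtied during the previous round;
        the series converges when bandwidth exceeds the dirty rate.  The
        point: cost grows with VM memory size and activity, so a constant
        reallocation-cost assumption underestimates busy, large VMs.
        """
        total, to_send = 0.0, mem_gb
        for _ in range(max_rounds):
            t = to_send / bw_gb_per_s        # time to ship the current copy
            total += t
            to_send = dirty_gb_per_s * t     # memory dirtied meanwhile
            if to_send < 0.01:               # small enough to stop-and-copy
                break
        return total

    # A 4 GB mostly-idle VM vs a 32 GB write-heavy VM over a
    # 10 Gb/s (~1.25 GB/s) link:
    print(round(migration_time(4, 0.5, 1.25), 1))   # ~5.3 s
    print(round(migration_time(32, 0.9, 1.25), 1))  # ~91.4 s
    ```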

    COSCO: container orchestration using co-simulation and gradient based optimization for fog computing environments

    Intelligent placement and management of tasks in large-scale fog platforms is challenging due to the highly volatile nature of modern workload applications and sensitive user requirements of low energy consumption and response time. Container orchestration platforms have emerged to alleviate this problem, with prior art either using heuristics to quickly reach scheduling decisions or AI-driven methods, such as reinforcement learning and evolutionary approaches, to adapt to dynamic scenarios. The former often fail to adapt quickly in highly dynamic environments, whereas the latter have run-times high enough to negatively impact response time. There is therefore a need for scheduling policies that are both reactive, so as to work efficiently in volatile environments, and low in scheduling overhead. To achieve this, we propose a Gradient Based Optimization Strategy using Back-propagation of gradients with respect to Input (GOBI). Further, we leverage the accuracy of predictive digital-twin models and simulation capabilities by developing a Coupled Simulation and Container Orchestration Framework (COSCO). Using this, we create a hybrid simulation-driven decision approach, GOBI*, to optimize Quality of Service (QoS) parameters. Co-simulation and the back-propagation approach allow these methods to adapt quickly in volatile environments. Experiments conducted using real-world data on fog applications with the GOBI and GOBI* methods show a significant improvement in energy consumption, response time, Service Level Objective, and scheduling time by up to 15, 40, 4, and 82 percent, respectively, when compared to state-of-the-art algorithms.
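    GOBI's central idea is to treat the scheduling decision itself as the variable of a differentiable optimization: back-propagate the gradient of a QoS surrogate with respect to the input placement, then descend on the placement. The sketch below substitutes a hand-written smooth cost for COSCO's learned neural surrogate; the names and load model are illustrative assumptions, not the paper's implementation.

    ```python
    import numpy as np

    def gobi_sketch(total_load, caps, beta=1.0, lr=0.2, steps=2000):
        """GOBI-style optimization sketch: gradient descent on the input.

        x[i] is the load placed on host i.  The smooth surrogate cost
        (a per-host congestion term plus a penalty for unplaced load)
        stands in for COSCO's learned neural QoS model; its gradient
        with respect to x plays the role of back-propagating to the input.
        """
        x = np.zeros(len(caps))
        for _ in range(steps):
            unplaced = total_load - x.sum()
            grad = 2.0 * x / caps**2 - 2.0 * beta * unplaced  # d cost / d x
            x = np.maximum(x - lr * grad, 0.0)  # loads cannot go negative
        return x

    caps = np.array([8.0, 4.0, 2.0])
    print(gobi_sketch(10.0, caps).round(2))
    # -> ~[7.53, 1.88, 0.47]: this surrogate prefers load roughly ~ capacity^2
    ```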

    Cloud Workload Allocation Approaches for Quality of Service Guarantee and Cybersecurity Risk Management

    It has become a dominant trend in industry to adopt cloud computing for providing online cloud services over the Internet using large-scale data centers, thanks to its unique advantages in flexibility, scalability, elasticity, and cost efficiency. In the meantime, the relentless increase in demand for affordable and high-quality cloud-based services, for individuals and businesses alike, has led to tremendously high power consumption and operating expense, posing pressing challenges for cloud service providers in finding efficient resource-allocation policies. Allowing several services or Virtual Machines (VMs) to share the cloud's infrastructure enables cloud providers to optimize resource usage, power consumption, and operating expense. However, server sharing among users and VMs causes performance degradation and introduces cybersecurity risks. Consequently, developing efficient and effective resource-management policies that make the appropriate decisions to optimize the trade-offs among resource usage, service quality, and cybersecurity loss plays a vital role in the sustainable future of cloud computing. In this dissertation, we focus on cloud workload-allocation problems for resource optimization subject to Quality of Service (QoS) guarantees and cybersecurity-risk constraints. To facilitate our research, we first develop a cloud computing prototype that we utilize to empirically validate the performance of the proposed cloud resource-management schemes in a close-to-practical, yet isolated and well-controlled, environment. We then focus on resource-management policies for real-time cloud services with QoS guarantees. Based on a queuing model with reneging, we establish and formally prove a series of fundamental principles relating service timing characteristics to resource demands, based on which we develop several novel resource-management algorithms that statically guarantee the QoS requirements of cloud users. We then study the problem of mitigating cybersecurity risk and loss in cloud data centers via cloud resource management. We employ game theory to model the VM-to-VM interdependent cybersecurity risks in cloud clusters, conduct a thorough analysis based on our game-theoretic model, and develop several algorithms for cybersecurity-risk management. Specifically, we start from a simple case with only two types of VMs and then extend it to a more general case with an arbitrary number of VM types. Extensive numerical and experimental results show that our proposed algorithms significantly outperform existing methodologies for large-scale cloud data centers in terms of resource usage, cybersecurity loss, and computational effectiveness.
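    A queuing model with reneging captures requests that abandon the system when service would start too late. As a hedged sketch of how such a model connects timing characteristics to resource demands (it is not the dissertation's model), the Monte-Carlo simulation below estimates, for an M/M/1 queue with a fixed patience bound, what service rate is needed so that nearly all requests start service within their deadline.

    ```python
    import random

    def mm1_with_reneging(lam, mu, patience, n=200_000, seed=1):
        """Monte-Carlo sketch of an M/M/1 queue where a request abandons
        (reneges) unless its service would start within its patience.

        Waiting times follow the Lindley recursion; abandoned requests
        add no work.  Returns the fraction of requests served in time,
        a stand-in QoS metric.  All parameters are illustrative.
        """
        rng = random.Random(seed)
        wait, served = 0.0, 0
        for _ in range(n):
            if wait <= patience:                  # request stays and is served
                served += 1
                wait += rng.expovariate(mu)       # add its service time
            wait = max(0.0, wait - rng.expovariate(lam))  # next arrival gap
        return served / n

    # How much service capacity mu is needed so that nearly all requests
    # start service within 0.5 s, at an arrival rate of 10 req/s?
    for mu in (11.0, 13.0, 15.0):
        print(mu, round(mm1_with_reneging(10.0, mu, 0.5), 3))
    ```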