4,350 research outputs found

    Energy-Efficient Management of Data Center Resources for Cloud Computing: A Vision, Architectural Elements, and Open Challenges

    Full text link
    Cloud computing is offering utility-oriented IT services to users worldwide. Based on a pay-as-you-go model, it enables hosting of pervasive applications from consumer, scientific, and business domains. However, data centers hosting Cloud applications consume huge amounts of energy, contributing to high operational costs and carbon footprints to the environment. Therefore, we need Green Cloud computing solutions that can not only save energy for the environment but also reduce operational costs. This paper presents vision, challenges, and architectural elements for energy-efficient management of Cloud computing environments. We focus on the development of dynamic resource provisioning and allocation algorithms that consider the synergy between various data center infrastructures (i.e., the hardware, power units, cooling and software), and holistically work to boost data center energy efficiency and performance. In particular, this paper proposes (a) architectural principles for energy-efficient management of Clouds; (b) energy-efficient resource allocation policies and scheduling algorithms considering quality-of-service expectations, and devices power usage characteristics; and (c) a novel software technology for energy-efficient management of Clouds. We have validated our approach by conducting a set of rigorous performance evaluation study using the CloudSim toolkit. The results demonstrate that Cloud computing model has immense potential as it offers significant performance gains as regards to response time and cost saving under dynamic workload scenarios.Comment: 12 pages, 5 figures,Proceedings of the 2010 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2010), Las Vegas, USA, July 12-15, 201

    Software-Defined Cloud Computing: Architectural Elements and Open Challenges

    Full text link
    The variety of existing cloud services creates a challenge for service providers to enforce reasonable Software Level Agreements (SLA) stating the Quality of Service (QoS) and penalties in case QoS is not achieved. To avoid such penalties at the same time that the infrastructure operates with minimum energy and resource wastage, constant monitoring and adaptation of the infrastructure is needed. We refer to Software-Defined Cloud Computing, or simply Software-Defined Clouds (SDC), as an approach for automating the process of optimal cloud configuration by extending virtualization concept to all resources in a data center. An SDC enables easy reconfiguration and adaptation of physical resources in a cloud infrastructure, to better accommodate the demand on QoS through a software that can describe and manage various aspects comprising the cloud environment. In this paper, we present an architecture for SDCs on data centers with emphasis on mobile cloud applications. We present an evaluation, showcasing the potential of SDC in two use cases-QoS-aware bandwidth allocation and bandwidth-aware, energy-efficient VM placement-and discuss the research challenges and opportunities in this emerging area.Comment: Keynote Paper, 3rd International Conference on Advances in Computing, Communications and Informatics (ICACCI 2014), September 24-27, 2014, Delhi, Indi

    Autonomic management of virtualized resources in cloud computing

    Get PDF
    The last five years have witnessed a rapid growth of cloud computing in business, governmental and educational IT deployment. The success of cloud services depends critically on the effective management of virtualized resources. A key requirement of cloud management is the ability to dynamically match resource allocations to actual demands, To this end, we aim to design and implement a cloud resource management mechanism that manages underlying complexity, automates resource provisioning and controls client-perceived quality of service (QoS) while still achieving resource efficiency. The design of an automatic resource management centers on two questions: when to adjust resource allocations and how much to adjust. In a cloud, applications have different definitions on capacity and cloud dynamics makes it difficult to determine a static resource to performance relationship. In this dissertation, we have proposed a generic metric that measures application capacity, designed model-independent and adaptive approaches to manage resources and built a cloud management system scalable to a cluster of machines. To understand web system capacity, we propose to use a metric of productivity index (PI), which is defined as the ratio of yield to cost, to measure the system processing capability online. PI is a generic concept that can be applied to different levels to monitor system progress in order to identify if more capacity is needed. We applied the concept of PI to the problem of overload prevention in multi-tier websites. The overload predictor built on the PI metric shows more accurate and responsive overload prevention compared to conventional approaches. To address the issue of the lack of accurate server model, we propose a model-independent fuzzy control based approach for CPU allocation. For adaptive and stable control performance, we embed the controller with self-tuning output amplification and flexible rule selection. Finally, we build a QoS provisioning framework that supports multi-objective QoS control and service differentiation. Experiments on a virtual cluster with two service classes show the effectiveness of our approach in both performance and power control. To address the problems of complex interplay between resources and process delays in fine-grained multi-resource allocation, we consider capacity management as a decision-making problem and employ reinforcement learning (RL) to optimize the process. The optimization depends on the trial-and-error interactions with the cloud system. In order to improve the initial management performance, we propose a model-based RL algorithm. The neural network based environment model, which is learned from previous management history, generates simulated resource allocations for the RL agent. Experiment results on heterogeneous applications show that our approach makes efficient use of limited interactions and find near optimal resource configurations within 7 steps. Finally, we present a distributed reinforcement learning approach to the cluster-wide cloud resource management. We decompose the cluster-wide resource allocation problem into sub-problems concerning individual VM resource configurations. The cluster-wide allocation is optimized if individual VMs meet their SLA with a high resource utilization. For scalability, we develop an efficient reinforcement learning approach with continuous state space. For adaptability, we use VM low-level runtime statistics to accommodate workload dynamics. Prototyped in a iBalloon system, the distributed learning approach successfully manages 128 VMs on a 16-node close correlated cluster

    Model-driven Scheduling for Distributed Stream Processing Systems

    Full text link
    Distributed Stream Processing frameworks are being commonly used with the evolution of Internet of Things(IoT). These frameworks are designed to adapt to the dynamic input message rate by scaling in/out.Apache Storm, originally developed by Twitter is a widely used stream processing engine while others includes Flink, Spark streaming. For running the streaming applications successfully there is need to know the optimal resource requirement, as over-estimation of resources adds extra cost.So we need some strategy to come up with the optimal resource requirement for a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the resource allocation required. Further, this intuition also drives resource mapping, and helps narrow the estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives a predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources.Comment: 54 page

    Effective Resource and Workload Management in Data Centers

    Get PDF
    The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of datacenter applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the possibility to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, for achieving high application performance, and for reducing the cost of data center management.;For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm AWAIT, for handling overloading conditions in multi-tier web services. AWAIT places on hold requests of accepted sessions and refuses to admit new sessions when the system is in a sudden workload surge. to meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is dynamically determined according to the workload burstiness.;Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools cannot always lead to accurate estimates. A directed factor graph (DFG) model is defined to model the dependencies among multiple types of resources across physical and virtual layers.;Virtualized data centers always enable sharing of resources among hosted applications for achieving high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workloads patterns change over time. AppRM, an automated management system not only allocates right amount of resources to applications for their performance target but also adjusts to dynamic workloads using an adaptive model.;Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme for multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement

    TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep LearningInference in Function as a Service Environments

    Full text link
    Deep neural networks (DNNs) have become core computation components within low latency Function as a Service (FaaS) prediction pipelines: including image recognition, object detection, natural language processing, speech synthesis, and personalized recommendation pipelines. Cloud computing, as the de-facto backbone of modern computing infrastructure for both enterprise and consumer applications, has to be able to handle user-defined pipelines of diverse DNN inference workloads while maintaining isolation and latency guarantees, and minimizing resource waste. The current solution for guaranteeing isolation within FaaS is suboptimal -- suffering from "cold start" latency. A major cause of such inefficiency is the need to move large amount of model data within and across servers. We propose TrIMS as a novel solution to address these issues. Our proposed solution consists of a persistent model store across the GPU, CPU, local storage, and cloud storage hierarchy, an efficient resource management layer that provides isolation, and a succinct set of application APIs and container technologies for easy and transparent integration with FaaS, Deep Learning (DL) frameworks, and user code. We demonstrate our solution by interfacing TrIMS with the Apache MXNet framework and demonstrate up to 24x speedup in latency for image classification models and up to 210x speedup for large models. We achieve up to 8x system throughput improvement.Comment: In Proceedings CLOUD 201

    Autonomous management of cost, performance, and resource uncertainty for migration of applications to infrastructure-as-a-service (IaaS) clouds

    Get PDF
    2014 Fall.Includes bibliographical references.Infrastructure-as-a-Service (IaaS) clouds abstract physical hardware to provide computing resources on demand as a software service. This abstraction leads to the simplistic view that computing resources are homogeneous and infinite scaling potential exists to easily resolve all performance challenges. Adoption of cloud computing, in practice however, presents many resource management challenges forcing practitioners to balance cost and performance tradeoffs to successfully migrate applications. These challenges can be broken down into three primary concerns that involve determining what, where, and when infrastructure should be provisioned. In this dissertation we address these challenges including: (1) performance variance from resource heterogeneity, virtualization overhead, and the plethora of vaguely defined resource types; (2) virtual machine (VM) placement, component composition, service isolation, provisioning variation, and resource contention for multitenancy; and (3) dynamic scaling and resource elasticity to alleviate performance bottlenecks. These resource management challenges are addressed through the development and evaluation of autonomous algorithms and methodologies that result in demonstrably better performance and lower monetary costs for application deployments to both public and private IaaS clouds. This dissertation makes three primary contributions to advance cloud infrastructure management for application hosting. First, it includes design of resource utilization models based on step-wise multiple linear regression and artificial neural networks that support prediction of better performing component compositions. The total number of possible compositions is governed by Bell's Number that results in a combinatorially explosive search space. Second, it includes algorithms to improve VM placements to mitigate resource heterogeneity and contention using a load-aware VM placement scheduler, and autonomous detection of under-performing VMs to spur replacement. Third, it describes a workload cost prediction methodology that harnesses regression models and heuristics to support determination of infrastructure alternatives that reduce hosting costs. Our methodology achieves infrastructure predictions with an average mean absolute error of only 0.3125 VMs for multiple workloads

    On the feasibility of collaborative green data center ecosystems

    Get PDF
    The increasing awareness of the impact of the IT sector on the environment, together with economic factors, have fueled many research efforts to reduce the energy expenditure of data centers. Recent work proposes to achieve additional energy savings by exploiting, in concert with customers, service workloads and to reduce data centers’ carbon footprints by adopting demand-response mechanisms between data centers and their energy providers. In this paper, we debate about the incentives that customers and data centers can have to adopt such measures and propose a new service type and pricing scheme that is economically attractive and technically realizable. Simulation results based on real measurements confirm that our scheme can achieve additional energy savings while preserving service performance and the interests of data centers and customers.Peer ReviewedPostprint (author's final draft
    corecore