656 research outputs found

    Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters

    Intelligent workload consolidation and dynamic cluster adaptation offer a great opportunity for energy savings in current large-scale clusters. Because of the heterogeneous nature of these environments, scalable, fault-tolerant and distributed consolidation managers are necessary to manage their workload efficiently and thus conserve energy and reduce operating costs. However, most consolidation managers available today do not fulfill these requirements: they are mostly centralized and designed solely to operate in virtualized environments. In this work, we present the architecture of a novel scalable, fault-tolerant and distributed consolidation manager called Snooze that can dynamically consolidate the workload of a large-scale cluster with heterogeneous software and hardware, composed of resources using virtualization and Single System Image (SSI) technologies. To this end, a common cluster monitoring and management API is introduced, which provides uniform and transparent access to the features of the underlying platforms. Our architecture is open to future technologies and can easily be extended with new monitoring metrics and algorithms. Finally, a comprehensive use-case study demonstrates the feasibility of our approach for managing the energy consumption of a large-scale cluster.
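
    The platform abstraction Snooze relies on can be pictured with a small sketch. The following Python code is an illustrative mock-up, not Snooze's actual code or API: it shows a uniform monitoring and management interface with one backend for virtualization-based nodes and one for SSI-based nodes, and a consolidation loop that only sees the uniform interface. All class and method names are assumptions chosen for the example.

        from abc import ABC, abstractmethod
        from dataclasses import dataclass

        @dataclass
        class NodeMetrics:
            node_id: str
            cpu_utilization: float      # fraction in [0, 1]
            memory_utilization: float

        class ClusterBackend(ABC):
            """Uniform interface hiding whether a node runs virtualization or SSI."""

            @abstractmethod
            def collect_metrics(self) -> list[NodeMetrics]: ...

            @abstractmethod
            def relocate_workload(self, workload_id: str, target_node: str) -> None: ...

            @abstractmethod
            def power_off_node(self, node_id: str) -> None: ...

        class VirtualizationBackend(ClusterBackend):
            """Would wrap a hypervisor API (e.g. libvirt) and VM live migration."""
            def collect_metrics(self):
                return [NodeMetrics("vm-host-1", 0.15, 0.40)]       # placeholder data
            def relocate_workload(self, workload_id, target_node):
                print(f"live-migrating VM {workload_id} to {target_node}")
            def power_off_node(self, node_id):
                print(f"suspending host {node_id}")

        class SSIBackend(ClusterBackend):
            """Would wrap an SSI kernel's process-migration facility."""
            def collect_metrics(self):
                return [NodeMetrics("ssi-node-1", 0.10, 0.30)]      # placeholder data
            def relocate_workload(self, workload_id, target_node):
                print(f"migrating process group {workload_id} to {target_node}")
            def power_off_node(self, node_id):
                print(f"powering off node {node_id}")

        def consolidate(backend: ClusterBackend, idle_threshold: float = 0.2) -> None:
            """Generic consolidation loop: the manager only sees the uniform API.
            A real manager would first relocate the node's workloads via
            relocate_workload() before switching the node off."""
            for metrics in backend.collect_metrics():
                if metrics.cpu_utilization < idle_threshold:
                    backend.power_off_node(metrics.node_id)

        consolidate(VirtualizationBackend())    # the same loop works with SSIBackend()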

    A Literature Survey on Resource Management Techniques, Issues and Challenges in Cloud Computing

    Cloud computing is a large-scale distributed computing paradigm that provides on-demand services to clients. Cloud clients use web browsers, mobile apps, thin clients, or terminal emulators to request and control their cloud resources at any time and from anywhere over the network. As more companies shift their data to the cloud and more people become aware of the advantages of storing data there, the growing number of cloud computing infrastructures and the large volumes of data they hold make management increasingly complex for cloud providers. We survey the state-of-the-art resource management techniques for IaaS (Infrastructure as a Service) in cloud computing. We then put forward the major issues in the deployment of cloud infrastructure that must be addressed to avoid poor service delivery in cloud computing.

    Reliable and energy efficient resource provisioning in cloud computing systems

    Cloud computing has revolutionized the Information Technology sector by giving computing the perspective of a service. Cloud services can be accessed through easy-to-use portals by users who need no knowledge of the underlying system. To provide such an abstract view, cloud computing systems must perform many complex operations besides managing a large underlying infrastructure. These operations confront service providers with many challenges, such as security, sustainability, reliability, energy consumption and resource management. Among these, reliability and energy consumption are the two key challenges addressed in this thesis because of their conflicting nature. Current solutions focus either on reliability techniques or on energy-efficiency methods, yet mechanisms providing reliability in cloud computing systems can worsen energy consumption: adding backup resources and running replicated systems provide strong fault tolerance but also increase energy consumption, while reducing energy consumption by running resources at low power-scaling levels or by shutting down active but idle resources, such as backups, reduces system reliability. This creates a critical trade-off between the two metrics that is investigated in this thesis. To address this problem, the thesis presents novel resource management policies that provision the best resources in terms of reliability and energy efficiency and allocate them to suitable virtual machines. A mathematical framework capturing the interplay between reliability and energy consumption is also proposed, together with a formal method to calculate the finishing time of tasks running in a cloud computing environment affected by independent and correlated failures. The proposed policies adopt various fault-tolerance mechanisms while satisfying constraints such as task deadlines and utility values. The thesis also provides a novel failure-aware VM consolidation method, which takes the failure characteristics of resources into consideration before performing VM consolidation. All the proposed resource management methods are evaluated using real failure traces collected from various distributed computing sites. To perform the evaluation, a cloud computing framework, 'ReliableCloudSim', capable of simulating failure-prone cloud computing systems was developed. The key research findings and contributions of this thesis are: 1. If emphasis is given only to energy optimization without considering reliability in a failure-prone cloud computing environment, the results can be contrary to intuitive expectations: rather than reducing energy consumption, the system ends up consuming more energy because of the losses incurred by failure overheads. 2. When performing VM consolidation in a failure-prone cloud computing environment, a significant improvement in both energy efficiency and reliability can be achieved by considering the failure characteristics of physical resources. 3. By considering the correlated occurrence of failures during resource provisioning and VM allocation, service downtime or interruption is reduced by 34% compared with environments that assume independent failures; moreover, measured by our mathematical model, the ratio of reliability to energy consumption is improved by 14%.
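
    To make the reliability/energy trade-off concrete, the following Python sketch scores candidate hosts by combining an exponential-failure reliability estimate (driven by the MTBF observed in a failure trace) with a linear power model before placing a VM. The scoring formula, the host attributes and the weighting factor alpha are illustrative assumptions, not the thesis's actual policies or 'ReliableCloudSim' code.

        import math
        from dataclasses import dataclass

        @dataclass
        class Host:
            name: str
            mtbf_hours: float       # mean time between failures from the failure trace
            idle_power_w: float
            peak_power_w: float
            utilization: float      # current CPU utilization in [0, 1]

        def reliability(host: Host, task_hours: float) -> float:
            """Probability the host survives the task, assuming exponential failures."""
            return math.exp(-task_hours / host.mtbf_hours)

        def power(host: Host, added_util: float) -> float:
            """Linear power model commonly used in CloudSim-style simulations."""
            u = min(1.0, host.utilization + added_util)
            return host.idle_power_w + (host.peak_power_w - host.idle_power_w) * u

        def placement_score(host: Host, task_hours: float, added_util: float,
                            alpha: float = 0.5) -> float:
            """Higher is better: trade survival probability against marginal power."""
            extra_power = power(host, added_util) - power(host, 0.0)
            return alpha * reliability(host, task_hours) \
                - (1 - alpha) * extra_power / host.peak_power_w

        hosts = [Host("h1", mtbf_hours=800, idle_power_w=90, peak_power_w=250, utilization=0.5),
                 Host("h2", mtbf_hours=200, idle_power_w=70, peak_power_w=200, utilization=0.2)]
        best = max(hosts, key=lambda h: placement_score(h, task_hours=4.0, added_util=0.2))
        print("place the VM on", best.name)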

    CloudBench: an integrated evaluation of VM placement algorithms in clouds

    A complex and important task in cloud resource management is the efficient allocation of virtual machines (VMs), or containers, to physical machines (PMs). Evaluating VM placement techniques in real-world clouds can be tedious, complex and time-consuming, which has motivated the increasing use of cloud simulators that facilitate this type of evaluation. However, most reported simulation-based VM placement techniques have been evaluated with respect to one specific cloud resource (e.g., CPU), while often unrealistic values are assumed for the other resources (e.g., RAM, waiting times, application workloads). This generates uncertainty and discourages their implementation in real-world clouds. This paper introduces CloudBench, a methodology to facilitate the evaluation and deployment of VM placement strategies in private clouds. CloudBench integrates a cloud simulator with a real-world private cloud. Two main tools were developed to support the methodology: a specialized multi-resource cloud simulator (CloudBalanSim), which evaluates VM placement techniques, and a distributed resource manager (Balancer), which deploys and tests in a real-world private cloud the best VM placement configurations that satisfied the user requirements defined in the simulator. Both tools generate feedback from the evaluation scenarios and their results, which is used as a learning asset to carry out faster and more intelligent evaluations. Experiments with the CloudBench methodology showed encouraging results as a new strategy for evaluating and deploying VM placement algorithms in the cloud. This work was partially funded by the Spanish Ministry of Economy, Industry and Competitiveness under Grant TIN2016-79637-P “Towards Unification of HPC and Big Data Paradigms” and by the Mexican Council of Science and Technology (CONACYT) through a Ph.D. Grant (No. 212677).
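
    The simulate-then-deploy feedback loop described above can be outlined in a few lines. This Python sketch is only a schematic stand-in: simulate, deploy_and_measure and the knowledge list are hypothetical placeholders for CloudBalanSim runs, Balancer deployments and the shared feedback data, not the actual CloudBench APIs.

        import random

        def simulate(config: dict, knowledge: list) -> float:
            """Stand-in for a CloudBalanSim run: predicted cost of a placement config,
            corrected by the average error observed in earlier deployments."""
            correction = sum(k["error"] for k in knowledge) / (len(knowledge) or 1)
            return config["cpu_weight"] * 0.6 + config["ram_weight"] * 0.4 + correction

        def deploy_and_measure(config: dict) -> float:
            """Stand-in for Balancer deploying the config on the private cloud and
            measuring the real cost (here: the prediction plus random noise)."""
            return simulate(config, []) + random.uniform(-0.05, 0.05)

        knowledge = []      # feedback reused to make later evaluations more accurate
        candidates = [{"cpu_weight": w, "ram_weight": 1 - w} for w in (0.3, 0.5, 0.7)]

        for _ in range(3):  # a few simulate -> deploy -> learn rounds
            best = min(candidates, key=lambda c: simulate(c, knowledge))
            measured = deploy_and_measure(best)
            knowledge.append({"config": best,
                              "error": measured - simulate(best, knowledge)})
            print("deployed", best, "measured cost", round(measured, 3))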

    Migration Control in Cloud Computing to Reduce the SLA Violation

    The demand for cloud-based services is ever more prominent because of the enormous benefits of the cloud, such as pay-as-you-use flexibility, scalability and low upfront cost. With the growing number of cloud consumers, the load on datacenters is also increasing day by day. Various load-distribution and dynamic load-balancing approaches are followed in datacenters to optimize resource utilization so that performance can be maintained under increased load. Virtual machine (VM) migration is primarily used to implement dynamic load balancing in datacenters, but poorly designed dynamic VM migration policies may negate its benefits: VM migration overheads result in violations of the service level agreement (SLA) in the cloud environment. In this paper, an extended VM migration control model is proposed to minimize SLA violations while controlling the energy consumption of the datacenter during VM migration. The parameters of an execution-boundary threshold are used to extend an existing VM migration control model. The proposed model is tested through extensive simulations using the CloudSim toolkit executing a real-world workload, and results are reported in terms of the number of SLA violations while controlling the energy consumption in the datacenter. The results show that the proposed model achieves better performance than the existing model.
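
    A minimal sketch of threshold-based migration control is shown below, assuming an upper and a lower utilization threshold and a fixed observation window standing in for the execution boundary; the threshold values and the window length are illustrative assumptions, not the paper's tuned parameters.

        from collections import deque

        UPPER, LOWER = 0.85, 0.25   # utilization thresholds (assumed values)
        WINDOW = 5                  # consecutive samples forming the execution boundary

        class HostMonitor:
            def __init__(self):
                self.history = deque(maxlen=WINDOW)

            def record(self, utilization: float) -> None:
                self.history.append(utilization)

            def should_offload(self) -> bool:
                """Migrate a VM away only if the whole window stayed above UPPER,
                so that short load spikes do not trigger costly migrations."""
                return len(self.history) == WINDOW and all(u > UPPER for u in self.history)

            def should_consolidate(self) -> bool:
                """A persistently idle host can give up all its VMs and be switched off."""
                return len(self.history) == WINDOW and all(u < LOWER for u in self.history)

        monitor = HostMonitor()
        for sample in (0.90, 0.88, 0.92, 0.87, 0.91):
            monitor.record(sample)
        print("offload:", monitor.should_offload(), "consolidate:", monitor.should_consolidate())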

    Energy-Efficient Load Balancing Algorithm for Workflow Scheduling in Cloud Data Centers Using Queuing and Thresholds

    Cloud computing is a rapidly growing technology that has been adopted in recent years in fields such as business, research, industry, and computing. It provides different services over the internet, eliminating the need for personalized hardware and other resources. Cloud computing environments face challenges in terms of resource utilization, energy efficiency, heterogeneous resources, etc. Task scheduling and virtual machine (VM) consolidation techniques are used to tackle these issues, and task scheduling in particular has been extensively studied in the literature with different parameters and objectives. In this article, we address the problem of energy consumption and efficient resource utilization in virtualized cloud data centers. The proposed algorithm is based on task classification and thresholds for efficient scheduling and better resource utilization. In the first phase, workflow tasks are pre-processed to avoid bottlenecks by placing tasks with many dependencies and long execution times in separate queues. In the next step, tasks are classified based on the intensities of the required resources. Finally, Particle Swarm Optimization (PSO) is used to select the best schedules. Experiments were performed to validate the proposed technique, and comparative results obtained on benchmark datasets are presented. The results show the effectiveness of the proposed algorithm over the other algorithms to which it was compared in terms of energy consumption, makespan, and load balancing.
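
    The queue-and-threshold pre-processing stage can be illustrated with a short Python sketch: tasks with many dependencies or long execution times go into a separate queue, the rest are split into CPU- and memory-intensive queues, and a fitness function compares candidate schedules. The thresholds are assumed values, and an exhaustive search stands in for the paper's PSO step.

        from dataclasses import dataclass
        from itertools import product

        @dataclass
        class Task:
            name: str
            runtime: float      # estimated execution time in seconds
            deps: int           # number of predecessors in the workflow
            cpu_demand: float   # normalized CPU requirement in [0, 1]
            mem_demand: float   # normalized memory requirement in [0, 1]

        LONG_RUNTIME, MANY_DEPS = 100.0, 3      # assumed classification thresholds

        def classify(tasks):
            """Phase 1: long/highly dependent tasks first, then split by resource intensity."""
            critical, cpu_q, mem_q = [], [], []
            for t in tasks:
                if t.runtime > LONG_RUNTIME or t.deps > MANY_DEPS:
                    critical.append(t)
                elif t.cpu_demand >= t.mem_demand:
                    cpu_q.append(t)
                else:
                    mem_q.append(t)
            return critical, cpu_q, mem_q

        def makespan(assignment, tasks, n_vms):
            """Fitness used to compare candidate schedules (PSO would minimize this)."""
            loads = [0.0] * n_vms
            for task, vm in zip(tasks, assignment):
                loads[vm] += task.runtime
            return max(loads)

        tasks = [Task("t1", 120, 4, 0.9, 0.2), Task("t2", 30, 1, 0.3, 0.8),
                 Task("t3", 40, 0, 0.7, 0.1), Task("t4", 25, 2, 0.2, 0.6)]
        critical, cpu_q, mem_q = classify(tasks)
        ordered = critical + cpu_q + mem_q
        best = min(product(range(2), repeat=len(ordered)),      # exhaustive stand-in for PSO
                   key=lambda a: makespan(a, ordered, 2))
        print("best VM assignment:", best, "makespan:", makespan(best, ordered, 2))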

    Energy-Efficient Fault-Tolerant Scheduling Algorithm for Real-Time Tasks in Cloud-Based 5G Networks

    © 2013 IEEE. Green computing has become a hot issue in both academia and industry. Fifth-generation (5G) mobile networks put forward high requirements for energy efficiency and low latency. The cloud radio access network provides efficient resource use, high performance, and high availability for 5G systems. However, hardware and software faults of cloud systems may lead to failures in providing real-time services. Developing fault-tolerance techniques can effectively enhance the reliability and availability of real-time cloud services. The core idea of fault-tolerant scheduling algorithms is to introduce redundancy to ensure that tasks can be finished in the case of permanent or transient system failures. Nevertheless, the redundancy incurs extra overhead for cloud systems, which results in considerable energy consumption. In this paper, we focus on the problem of how to reduce energy consumption while providing fault tolerance. We first propose a novel primary-backup-based fault-tolerant scheduling architecture for real-time tasks in the cloud environment. Based on this architecture, we present an energy-efficient fault-tolerant scheduling algorithm for real-time tasks (EFTR). EFTR adopts a proactive strategy to increase the system processing capacity and employs a rearrangement mechanism to improve resource utilization. Simulation experiments are conducted on the CloudSim platform to evaluate the feasibility and effectiveness of EFTR. Compared with existing fault-tolerant scheduling algorithms, EFTR shows excellent performance in energy conservation and task schedulability.
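
    The primary-backup idea underlying this line of work can be sketched as follows: each real-time task gets a primary copy on one host and a passive backup slot on a different host, and the backup runs only if the primary's host fails before the deadline. The Python sketch below is a generic illustration of primary-backup scheduling under assumed host and task models, not the EFTR algorithm itself.

        from dataclasses import dataclass

        @dataclass
        class Host:
            name: str
            finish_time: float = 0.0    # time at which the host becomes free
            failed: bool = False

        @dataclass
        class Task:
            name: str
            length: float               # execution time in seconds
            deadline: float

        def schedule(task: Task, hosts: list):
            """Place the primary on the earliest-available host and reserve a passive
            backup slot on a different host."""
            ordered = sorted(hosts, key=lambda h: h.finish_time)
            primary, backup = ordered[0], ordered[1]
            if primary.finish_time + task.length > task.deadline:
                return None             # not schedulable within its deadline
            primary.finish_time += task.length
            return primary, backup

        def run(task: Task, primary: Host, backup: Host) -> str:
            """The backup consumes time and energy only if the primary host fails."""
            if not primary.failed:
                return f"{task.name} completed on primary {primary.name}"
            backup.finish_time += task.length
            return f"{task.name} recovered on backup {backup.name}"

        hosts = [Host("h1"), Host("h2"), Host("h3")]
        task = Task("t1", length=5.0, deadline=12.0)
        placement = schedule(task, hosts)
        if placement:
            hosts[0].failed = True      # simulate a permanent failure of the primary host
            print(run(task, *placement))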