1,416 research outputs found

    Enabling virtualization technologies for enhanced cloud computing

    Cloud Computing is a ubiquitous technology that offers various services for individual users, small businesses, and large-scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources and services that would otherwise be unavailable to them. At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs allow cloud owners such essential features as live migration, which is the process of moving a VM from one PM to another while the VM is running. Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, and parallel computing depend heavily on virtualization technologies. Improvements and breakthroughs in these technologies lead directly to the introduction of new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real-world benchmarks. Specifically, the issues of server load prediction, VM consolidation, live migration, and memory sharing are addressed. First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy. 
Experimental evaluations with a prototype on real-world data-center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as low as 3%. Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers required time by up to 40% for live migration on various real-world loads. Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Along with predictions on network and memory, this approach allows VMs to run applications with memory requirements much higher than physically available locally. It is experimentally shown that ACE-M reduces the memory performance degradation by about 75% and achieves a 40% lower network response time for memory-intensive VMs. A combination of these innovations to the virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.
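
The consolidation step described in this abstract (relocating VMs onto fewer PMs based on predicted load) can be illustrated with a generic first-fit decreasing packing heuristic. This is only a sketch under stated assumptions, not the thesis's algorithm: the function name `consolidate`, the single-resource load model (integer CPU percent), and the sample loads are all hypothetical, and the Chaos Theory predictor is replaced by a fixed dictionary of predicted values.

```python
from typing import Dict, List


def consolidate(predicted_load: Dict[str, int], pm_capacity: int) -> List[List[str]]:
    """Pack VMs onto as few PMs as possible with first-fit decreasing.

    predicted_load maps a VM name to its predicted CPU demand (percent).
    Returns one list of VM names per PM that stays powered on.
    """
    placement: List[List[str]] = []  # VM names hosted on each PM
    free: List[int] = []             # remaining capacity of each PM

    # Place the largest VMs first; this usually yields a tighter packing.
    ordered = sorted(predicted_load.items(), key=lambda kv: kv[1], reverse=True)
    for vm, load in ordered:
        if load > pm_capacity:
            raise ValueError(f"{vm} does not fit on any PM")
        for i, room in enumerate(free):
            if load <= room:          # first PM with enough room wins
                placement[i].append(vm)
                free[i] -= load
                break
        else:                         # no existing PM fits: power on a new one
            placement.append([vm])
            free.append(pm_capacity - load)
    return placement


# Hypothetical predicted loads for five VMs, each PM offering 100% CPU.
predicted = {"vm1": 50, "vm2": 30, "vm3": 40, "vm4": 20, "vm5": 60}
placement = consolidate(predicted, pm_capacity=100)
print(placement)  # [['vm5', 'vm3'], ['vm1', 'vm2', 'vm4']]
```

In this toy run five VMs fit on two PMs, so any other PMs could be freed. A real consolidator would also have to weigh migration cost and SLO risk, which this sketch ignores.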

    A comparison of resource allocation process in grid and cloud technologies

    Grid Computing and Cloud Computing are two different technologies that have emerged to realize the long-held dream of computing as a utility, which has led to an important revolution in the IT industry. These technologies came with several challenges in terms of middleware, programming models, resource management and business models. These challenges are taken seriously by distributed systems research. Resource allocation is a key challenge in both technologies, as poor allocation can cause resource wastage and service degradation. This paper presents a comprehensive study of the resource allocation processes in both technologies. It provides researchers with an in-depth understanding of all resource allocation aspects and associated challenges, including load balancing, performance, energy consumption, scheduling algorithms, resource consolidation and migration. The comparison also contributes an informal definition of the Cloud resource allocation process. Resources in the Cloud are shared by all users in a time- and space-sharing manner, in contrast to the dedicated resources that are governed by a queuing system in Grid resource management. Cloud resource allocation faces additional challenges, chiefly achieving good load balancing and making the right consolidation decisions.

    epcAware: a game-based, energy, performance and cost efficient resource management technique for multi-access edge computing

    The Internet of Things (IoT) is producing an extraordinary volume of data daily, and it is possible that the data may become useless while on its way to the cloud for analysis, due to longer distances and delays. Fog/edge computing is a new model for analyzing and acting on time-sensitive data (real-time applications) at the network edge, adjacent to where it is produced. The model sends only selected data to the cloud for analysis and long-term storage. Furthermore, cloud services provided by large companies such as Google can also be localized to minimize the response time and increase service agility. This could be accomplished by deploying small-scale datacenters (referred to as cloudlets) where essential, closer to customers (IoT devices) and connected to a centralised cloud through networks, which together form a multi-access edge cloud (MEC). The MEC setup involves three different parties, i.e. service providers (IaaS), application providers (SaaS) and network providers (NaaS), which might have different goals, making resource management a difficult job. In the literature, various resource management techniques have been suggested in the context of what kind of services they should host and how the available resources should be allocated to customers’ applications, particularly if mobility is involved. However, the existing literature considers the resource management problem with respect to a single party. In this paper, we consider resource management with respect to all three parties, i.e. IaaS, SaaS and NaaS, and suggest a game-theoretic resource management technique that minimises infrastructure energy consumption and costs while ensuring application performance. Our empirical evaluation, using real workload traces from Google’s cluster, suggests that our approach could reduce energy consumption by up to 11.95% and user costs by approximately 17.86%, with negligible loss in performance. 
Moreover, IaaS can reduce energy bills by up to 20.27% and NaaS can increase their cost savings by up to 18.52% compared to other methods.

    Learning-based run-time power and energy management of multi/many-core systems: current and future trends

    Multi/Many-core systems are prevalent in several application domains targeting different scales of computing, such as embedded and cloud computing. These systems are able to fulfil ever-increasing performance requirements by exploiting their parallel processing capabilities. However, effective power/energy management is required during system operation for several reasons, such as increasing the operational time of battery-operated systems, reducing the energy cost of datacenters, and improving thermal efficiency and reliability. This article provides an extensive survey of learning-based run-time power/energy management approaches. The survey includes a taxonomy of the learning-based approaches. These approaches perform design-time and/or run-time power/energy management by employing learning principles such as reinforcement learning. The survey also highlights the trends followed by the learning-based run-time power management approaches, their upcoming trends and open research challenges.

    Scalable and Distributed Resource Management Protocols for Cloud and Big Data Clusters

    Cloud data centers require an operating system to manage resources and satisfy operational requirements and management objectives. The growing popularity of cloud services has given rise to a new spectrum of services with sophisticated workload and resource management requirements. Also, data centers are growing through the addition of various types of hardware to accommodate the ever-increasing requests of users. Nowadays a large percentage of cloud resources execute data-intensive applications, which have continuously fluctuating workloads and need specific resource management. To this end, cluster computing frameworks are shifting towards distributed resource management for better scalability and faster decision making. Such systems benefit from the parallelization of control and are resilient to failures. Throughout this thesis we investigate algorithms, protocols and techniques to address these challenges in large-scale data centers. We introduce a distributed resource management framework which consolidates virtual machines onto as few servers as possible to reduce the energy consumption of the data center and hence decrease the cost for cloud providers. This framework can characterize the workload of virtual machines and hence efficiently handle the trade-off between energy consumption and the Service Level Agreement (SLA) of customers. The algorithm is highly scalable, requires low maintenance cost under dynamic workloads, and tries to minimize virtual machine migration costs. We also introduce a scalable and distributed probe-based scheduling algorithm for Big Data analytics frameworks. This algorithm can efficiently address the problem of job heterogeneity in workloads, which has appeared after increasing the level of parallelism in jobs. The algorithm is massively scalable and can significantly reduce average job completion times in comparison with the state of the art. Finally, we propose a probabilistic fault-tolerance technique as part of the scheduling algorithm.
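
To make the idea of probe-based scheduling concrete, the sketch below shows the generic "probe a few random workers, pick the least loaded" pattern. It is an assumption-laden illustration rather than the algorithm from this thesis: the function name `probe_schedule`, the use of queue length as the only load signal, and the parameters in the demo are all hypothetical, and job heterogeneity and fault tolerance are not modelled.

```python
import random
from typing import List


def probe_schedule(queue_lengths: List[int], num_probes: int, rng: random.Random) -> int:
    """Place one task by probing a few random workers.

    Samples num_probes distinct workers, enqueues the task on the one with
    the shortest queue, and returns that worker's index.
    """
    k = min(num_probes, len(queue_lengths))
    probed = rng.sample(range(len(queue_lengths)), k=k)
    # No global view is needed: only the probed workers are compared.
    chosen = min(probed, key=lambda w: queue_lengths[w])
    queue_lengths[chosen] += 1
    return chosen


# Demo: 100 tasks over 10 workers, two probes per task (hypothetical values).
rng = random.Random(0)
queues = [0] * 10
for _ in range(100):
    probe_schedule(queues, num_probes=2, rng=rng)
print(queues, sum(queues))
```

Because each scheduler only contacts a handful of workers per task, many schedulers can run in parallel without shared state, which is the property that makes this family of algorithms attractive for distributed resource management.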