
1,888 research outputs found

Markov Prediction Model for Host Load Detection and VM Placement in Live Migration

Author: Agarwal Anjali
Goel Nishith
Melhem Suhib Bani
Zaman Marzia
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The design of good host overload/underload detection and virtual machine (VM) placement algorithms plays a vital role in ensuring smooth VM live migration. Because the dynamic environment causes the load on the VMs to change, we propose a Markov prediction model to forecast the future load state of a host. We propose a host load detection algorithm that identifies the hosts predicted to become overutilized or underutilized, so as to avoid immediate VM migration. Moreover, we propose a VM placement algorithm that determines the set of candidate hosts to receive the migrated VMs in a way that reduces their VM migrations in the near future. We evaluate our proposed algorithms through CloudSim simulation on different types of PlanetLab real and random workloads. The experimental results show that our proposed algorithms significantly reduce service-level agreement violations, the number of VM migrations, and other metrics compared with competing algorithms.
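The abstract does not give the details of the Markov prediction model, so the following is only a minimal sketch of the general idea: a first-order Markov chain over discretized host load states. The state names, the utilization thresholds (0.2 and 0.8), and the sample history are all assumptions made for illustration, not values from the paper.

```python
from collections import Counter

# Hypothetical load states; the paper's actual state definitions are not given here.
STATES = ("under", "normal", "over")


def to_state(util, low=0.2, high=0.8):
    """Map a CPU utilization in [0, 1] to a load state (thresholds are assumed)."""
    if util < low:
        return "under"
    if util > high:
        return "over"
    return "normal"


def transition_matrix(states):
    """Estimate first-order transition probabilities from an observed state sequence."""
    counts = {s: Counter() for s in STATES}
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = {}
    for s in STATES:
        total = sum(counts[s].values())
        # A state that was never left in the history falls back to a uniform row.
        matrix[s] = {
            t: (counts[s][t] / total if total else 1 / len(STATES)) for t in STATES
        }
    return matrix


def predict_next(matrix, current):
    """Return the most probable next state given the current one."""
    return max(STATES, key=lambda t: matrix[current][t])


# Hypothetical utilization history for one host.
history = [0.30, 0.50, 0.85, 0.90, 0.88, 0.60, 0.85, 0.92]
states = [to_state(u) for u in history]
matrix = transition_matrix(states)
prediction = predict_next(matrix, states[-1])
print(prediction)
```

A detection policy built on such a predictor could defer migration when the forecast state is not "over", which is the kind of behavior the abstract describes; how the authors actually combine prediction and placement is not stated in this listing.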

Crossref

Concordia University Research Repository

VM Selection Process Management for Live Migration in Cloud Data Centers

Author: Bani Melhem Suhib
Publication date: 01/12/2017

With immense success and fast growth within the past few years, cloud computing has been established as the dominant computing paradigm in the information technology (IT) industry, wherein it utilizes dissipated resource benefits and supports resource sharing and time access flexibility. The proliferation of cloud computing has resulted in the establishment of large-scale data centers across the world, consisting of hundreds of thousands, even millions, of servers. The emerging cloud computing paradigm provides administrators and IT organizations with considerable freedom to dynamically migrate virtualized computing services among physical servers in cloud data centers. Normally, these data centers incur very high investment and operating costs for the computing and network devices as well as for energy consumption. Virtualization and virtual machine (VM) migration offer significant benefits such as load balancing, server consolidation, online maintenance, and proactive fault tolerance across data centers. VM migration relies on determining the trigger condition for migration, selecting the target virtual machine, and choosing the destination node. As a result, dynamic VM migration within resource management is becoming a crucial issue for achieving optimal resource utilization, maximum throughput, minimum response time, enhanced scalability, avoidance of resource over-provisioning, and prevention of overload, all of which are needed to make cloud computing successful. Intelligent host underload/overload detection, VM selection, and VM placement are the primary means of addressing the VM migration issue. Therefore, these three problems are considered to be the most common tasks in VM migration. This thesis presents novel techniques, models, and algorithms for distributed dynamic consolidation of virtual machines in cloud data centers. The goal is to improve the utilization of computing resources and reduce energy consumption under workload-independent quality of service constraints. The proposed approaches are distributed and efficient in managing the energy-performance trade-off.

Concordia University Research Repository

Enabling virtualization technologies for enhanced cloud computing

Author: Qazi Kashifuddin
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2015

Cloud Computing is a ubiquitous technology that offers various services for individual users, small businesses, as well as large scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources and services that would otherwise be unavailable to them. At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs allow cloud owners such essential features as live migration, which is the process of moving a VM from one PM to another while the VM is running, for various reasons. Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, parallel computing, etc. depend heavily on virtualization technologies. Improvements and breakthroughs in these technologies directly lead to introduction of new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real world benchmarks. Specifically the issues of server load prediction, VM consolidation, live migration, and memory sharing are attempted. First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy. 
Experimental evaluations with a prototype on real-world data-center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as low as 3%. Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers the required time by up to 40% for live migration on various real-world loads. Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Along with predictions on network and memory, this approach allows VMs to run applications with memory requirements much higher than what is physically available locally. It is experimentally shown that ACE-M reduces the memory performance degradation by about 75% and achieves a 40% lower network response time for memory-intensive VMs. A combination of these innovations to the virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.

Digital Commons @ New Jersey Institute of Technology (NJIT)

CloudBench: an integrated evaluation of VM placement algorithms in clouds

Author: Carretero Pérez Jesús
Gomez Rodriguez Mario A.
González José Luis
Gómez Rodríguez Mario A.
Sosa-Sosa Victor J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/01/2020

A complex and important task in cloud resource management is the efficient allocation of virtual machines (VMs), or containers, to physical machines (PMs). The evaluation of VM placement techniques in real-world clouds can be tedious, complex and time-consuming. This situation has motivated an increasing use of cloud simulators that facilitate this type of evaluation. However, most of the reported simulation-based VM placement techniques have been evaluated taking into account one specific cloud resource (e.g., CPU), whereas often unrealistic values are assumed for other resources (e.g., RAM, awaiting times, application workloads, etc.). This situation generates uncertainty, discouraging their implementation in real-world clouds. This paper introduces CloudBench, a methodology to facilitate the evaluation and deployment of VM placement strategies in private clouds. CloudBench considers the integration of a cloud simulator with a real-world private cloud. Two main tools were developed to support this methodology: a specialized multi-resource cloud simulator (CloudBalanSim), which is in charge of evaluating VM placement techniques, and a distributed resource manager (Balancer), which deploys and tests in a real-world private cloud the best VM placement configurations that satisfied the user requirements defined in the simulator. Both tools generate feedback information from the evaluation scenarios and their results, which is used as a learning asset to carry out intelligent and faster evaluations. The experiments implemented with the CloudBench methodology showed encouraging results as a new strategy to evaluate and deploy VM placement algorithms in the cloud. This work was partially funded by the Spanish Ministry of Economy, Industry and Competitiveness under the Grant TIN2016-79637-P “Towards Unification of HPC and Big Data Paradigms” and by the Mexican Council of Science and Technology (CONACYT) through a Ph.D. Grant (No. 212677).

Universidad Carlos III de Madrid e-Archivo

Clustering Algorithms for Scale-free Networks and Applications to Cloud Resource Management

Author: Marinescu Dan C.
Paya Ashkan
Publication date: 14/05/2013

In this paper we introduce algorithms for the construction of scale-free networks and for clustering around the nerve centers, nodes with high connectivity in a scale-free network. We argue that such overlay networks could support self-organization in a complex system like a cloud computing infrastructure and allow the implementation of optimal resource management policies. Comment: 14 pages, 8 Figures, Journa

arXiv.org e-Print Archive

CiteSeerX

Learning a goal-oriented model for energy efficient adaptive applications in data centers

Author: Barbara Pernici
Monica Vitali
Una-May O’Reilly
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

This work has been motivated by the growing demand for energy coming from the IT sector. We propose a goal-oriented approach where the state of the system is assessed using a set of indicators. These indicators are evaluated against thresholds that are used as the goals of our system. We propose a self-adaptive context-aware framework, where we learn both the relations existing between the indicators and the effect of the available actions on the state of the indicators. The system is also able to respond to changes in the environment, keeping these relations updated to the current situation. Results have shown that the proposed methodology is able to create a network of relations between indicators and to propose an effective set of repair actions to counteract suboptimal states of the data center. The proposed framework is an important tool for assisting the system administrator in the management of a data center oriented towards Energy Efficiency (EE), showing the administrator the connections between the sometimes contrasting goals of the system and suggesting the repair action(s) most likely to improve the system state, in terms of both EE and QoS.
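As an illustration of evaluating indicators against thresholds used as goals, here is a minimal sketch. The indicator names, the threshold ranges, and the readings are hypothetical; the abstract does not list the indicators the authors use, and the learning of relations between indicators is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """An acceptable range for one indicator (bounds are assumed values)."""
    low: float
    high: float

    def satisfied(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical goals and current readings for a data center.
goals = {
    "cpu_utilization": Goal(0.4, 0.8),
    "response_time_ms": Goal(0.0, 200.0),
    "power_watts": Goal(0.0, 350.0),
}
readings = {
    "cpu_utilization": 0.25,
    "response_time_ms": 120.0,
    "power_watts": 410.0,
}


def violated_goals(goals, readings):
    """Return the names of indicators whose readings fall outside their goal range."""
    return sorted(
        name for name, goal in goals.items() if not goal.satisfied(readings[name])
    )


print(violated_goals(goals, readings))
```

In a framework like the one described, the set of violated goals would be the input for choosing a repair action; the selection step is left out because the abstract does not specify it.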

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref


Failure Prediction using Machine Learning in a Virtualised HPC System and application

Author: Awan Irfan U.
Bashir Mohammed
Muhammad Y.
Ugail Hassan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/03/2019

Failure is an increasingly important issue in high performance computing and cloud systems. As large-scale systems continue to grow in scale and complexity, mitigating the impact of failure and providing accurate predictions with sufficient lead time remains a challenging research problem. Traditional fault-tolerance strategies such as regular check-pointing and replication are not adequate because of the emerging complexities of high performance computing systems. This makes it important to have an effective and proactive failure management approach in place, aimed at minimizing the effect of failure within the system. With the advent of machine learning techniques, the ability to learn from past information to predict future patterns of behaviour makes it possible to predict potential system failure more accurately. Thus, in this paper, we explore the predictive abilities of machine learning by applying a number of algorithms to improve the accuracy of failure prediction. We have developed a failure prediction model using time series and machine learning, and performed comparison-based tests on the prediction accuracy. The primary algorithms we considered are the Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbors (KNN), Classification and Regression Trees (CART) and Linear Discriminant Analysis (LDA). Experimental results indicate that the average prediction accuracy of our model using SVM when predicting failure is 90%, and that it is accurate and effective compared to the other algorithms. This finding implies that our method can effectively predict all possible future system and application failures within the system. Petroleum Technology Development Fund (PTDF) funding support under the OSS scheme with grant number (PTDF/E/OSS/PHD/MB/651/14

Bradford Scholars

Failure prediction using machine learning in a virtualised HPC system and application

Author: Awan Irfan
Mohammed Bashir
Ugail Hassan
Younas Muhammad
Publication date: 01/01/2019

Failure is an increasingly important issue in high performance computing and cloud systems. As large-scale systems continue to grow in scale and complexity, mitigating the impact of failure and providing accurate predictions with sufficient lead time remains a challenging research problem. Traditional fault-tolerance strategies such as regular check-pointing and replication are not adequate because of the emerging complexities of high performance computing systems. This makes it important to have an effective and proactive failure management approach in place, aimed at minimizing the effect of failure within the system. With the advent of machine learning techniques, the ability to learn from past information to predict future patterns of behaviour makes it possible to predict potential system failure more accurately. Thus, in this paper, we explore the predictive abilities of machine learning by applying a number of algorithms to improve the accuracy of failure prediction. We have developed a failure prediction model using time series and machine learning, and performed comparison-based tests on the prediction accuracy. The primary algorithms we considered are the support vector machine (SVM), random forest (RF), k-nearest neighbors (KNN), classification and regression trees (CART) and linear discriminant analysis (LDA). Experimental results indicate that the average prediction accuracy of our model using SVM when predicting failure is 90%, and that it is accurate and effective compared to the other algorithms. This finding implies that our method can effectively predict all possible future system and application failures within the system.

Oxford Brookes University: RADAR

CLOUD RESOURCE MANAGEMENT USING A HIERARCHICAL DECENTRALIZED FRAMEWORK

Author: Hummaida Abdul
Publication date: 31/12/2022

The University of Manchester - Institutional Repository

Autonomic management of virtualized resources in cloud computing

Author: Rao Jia
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2011

The last five years have witnessed a rapid growth of cloud computing in business, governmental and educational IT deployment. The success of cloud services depends critically on the effective management of virtualized resources. A key requirement of cloud management is the ability to dynamically match resource allocations to actual demands. To this end, we aim to design and implement a cloud resource management mechanism that manages underlying complexity, automates resource provisioning and controls client-perceived quality of service (QoS) while still achieving resource efficiency. The design of automatic resource management centers on two questions: when to adjust resource allocations and how much to adjust. In a cloud, applications have different definitions of capacity, and cloud dynamics make it difficult to determine a static resource-to-performance relationship. In this dissertation, we have proposed a generic metric that measures application capacity, designed model-independent and adaptive approaches to manage resources, and built a cloud management system scalable to a cluster of machines. To understand web system capacity, we propose to use a metric of productivity index (PI), which is defined as the ratio of yield to cost, to measure the system processing capability online. PI is a generic concept that can be applied at different levels to monitor system progress in order to identify whether more capacity is needed. We applied the concept of PI to the problem of overload prevention in multi-tier websites. The overload predictor built on the PI metric shows more accurate and responsive overload prevention compared to conventional approaches. To address the lack of an accurate server model, we propose a model-independent fuzzy control based approach for CPU allocation. For adaptive and stable control performance, we embed the controller with self-tuning output amplification and flexible rule selection. 
Finally, we build a QoS provisioning framework that supports multi-objective QoS control and service differentiation. Experiments on a virtual cluster with two service classes show the effectiveness of our approach in both performance and power control. To address the problems of complex interplay between resources and process delays in fine-grained multi-resource allocation, we consider capacity management as a decision-making problem and employ reinforcement learning (RL) to optimize the process. The optimization depends on trial-and-error interactions with the cloud system. In order to improve the initial management performance, we propose a model-based RL algorithm. The neural network based environment model, which is learned from previous management history, generates simulated resource allocations for the RL agent. Experimental results on heterogeneous applications show that our approach makes efficient use of limited interactions and finds near-optimal resource configurations within 7 steps. Finally, we present a distributed reinforcement learning approach to cluster-wide cloud resource management. We decompose the cluster-wide resource allocation problem into sub-problems concerning individual VM resource configurations. The cluster-wide allocation is optimized if individual VMs meet their SLA with a high resource utilization. For scalability, we develop an efficient reinforcement learning approach with continuous state space. For adaptability, we use VM low-level runtime statistics to accommodate workload dynamics. Prototyped in an iBalloon system, the distributed learning approach successfully manages 128 VMs on a 16-node closely correlated cluster.
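The productivity index (PI) is defined in the abstract as the ratio of yield to cost. The sketch below only illustrates that ratio and one possible way to act on it; what counts as yield and cost, the sample values, and the 0.7 drop fraction are assumptions for illustration, not the dissertation's definitions.

```python
def productivity_index(yield_value: float, cost: float) -> float:
    """PI = yield / cost, as defined in the abstract."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return yield_value / cost


def needs_capacity(pi_values, fraction=0.7):
    """Flag a capacity shortage when the latest PI drops below an assumed
    fraction of the best PI observed so far."""
    return pi_values[-1] < fraction * max(pi_values)


# Hypothetical samples: (completed requests per second, CPU-seconds used per second).
samples = [(100, 1.0), (190, 2.0), (240, 3.0), (250, 4.0)]
pi_values = [productivity_index(y, c) for y, c in samples]

print(pi_values)
print(needs_capacity(pi_values))
```

A falling PI under rising cost suggests the system is doing less useful work per unit of resource, which is the kind of signal an overload predictor could use; the actual predictor in the dissertation is not described in this listing.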

Digital Commons@Wayne State University