3,187 research outputs found

A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing

Authors: Buyya Rajkumar, Tian Wenhong, Xu Minxian
Publication venue: 'Wiley'
Publication date: 08/02/2017

The emergence of cloud computing based on virtualization technologies brings huge opportunities to host virtual resources at low cost without the need to own any infrastructure. Virtualization technologies enable users to acquire and configure resources and to be charged on a pay-per-use basis. However, cloud data centers mostly comprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potentially varied specifications and fluctuating resource usage, which may cause imbalanced resource utilization within servers and, in turn, performance degradation and service level agreement (SLA) violations. To achieve efficient scheduling, these challenges should be addressed using load balancing strategies; the underlying load balancing problem has been proven to be NP-hard. From multiple perspectives, this work identifies the challenges and analyzes existing algorithms for allocating VMs to physical machines (PMs) in infrastructure clouds, focusing especially on load balancing. A detailed classification of load balancing algorithms for VM placement in cloud data centers is presented, and the surveyed algorithms are categorized accordingly. The goal of this paper is to provide a comprehensive and comparative understanding of the existing literature and to aid researchers by providing insight into potential future enhancements.

Comment: 22 Pages, 4 Figures, 4 Tables, in pres

arXiv.org e-Print Archive

University of Melbourne Institutional Repository
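
As a rough illustration of the kind of algorithm the survey above covers, the following is a minimal sketch of a greedy, load-balancing VM placement heuristic. It is not taken from the paper: the `PM` class, the single scalar resource, and the "largest VM first, least-utilized host" rule are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class PM:
    """Hypothetical physical machine with one scalar resource (e.g. CPU units)."""
    name: str
    capacity: float
    used: float = 0.0
    vms: list = field(default_factory=list)

    def utilization_after(self, demand: float) -> float:
        # Utilization this PM would have if it also hosted `demand`.
        return (self.used + demand) / self.capacity


def place_vms(vm_demands: dict, pms: list) -> dict:
    """Greedy heuristic: place the largest VMs first, each on the PM
    whose utilization would be lowest after the placement."""
    placement = {}
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        # Only consider PMs that still have room for this VM.
        candidates = [pm for pm in pms if pm.used + demand <= pm.capacity]
        if not candidates:
            raise ValueError(f"no PM can host VM {vm!r} (demand {demand})")
        target = min(candidates, key=lambda pm: pm.utilization_after(demand))
        target.used += demand
        target.vms.append(vm)
        placement[vm] = target.name
    return placement


if __name__ == "__main__":
    hosts = [PM("pm1", capacity=10), PM("pm2", capacity=20)]
    print(place_vms({"a": 6, "b": 3, "c": 2}, hosts))
```

Because the general problem is NP-hard, a heuristic like this only approximates a balanced placement; real schedulers typically also weigh several resource dimensions and migration cost.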

HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

Authors: Buyya Rajkumar, Calheiros Rodrigo N., Cunha Renato L. F., Netto Marco A. S., Rodrigues Eduardo R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show that hybrid environments are the natural path to get the best of on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources, and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper presents a survey and taxonomy of efforts in HPC cloud and a vision of what we believe is ahead of us, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-growing wave of new HPC applications coming from big data and artificial intelligence.

Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR

arXiv.org e-Print Archive

Western Sydney ResearchDirect

Fail Over Strategy for Fault Tolerance in Cloud Computing Environment

Authors: Agbaria, Alshareef, Amoon, Bala, Bertolli, Bilal, Bin, Bin Hong, Chen, Chen, Chtepen, Elliott, Fu, Greenberg, Jung, Kaur, Kim, Malik, Maloney, Nazari Cheraghlou, Okorafor, Pantic, Paul, Pei, Qiang, Salehi, Sen, Sheng, Singh, Singh, Siva Sathya, Sun
Publication venue: 'Wiley'
Publication date: 05/04/2017

Cloud fault tolerance is an important issue in cloud computing platforms and applications. In the event of an unexpected system failure or malfunction, a robust fault-tolerant design may allow the cloud to continue functioning correctly, possibly at a reduced level, instead of failing completely. To ensure high availability of critical cloud services, application execution and hardware performance, various fault-tolerant techniques exist for building self-autonomous cloud systems. In comparison to current approaches, this paper proposes a more robust and reliable architecture that uses an optimal checkpointing strategy to ensure high system availability and a reduced task service finish time. Using pass rates and virtualised mechanisms, the proposed Smart Failover Strategy (SFS) scheme uses components such as a Cloud fault manager, Cloud controller, Cloud load balancer and a selection mechanism, providing fault tolerance via redundancy, optimized selection and checkpointing. In our approach, the Cloud fault manager repairs faults generated before the task time deadline is reached, blocking unrecoverable faulty nodes as well as their virtual nodes. The scheme is also able to remove temporary software faults from recoverable faulty nodes, thereby making them available for future requests. We argue that the proposed SFS algorithm makes the system highly fault tolerant by considering forward and backward recovery using diverse software tools. Compared to existing approaches, preliminary experiments with the SFS algorithm indicate an increase in pass rates and a consequent decrease in failure rates, showing good overall performance in task allocation. We present these results using experimental validation tools with comparison to other techniques, laying a foundation for a fully fault-tolerant IaaS Cloud environment.

Crossref

eScholarship - University of California

Bradford Scholars
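
The abstract above relies on checkpointing and backward recovery. The following is a minimal sketch of that general idea, not the paper's SFS algorithm: the function name, the `fail_at` failure-injection set, and the fixed checkpoint interval are assumptions made for illustration.

```python
def run_with_checkpoints(steps, fail_at=frozenset(), checkpoint_every=2):
    """Run `steps` (a list of zero-argument callables) in order.

    Step indices in `fail_at` fail once (simulated transient fault). On a
    failure, execution rolls back to the last checkpoint (backward recovery)
    and resumes from there. Returns (results, attempts), where `attempts`
    counts every step attempt, including failed and repeated ones.
    """
    pending_failures = set(fail_at)
    state, saved_state = [], []   # live results and last checkpointed copy
    checkpoint = 0                # index of the first step after the checkpoint
    attempts = 0
    i = 0
    while i < len(steps):
        attempts += 1
        if i in pending_failures:
            # Simulated fault: discard work done since the last checkpoint.
            pending_failures.discard(i)
            state = list(saved_state)
            i = checkpoint
            continue
        state.append(steps[i]())
        i += 1
        if i % checkpoint_every == 0:
            # Persist progress so later faults do not redo these steps.
            checkpoint, saved_state = i, list(state)
    return state, attempts


if __name__ == "__main__":
    squares = [lambda k=k: k * k for k in range(5)]
    print(run_with_checkpoints(squares, fail_at={3}))
```

A shorter checkpoint interval reduces the work repeated after a fault but adds checkpointing overhead, which is the trade-off an "optimal checkpointing strategy" has to balance.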

Cloud engineering is search based software engineering too

Authors: Abadi, Afzal, Afzal, Afzal, Ali, Alshahwan, Arcuri, Armbrust, Barham, Barroso, Beckman, Beloglazov, Ben-Yehuda, Cadar, Calheiros, Carzaniga, Chang, Cheng, Cliff, Cohen, Cooper, Cornford, Darley, De Millo, DeCandia, Dijkstra, Durillo, Emberson, Fan, Fatiregun, Forrest, Fraser, Freitas, Guo, Harman, Harman, Harman, Harman, Harman, Harman, Harman, Harman, Harman, Harman, Harman, Hoare, Hoare, Hoare, Hoste, Jacobs, Jakobovic, Jakobović, Jia, Jones, Justafort, Kirkpatrick, Kliazovich, Koza, Lagar-cavilla, Lakhotia, Lakhotia, Langdon, Le Goues, Le Goues, Lee, Lutz, Madhavapeddy, McMinn, Mell, Mishra, Mitchell, Narzisi, Nurmi, Papazoglou, Rappa, Reese, Rogers, Ryan, Räihä, Silva, Sitthi-Amorn, Sotomayor, Srikantaiah, Stillwell, Viegas, Vishwanath, Voorsluys, Wegener, Weimer, White, White, Whitley, Williams, Yoo, Zhang, Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/09/2013

Many of the problems posed by the migration of computation to cloud platforms can be formulated and solved using techniques associated with Search Based Software Engineering (SBSE). Much of cloud software engineering involves problems of optimisation: performance, allocation, assignment and the dynamic balancing of resources to achieve pragmatic trade-offs between many competing technical and business objectives. SBSE is concerned with the application of computational search and optimisation to solve precisely these kinds of software engineering challenges. Interest in both cloud computing and SBSE has grown rapidly in the past five years, yet there has been little work on SBSE as a means of addressing cloud computing challenges. Like many computationally demanding activities, SBSE has the potential to benefit from the cloud: ‘SBSE in the cloud’. This paper focuses instead on the ways in which SBSE can benefit cloud computing. It thus develops the theme of ‘SBSE for the cloud’, formulating cloud computing challenges in ways that can be addressed using SBSE.

Elsevier - Publisher Connector

Crossref

UCL Discovery

Enlighten
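
To make "formulating a cloud problem as a search problem" concrete, here is a minimal sketch that treats VM-to-host assignment as an optimisation problem and applies hill climbing, one of the simplest search techniques. It is not from the paper: the fitness function (gap between the most and least loaded host), the single-move neighbourhood, and all names are assumptions.

```python
import random


def imbalance(assignment, demands, n_hosts):
    """Fitness to minimise: difference between the most and least loaded host."""
    loads = [0.0] * n_hosts
    for vm, host in enumerate(assignment):
        loads[host] += demands[vm]
    return max(loads) - min(loads)


def hill_climb(demands, n_hosts, iterations=1000, seed=0):
    """Hill climbing over assignments; `demands` is assumed to be non-empty.

    Starts from a random assignment and repeatedly moves one randomly chosen
    VM to a randomly chosen host, keeping the move if fitness does not worsen.
    """
    rng = random.Random(seed)
    current = [rng.randrange(n_hosts) for _ in demands]
    cost = imbalance(current, demands, n_hosts)
    for _ in range(iterations):
        neighbour = list(current)
        neighbour[rng.randrange(len(demands))] = rng.randrange(n_hosts)
        new_cost = imbalance(neighbour, demands, n_hosts)
        if new_cost <= cost:
            current, cost = neighbour, new_cost
    return current, cost


if __name__ == "__main__":
    assignment, cost = hill_climb([4, 3, 3, 2, 2, 2], n_hosts=2)
    print(assignment, cost)
```

Hill climbing can stall in local optima; the same representation and fitness function could be handed to other search techniques such as genetic algorithms or simulated annealing.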

Optimising Fault Tolerance in Real-time Cloud Computing IaaS Environment

Authors: Awan Irfan U., Kiran Mariam, Maiyama Kabiru M., Mohammed Bashir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2016

Fault tolerance is the ability of a system to respond swiftly to an unexpected failure. Failures in a cloud computing environment are normal rather than exceptional, but fault detection and system recovery in a real-time cloud system remain a crucial issue. To deal with this problem and to minimize the risk of failure, an optimal fault tolerance mechanism was introduced in which fault tolerance is achieved using a combination of the Cloud Master, Compute nodes, Cloud load balancer, Selection mechanism and Cloud Fault handler. In this paper, we propose an optimized fault tolerance approach in which a model is designed to tolerate faults based on the reliability of each compute node (virtual machine); a node can be replaced if its performance is not optimal. Preliminary tests of our algorithm indicate that the rate of increase in pass rate exceeds the decrease in failure rate; the algorithm also considers forward and backward recovery using diverse software tools. Our results are demonstrated through experimental validation and suggest good performance of our model compared to existing approaches, laying a foundation for a fully fault-tolerant IaaS Cloud environment.

Petroleum Technology Development Fund (PTDF

Crossref

Bradford Scholars
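
The abstract above describes selecting compute nodes by reliability and replacing those that underperform. The following is a minimal sketch of that idea only; the reward and penalty values, the threshold, and all names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical compute node (virtual machine) with a reliability score in [0, 1]."""
    name: str
    reliability: float = 1.0


def update_reliability(node, passed, reward=0.05, penalty=0.2):
    """Raise the score after a passed task, lower it after a failed one.

    The reward and penalty values are illustrative assumptions.
    """
    if passed:
        node.reliability = min(1.0, node.reliability + reward)
    else:
        node.reliability = max(0.0, node.reliability - penalty)


def select_node(nodes, min_reliability=0.5):
    """Return the most reliable node at or above the threshold, or None
    if every node has dropped below it and should be replaced."""
    healthy = [n for n in nodes if n.reliability >= min_reliability]
    if not healthy:
        return None
    return max(healthy, key=lambda n: n.reliability)


if __name__ == "__main__":
    nodes = [Node("vm-a"), Node("vm-b")]
    update_reliability(nodes[0], passed=False)
    print(select_node(nodes).name)
```

Penalising failures more heavily than rewarding passes makes a node lose eligibility quickly after repeated faults; whether that asymmetry is appropriate depends on the workload.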