1,903 research outputs found

    Stochastic Reward Net-based Modeling Approach for Availability Quantification of Data Center Systems

    Get PDF
    Availability quantification and prediction of IT infrastructure in data centers are of paramount importance for online business enterprises. In this chapter, we present comprehensive availability models for practical case studies in order to demonstrate a state-space stochastic reward net model for typical data center systems for quantitative assessment of system availability. We present stochastic reward net models of a virtualized server system, a data center network based on DCell topology, and a conceptual data center for disaster tolerance. The systems are then evaluated against various metrics of interest, including steady state availability, downtime and downtime cost, and sensitivity analysis

    Scheduling of data-intensive workloads in a brokered virtualized environment

    Full text link
    Providing performance predictability guarantees is increasingly important in cloud platforms, especially for data-intensive applications, for which performance depends greatly on the available rates of data transfer between the various computing/storage hosts underlying the virtualized resources assigned to the application. With the increased prevalence of brokerage services in cloud platforms, there is a need for resource management solutions that consider the brokered nature of these workloads, as well as the special demands of their intra-dependent components. In this paper, we present an offline mechanism for scheduling batches of brokered data-intensive workloads, which can be extended to an online setting. The objective of the mechanism is to decide on a packing of the workloads in a batch that minimizes the broker's incurred costs, Moreover, considering the brokered nature of such workloads, we define a payment model that provides incentives to these workloads to be scheduled as part of a batch, which we analyze theoretically. Finally, we evaluate the proposed scheduling algorithm, and exemplify the fairness of the payment model in practical settings via trace-based experiments

    Availability model for virtualized platforms

    Get PDF
    Network virtualization is a method of providing virtual instances of physical networks. Virtualized networks are widely used with virtualized servers, forming a powerful dynamically reconfigurable platform. In this paper we discuss the impact of network virtualization on the overall system availability. We describe a system reflecting the network architecture usually deployed in today’s data centres. The proposed system is modelled using Markov chains and fault trees. We compare the availability of virtualized system using standard physique network with the availability of virtualized system using virtualized network. Network virtualization introduces a new software layer to the network architecture. The proposed availability model integrates software failures in addition to the hardware failures. Based on the estimated numerical failure rates, we analyse system’s availability

    Modeling and analysis of high availability techniques in a virtualized system

    Get PDF
    Availability evaluation of a virtualized system is critical to the wide deployment of cloud computing services. Time-based, prediction-based rejuvenation of virtual machines (VM) and virtual machine monitors, VM failover and live VM migration are common high-availability (HA) techniques in a virtualized system. This paper investigates the effect of combination of these availability techniques on VM availability in a virtualized system where various software and hardware failures may occur. For each combination, we construct analytic models rejuvenation mechanisms to improve VM availability; (2) prediction-based rejuvenation enhances VM availability much more than time-based VM rejuvenation when prediction successful probability is above 70%, regardless failover and/or live VM migration is also deployed; (3) failover mechanism outperforms live VM migration, although they can work together for higher availability of VM. In addition, they can combine with software rejuvenation mechanisms for even higher availability; (4) and time interval setting is critical to a time-based rejuvenation mechanism. These analytic results provide guidelines for deploying and parameter setting of HA techniques in a virtualized system

    Technical Report on Deploying a highly secured OpenStack Cloud Infrastructure using BradStack as a Case Study

    Full text link
    Cloud computing has emerged as a popular paradigm and an attractive model for providing a reliable distributed computing model.it is increasing attracting huge attention both in academic research and industrial initiatives. Cloud deployments are paramount for institution and organizations of all scales. The availability of a flexible, free open source cloud platform designed with no propriety software and the ability of its integration with legacy systems and third-party applications are fundamental. Open stack is a free and opensource software released under the terms of Apache license with a fragmented and distributed architecture making it highly flexible. This project was initiated and aimed at designing a secured cloud infrastructure called BradStack, which is built on OpenStack in the Computing Laboratory at the University of Bradford. In this report, we present and discuss the steps required in deploying a secured BradStack Multi-node cloud infrastructure and conducting Penetration testing on OpenStack Services to validate the effectiveness of the security controls on the BradStack platform. This report serves as a practical guideline, focusing on security and practical infrastructure related issues. It also serves as a reference for institutions looking at the possibilities of implementing a secured cloud solution.Comment: 38 pages, 19 figures

    Straggler Root-Cause and Impact Analysis for Massive-scale Virtualized Cloud Datacenters

    Get PDF
    Increased complexity and scale of virtualized distributed systems has resulted in the manifestation of emergent phenomena substantially affecting overall system performance. This phenomena is known as “Long Tail”, whereby a small proportion of task stragglers significantly impede job completion time. While work focuses on straggler detection and mitigation, there is limited work that empirically studies straggler root-cause and quantifies its impact upon system operation. Such analysis is critical to ascertain in-depth knowledge of straggler occurrence for focusing developmental and research efforts towards solving the Long Tail challenge. This paper provides an empirical analysis of straggler root-cause within virtualized Cloud datacenters; we analyze two large-scale production systems to quantify the frequency and impact stragglers impose, and propose a method for conducting root-cause analysis. Results demonstrate approximately 5% of task stragglers impact 50% of total jobs for batch processes, and 53% of stragglers occur due to high server resource utilization. We leverage these findings to propose a method for extreme straggler detection through a combination of offline execution patterns modeling and online analytic agents to monitor tasks at runtime. Experiments show the approach is capable of detecting stragglers less than 11% into their execution lifecycle with 95% accuracy for short duration jobs
    corecore