3,378 research outputs found

    Proactive cloud management for highly heterogeneous multi-cloud infrastructures

    Get PDF
    Various literature studies demonstrated that the cloud computing paradigm can help to improve availability and performance of applications subject to the problem of software anomalies. Indeed, the cloud resource provisioning model enables users to rapidly access new processing resources, even distributed over different geographical regions, that can be promptly used in the case of, e.g., crashes or hangs of running machines, as well as to balance the load in the case of overloaded machines. Nevertheless, managing a complex geographically-distributed cloud deploy could be a complex and time-consuming task. Autonomic Cloud Manager (ACM) Framework is an autonomic framework for supporting proactive management of applications deployed over multiple cloud regions. It uses machine learning models to predict failures of virtual machines and to proactively redirect the load to healthy machines/cloud regions. In this paper, we study different policies to perform efficient proactive load balancing across cloud regions in order to mitigate the effect of software anomalies. These policies use predictions about the mean time to failure of virtual machines. We consider the case of heterogeneous cloud regions, i.e regions with different amount of resources, and we provide an experimental assessment of these policies in the context of ACM Framework

    Dependability Models for Designing Disaster Tolerant Cloud Computing Systems

    Get PDF
    Abstract—Hundreds of natural disasters occur in many parts of the world every year, causing billions of dollars in damages. This fact contrasts with the high availability requirement of cloud computing systems, and, to protect such systems from unforeseen catastrophe, a recovery plan requires the utilization of different data centers located far enough apart. However, the time to migrate a VM from a data center to another increases due to distance. This work presents dependability models for evaluating distributed cloud computing systems deployed into multiple data centers considering disaster occurrence. Additionally, we present a case study which evaluates several scenarios with different VM migration times and distances between data centers. Keywords-cloud computing; dependability evaluation; stochastic Petri nets; I

    Computing at massive scale: Scalability and dependability challenges

    Get PDF
    Large-scale Cloud systems and big data analytics frameworks are now widely used for practical services and applications. However, with the increase of data volume, together with the heterogeneity of workloads and resources, and the dynamic nature of massive user requests, the uncertainties and complexity of resource management and service provisioning increase dramatically, often resulting in poor resource utilization, vulnerable system dependability, and user-perceived performance degradations. In this paper we report our latest understanding of the current and future challenges in this particular area, and discuss both existing and potential solutions to the problems, especially those concerned with system efficiency, scalability and dependability. We first introduce a data-driven analysis methodology for characterizing the resource and workload patterns and tracing performance bottlenecks in a massive-scale distributed computing environment. We then examine and analyze several fundamental challenges and the solutions we are developing to tackle them, including for example incremental but decentralized resource scheduling, incremental messaging communication, rapid system failover, and request handling parallelism. We integrate these solutions with our data analysis methodology in order to establish an engineering approach that facilitates the optimization, tuning and verification of massive-scale distributed systems. We aim to develop and offer innovative methods and mechanisms for future computing platforms that will provide strong support for new big data and IoE (Internet of Everything) applications

    Redundant VoD Streaming Service in a Private Cloud: Availability Modeling and Sensitivity Analysis

    Get PDF
    For several years cloud computing has been generating considerable debate and interest within IT corporations. Since cloud computing environments provide storage and processing systems that are adaptable, efficient, and straightforward, thereby enabling rapid infrastructure modifications to be made according to constantly varying workloads, organizations of every size and type are migrating to web-based cloud supported solutions. Due to the advantages of the pay-per-use model and scalability factors, current video on demand (VoD) streaming services rely heavily on cloud infrastructures to offer a large variety of multimedia content. Recent well documented failure events in commercial VoD services have demonstrated the fundamental importance of maintaining high availability in cloud computing infrastructures, and hierarchical modeling has proved to be a useful tool for evaluating the availability of complex systems and services. This paper presents an availability model for a video streaming service deployed in a private cloud environment which includes redundancy mechanisms in the infrastructure. Differential sensitivity analysis was applied to identify and rank the critical components of the system with respect to service availability. The results demonstrate that such a modeling strategy combined with differential sensitivity analysis can be an attractive methodology for identifying which components should be supported with redundancy in order to consciously increase system dependability

    Models to evaluate service Provisioning over Cloud Computing Environments - A Blockchain-As-A-Service case study

    Get PDF
    ThestrictnessoftheServiceLevelAgreements(SLAs)ismainlyduetoasetofconstraintsrelated to performance and dependability attributes, such as availability. This paper shows that system’s availability values may be improved by deploying services over a private environment, which may obtain better availability values with improved management, security, and control. However, how much a company needs to afford to keep this improved availability? As an additional activity, this paper compares the obtained availability values with the infrastructure deployment expenses and establishes a cost × benefit relationship. As for the system’s evaluation technique, we choose modeling; while for the service used to demonstrate the models’ feasibility, the blockchain-as-a-service was the selected one. This paper proposes and evaluate four different infrastructures hosting blockchains: (i) baseline; (ii) double redundant; (iii) triple redundant, and (iv) hyper-converged. The obtained results pointed out that the hyper-converged architecture had an advantage over a full triple redundant environment regarding availability and deployment cost
    • …
    corecore