1,149 research outputs found

    Energy-Aware Cloud Management through Progressive SLA Specification

    Full text link
    Novel energy-aware cloud management methods dynamically reallocate computation across geographically distributed data centers to leverage regional electricity price and temperature differences. As a result, a managed VM may suffer occasional downtimes. Current cloud providers only offer high availability VMs, without enough flexibility to apply such energy-aware management. In this paper we show how to analyse past traces of dynamic cloud management actions based on electricity prices and temperatures to estimate VM availability and price values. We propose a novel SLA specification approach for offering VMs with different availability and price values guaranteed over multiple SLAs to enable flexible energy-aware cloud management. We determine the optimal number of such SLAs as well as their availability and price guaranteed values. We evaluate our approach in a user SLA selection simulation using Wikipedia and Grid'5000 workloads. The results show higher customer conversion and 39% average energy savings per VM.Comment: 14 pages, conferenc

    SLA Management in Intent-Driven Service Management Systems: A Taxonomy and Future Directions

    Full text link
    Traditionally, network and system administrators are responsible for designing, configuring, and resolving the Internet service requests. Human-driven system configuration and management are proving unsatisfactory due to the recent interest in time-sensitive applications with stringent quality of service (QoS). Aiming to transition from the traditional human-driven to zero-touch service management in the field of networks and computing, intent-driven service management (IDSM) has been proposed as a response to stringent quality of service requirements. In IDSM, users express their service requirements in a declarative manner as intents. IDSM, with the help of closed control-loop operations, perform configurations and deployments, autonomously to meet service request requirements. The result is a faster deployment of Internet services and reduction in configuration errors caused by manual operations, which in turn reduces the service-level agreement (SLA) violations. In the early stages of development, IDSM systems require attention from industry as well as academia. In an attempt to fill the gaps in current research, we conducted a systematic literature review of SLA management in IDSM systems. As an outcome, we have identified four IDSM intent management activities and proposed a taxonomy for each activity. Analysis of all studies and future research directions, are presented in the conclusions.Comment: Extended version of the preprint submitted at ACM Computing Surveys (CSUR

    Progressive introduction of network softwarization in operational telecom networks: advances at architectural, service and transport levels

    Get PDF
    Technological paradigms such as Software Defined Networking, Network Function Virtualization and Network Slicing are altogether offering new ways of providing services. This process is widely known as Network Softwarization, where traditional operational networks adopt capabilities and mechanisms inherit form the computing world, such as programmability, virtualization and multi-tenancy. This adoption brings a number of challenges, both from the technological and operational perspectives. On the other hand, they provide an unprecedented flexibility opening opportunities to developing new services and new ways of exploiting and consuming telecom networks. This Thesis first overviews the implications of the progressive introduction of network softwarization in operational networks for later on detail some advances at different levels, namely architectural, service and transport levels. It is done through specific exemplary use cases and evolution scenarios, with the goal of illustrating both new possibilities and existing gaps for the ongoing transition towards an advanced future mode of operation. This is performed from the perspective of a telecom operator, paying special attention on how to integrate all these paradigms into operational networks for assisting on their evolution targeting new, more sophisticated service demands.Programa de Doctorado en Ingeniería Telemática por la Universidad Carlos III de MadridPresidente: Eduardo Juan Jacob Taquet.- Secretario: Francisco Valera Pintor.- Vocal: Jorge López Vizcaín

    START: Straggler Prediction and Mitigation for Cloud Computing Environments using Encoder LSTM Networks

    Get PDF
    A common performance problem in large-scale cloud systems is dealing with straggler tasks that are slow running instances which increase the overall response time. Such tasks impact the system's QoS and the SLA. There is a need for automatic straggler detection and mitigation mechanisms that execute jobs without violating the SLA. Prior work typically builds reactive models that focus first on detection and then mitigation of straggler tasks, which leads to delays. Other works use prediction based proactive mechanisms, but ignore volatile task characteristics. We propose a Straggler Prediction and Mitigation Technique (START) that is able to predict which tasks might be stragglers and dynamically adapt scheduling to achieve lower response times. START analyzes all tasks and hosts based on compute and network resource consumption using an Encoder LSTM network to predict and mitigate expected straggler tasks. This reduces the SLA violation rate and execution time without compromising QoS. Specifically, we use the CloudSim toolkit to simulate START and compare it with IGRU-SD, SGC, Dolly, GRASS, NearestFit and Wrangler in terms of QoS parameters. Experiments show that START reduces execution time, resource contention, energy and SLA violations by 13%, 11%, 16%, 19%, compared to the state-of-the-art

    Scalable and Distributed Resource Management Protocols for Cloud and Big Data Clusters

    Get PDF
    Cloud data centers require an operating system to manage resources and satisfy operational requirements and management objectives. The growth of popularity in cloud services causes the appearance of a new spectrum of services with sophisticated workload and resource management requirements. Also, data centers are growing by addition of various type of hardware to accommodate the ever-increasing requests of users. Nowadays a large percentage of cloud resources are executing data-intensive applications which need continuously changing workload fluctuations and specific resource management. To this end, cluster computing frameworks are shifting towards distributed resource management for better scalability and faster decision making. Such systems benefit from the parallelization of control and are resilient to failures. Throughout this thesis we investigate algorithms, protocols and techniques to address these challenges in large-scale data centers. We introduce a distributed resource management framework which consolidates virtual machine to as few servers as possible to reduce the energy consumption of data center and hence decrease the cost of cloud providers. This framework can characterize the workload of virtual machines and hence handle trade-off energy consumption and Service Level Agreement (SLA) of customers efficiently. The algorithm is highly scalable and requires low maintenance cost with dynamic workloads and it tries to minimize virtual machines migration costs. We also introduce a scalable and distributed probe-based scheduling algorithm for Big data analytics frameworks. This algorithm can efficiently address the problem job heterogeneity in workloads that has appeared after increasing the level of parallelism in jobs. The algorithm is massively scalable and can reduce significantly average job completion times in comparison with the-state of-the-art. Finally, we propose a probabilistic fault-tolerance technique as part of the scheduling algorithm

    Pertanika Journal of Science & Technology

    Get PDF

    Pertanika Journal of Science & Technology

    Get PDF
    corecore