
    START: Straggler Prediction and Mitigation for Cloud Computing Environments using Encoder LSTM Networks

    A common performance problem in large-scale cloud systems is dealing with straggler tasks: slow-running instances that increase the overall response time. Such tasks impact the system's Quality of Service (QoS) and its Service Level Agreement (SLA). There is a need for automatic straggler detection and mitigation mechanisms that execute jobs without violating the SLA. Prior work typically builds reactive models that focus first on detection and only then on mitigation of straggler tasks, which introduces delays. Other works use prediction-based proactive mechanisms but ignore volatile task characteristics. We propose a Straggler Prediction and Mitigation Technique (START) that predicts which tasks might become stragglers and dynamically adapts scheduling to achieve lower response times. START analyzes all tasks and hosts based on compute and network resource consumption, using an Encoder LSTM network to predict and mitigate expected straggler tasks. This reduces the SLA violation rate and execution time without compromising QoS. Specifically, we use the CloudSim toolkit to simulate START and compare it with IGRU-SD, SGC, Dolly, GRASS, NearestFit and Wrangler in terms of QoS parameters. Experiments show that START reduces execution time, resource contention, energy consumption and SLA violations by 13%, 11%, 16% and 19%, respectively, compared to the state of the art.
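    As a rough illustration of the prediction step described above, the sketch below scores tasks as likely stragglers from a window of per-task resource-usage measurements using an encoder LSTM. The network shape, feature set, and 0.7 threshold are illustrative assumptions, not the authors' implementation.

        # Minimal sketch: encoder LSTM mapping a task's recent resource-usage
        # time series (CPU, memory, network) to a straggler probability.
        # Layer sizes and the decision threshold are assumptions.
        import torch
        import torch.nn as nn

        class StragglerEncoder(nn.Module):
            def __init__(self, n_features=3, hidden=32):
                super().__init__()
                self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
                self.head = nn.Linear(hidden, 1)          # straggler logit

            def forward(self, x):                         # x: (batch, time, features)
                _, (h_n, _) = self.encoder(x)             # final hidden state summarizes the window
                return torch.sigmoid(self.head(h_n[-1]))  # probability in [0, 1]

        model = StragglerEncoder()
        window = torch.rand(8, 20, 3)                     # 8 tasks, 20 time steps, 3 metrics
        p = model(window).squeeze(1)
        likely = (p > 0.7).nonzero()                      # candidates for proactive rescheduling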

    Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments

    This chapter presents the software architectures of big data processing platforms. It provides in-depth knowledge of the resource management techniques involved in deploying big data processing systems on cloud environments. It starts from the very basics and gradually introduces the core components of resource management, which we have divided into multiple layers. It covers state-of-the-art practices and research in SLA-based resource management, with a specific focus on job scheduling mechanisms. Comment: 27 pages, 9 figures
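    As a toy illustration of one job scheduling mechanism of the kind the chapter surveys, the sketch below orders pending jobs by SLA deadline (earliest-deadline-first) and reports violations; the job fields and the policy itself are assumptions, not content from the chapter.

        # Toy SLA-aware scheduler: earliest-deadline-first over pending jobs.
        # Job structure and the EDF policy are illustrative assumptions.
        import heapq
        from dataclasses import dataclass, field

        @dataclass(order=True)
        class Job:
            deadline: float                    # SLA deadline, seconds from now
            name: str = field(compare=False)
            runtime: float = field(compare=False, default=0.0)

        def schedule(jobs):
            heapq.heapify(jobs)                # min-heap keyed on SLA deadline
            clock = 0.0
            while jobs:
                job = heapq.heappop(jobs)
                clock += job.runtime           # run the job to completion
                status = "met" if clock <= job.deadline else "VIOLATED"
                print(f"{job.name}: done at {clock:.0f}s, SLA {status}")

        schedule([Job(60, "etl", 30), Job(45, "query", 10), Job(200, "train", 120)])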

    Low SLA violation and Low Energy consumption using VM Consolidation in Green Cloud Data Centers

    Virtual Machine (VM) consolidation is an efficient way to conserve energy in cloud data centers. The VM consolidation technique migrates VMs onto a smaller number of active Physical Machines (PMs), so that PMs hosting no VMs can be put into a sleep state. Consolidation reduces the energy consumption of cloud data centers because a PM in the sleep state consumes far less energy than an active one. However, because VMs share the underlying physical resources, aggressive consolidation can lead to performance degradation. Furthermore, an application may encounter an unexpected spike in resource demand, which may lead to increased response times or even failures. Before providing cloud services, cloud providers sign Service Level Agreements (SLAs) with customers, so providing reliable Quality of Service (QoS) is an important consideration for providers. To strike a tradeoff between energy and performance, we consider minimizing energy consumption on the premise of meeting the SLA.

    One of the optimization challenges is to decide which VMs to migrate, when to migrate them, where to migrate them, and when and which servers to turn on or off. To achieve this goal optimally, it is important to predict future host states accurately and to plan VM migrations based on the prediction. For example, if a host will be overloaded at the next time unit, some VMs should be migrated off it to keep it from overloading; if a host will be underloaded at the next time unit, all of its VMs should be migrated away so that the host can be turned off to save power. The design goal of the controller is to balance server energy consumption against application performance. Because of the heterogeneity of cloud resources and the variety of applications in the cloud environment, the workload on hosts changes dynamically over time, so accurate workload prediction models are essential for effective resource management and allocation.

    A disadvantage of existing VM consolidation processes in cloud data centers is that they concentrate only on primitive system characteristics such as CPU utilization, memory, and the number of active hosts. By using these characteristics as the decisive factors in their models and approaches, they ignore the discrepancy in performance-to-power efficiency between heterogeneous infrastructures. This leads to unreasonable consolidation that may cause a redundant number of VM migrations and waste energy. Advanced artificial intelligence techniques such as reinforcement learning can learn a management strategy without prior knowledge, which enables us to design a model-free resource allocation control system. For example, VM consolidation decisions could be predicted with artificial intelligence rather than based only on current resource utilization.
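    The overload/underload logic described above can be made concrete with a small sketch: forecast each host's next-step utilization, drain predicted-underloaded hosts so they can sleep, and shed VMs from predicted-overloaded hosts. The exponential-smoothing forecast and the thresholds are stand-in assumptions for the learned prediction models the abstract discusses.

        # Sketch of a predictive consolidation pass. Exponential smoothing
        # and the 0.85/0.25 thresholds are illustrative assumptions.
        OVER, UNDER, ALPHA = 0.85, 0.25, 0.5

        def predict(history, alpha=ALPHA):
            """One-step-ahead load forecast via exponential smoothing."""
            level = history[0]
            for u in history[1:]:
                level = alpha * u + (1 - alpha) * level
            return level

        def plan_migrations(hosts):
            """hosts: {name: {'history': [...], 'vms': {vm: load}}} -> [(vm, source)]"""
            migrations = []
            for name, h in hosts.items():
                load = predict(h['history'])
                if load < UNDER:                           # drain host, then sleep it
                    migrations += [(vm, name) for vm in h['vms']]
                elif load > OVER:                          # shed smallest VMs until safe
                    for vm, l in sorted(h['vms'].items(), key=lambda kv: kv[1]):
                        if load <= OVER:
                            break
                        migrations.append((vm, name))
                        load -= l
            return migrations

        hosts = {'h1': {'history': [0.9, 0.95, 0.92], 'vms': {'a': 0.4, 'b': 0.3, 'c': 0.25}},
                 'h2': {'history': [0.20, 0.15, 0.10], 'vms': {'d': 0.1}}}
        print(plan_migrations(hosts))                      # [('c', 'h1'), ('d', 'h2')]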

    Managing contamination delay to improve Timing Speculation architectures

    Timing Speculation (TS) is a widely known method for realizing better-than-worst-case systems. Aggressive clocking, made realizable by TS, enables systems to operate beyond their specified safe frequency limits and effectively exploit data-dependent circuit delay. However, the range of aggressive clocking available for performance enhancement under TS is restricted by short paths. In this paper, we show that increasing the lengths of a circuit's short paths increases the effectiveness of TS, leading to performance improvement. We also propose an algorithm that efficiently adds delay buffers to selected short paths while keeping the area penalty down. We present results for the ISCAS-85 benchmark suite and show that it is possible to increase the circuit contamination delay by up to 30% without affecting the propagation delay. We also explore the possibility of increasing short-path delays further by relaxing the constraint on propagation delay, and we analyze the performance impact.
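    A hedged sketch of the core idea: pad short paths with delay buffers until each approaches a contamination-delay target, without letting any padded path exceed the propagation (critical-path) delay. Treating paths independently and the 0.1 ns buffer delay are simplifying assumptions; the paper's algorithm works on the shared netlist to limit the area penalty.

        # Pad short paths toward a contamination-delay target while never
        # exceeding the propagation delay. Independent paths are assumed.
        BUFFER_DELAY = 0.1                     # ns per inserted buffer (assumed)

        def pad_short_paths(path_delays, target):
            t_prop = max(path_delays)          # propagation delay stays fixed
            padded, buffers = [], 0
            for d in path_delays:
                while d < target and d + BUFFER_DELAY <= min(target, t_prop):
                    d += BUFFER_DELAY          # insert one buffer on this path
                    buffers += 1
                padded.append(round(d, 3))
            return padded, buffers

        delays = [0.4, 0.7, 1.2, 2.0]          # ns; 2.0 ns is the critical path
        padded, n = pad_short_paths(delays, target=1.0)
        print(padded, "contamination delay:", min(padded), "buffers:", n)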

    Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures

    One of the significant shifts in next-generation computing technologies will certainly be in the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD landmark, has evolved into a widely deployed BD operating system. Its new features include a federation structure and many associated frameworks, which give Hadoop 3.x the maturity to serve different markets. This dissertation addresses two leading issues involved in exploiting BD and large-scale data analytics on the Hadoop platform: (i) scalability, which directly affects system performance and overall throughput, addressed using portable Docker containers; and (ii) security, which spreads the adoption of data protection practices among practitioners, addressed using access controls. An Enhanced MapReduce Environment (EME), an OPportunistic and Elastic Resource Allocation (OPERA) scheduler, a BD Federation Access Broker (BDFAB), and a Secure Intelligent Transportation System (SITS) with a multi-tier architecture for streaming data to the cloud are the main contributions of this thesis.

    Pricing the Cloud: An Auction Approach

    Cloud computing has changed the processing and service modes of information and communication technology and has affected the transformation, upgrading, and innovation of IT-related industry systems. The rapid development of cloud computing in business practice has spawned a whole new interdisciplinary field, providing opportunities and challenges for business management research. One of the critical factors impacting cloud computing is how to price cloud services: an appropriate pricing strategy has important practical implications for stakeholders, especially providers and customers. This study reviews research findings on cloud computing pricing strategies, such as fixed pricing, bidding pricing, and dynamic pricing. Another key factor for cloud computing is Quality of Service (QoS), including availability, reliability, latency, security, throughput, capacity, scalability, and elasticity. Cloud providers seek to improve QoS to attract more potential customers, while customers intend to find QoS-matching services that do not exceed their budget constraints. Building on the existing literature, a hybrid QoS-based pricing mechanism, consisting of a subscription component and a dynamic auction design, is proposed and illustrated for cloud services. The results indicate that this hybrid pricing mechanism has the potential to better allocate available cloud resources, increasing revenues for providers and reducing expenses for customers in practice.
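    As a toy illustration of the dynamic-auction half of such a hybrid scheme, the sketch below runs a sealed-bid uniform-price auction for k identical spare VM instances, with winners paying the highest losing bid; the rule and the numbers are assumptions, not the mechanism proposed in the study.

        # Sealed-bid uniform-price auction for k identical VM instances.
        # Winners pay the highest losing bid. All numbers are assumptions.
        def uniform_price_auction(bids, k):
            """bids: {customer: bid}; k instances -> (winners, clearing price)."""
            ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
            winners = [c for c, _ in ranked[:k]]
            price = ranked[k][1] if len(ranked) > k else 0.0
            return winners, price

        bids = {'alice': 0.90, 'bob': 0.60, 'carol': 0.75, 'dave': 0.50}
        winners, price = uniform_price_auction(bids, k=2)
        print(winners, "each pay", price)      # ['alice', 'carol'] each pay 0.6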

    User-centric workload analytics: Towards better cluster management

    Effective management of computing clusters and provision of high-quality customer support is not a trivial task. The rise of community clusters has increased the diversity of both workloads and the user demographic. Owing to this, and to user privacy concerns, it is difficult to identify performance issues, reduce resource wastage, and understand implicit user demands. In this thesis, we perform an in-depth analysis of user behavior, performance issues, resource usage patterns, and failures in workloads collected from a university-wide community cluster and two clusters maintained by a government lab. We also introduce a set of novel analysis techniques that can identify hidden patterns and diagnose performance issues. Based on our analysis, we provide concrete suggestions for cluster administrators and present case studies highlighting how such information can be used to proactively solve many user issues, ultimately leading to better quality of service.
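    A small sketch of one analysis of this kind: comparing requested versus actually used core-hours per user in job accounting records to flag over-requesting users. The record fields and the 50% threshold are illustrative assumptions, not the thesis's techniques.

        # Flag users whose jobs use under half of the core-hours they request.
        # Field names and the threshold are illustrative assumptions.
        from collections import defaultdict

        def wasteful_users(jobs, threshold=0.5):
            req, used = defaultdict(float), defaultdict(float)
            for j in jobs:
                req[j['user']] += j['cores_requested'] * j['hours']
                used[j['user']] += j['cores_used'] * j['hours']
            return {u: 1 - used[u] / req[u]    # fraction of core-hours wasted
                    for u in req if used[u] / req[u] < threshold}

        jobs = [{'user': 'u1', 'cores_requested': 16, 'cores_used': 2, 'hours': 10},
                {'user': 'u2', 'cores_requested': 4, 'cores_used': 4, 'hours': 5}]
        print(wasteful_users(jobs))            # {'u1': 0.875}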