
    The Advantage of Genetic Algorithm in Energy-Efficient Scheduling for Heterogeneous Cloud Computing

    Energy consumption has become a heavy burden on enterprise cloud computing infrastructure. This paper focuses on the hardware factors in energy consumption. Inspired by DVFS, it proposes a new energy-efficient (EE) model, formulates the corresponding scheduling problem, and applies a genetic algorithm to obtain higher efficiency values. Simulations verify the advantage of the genetic algorithm. In addition, the robustness of our strategy is validated by varying the relevant parameters of the experiments.
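    The abstract does not spell out the paper's EE model or GA configuration, so the minimal Python sketch below only illustrates the general technique it names: a genetic algorithm searching task-to-machine assignments against an energy objective, with a hypothetical per-task energy matrix standing in for the heterogeneous cluster.

        # Illustrative sketch only: the energy matrix and GA operators are
        # hypothetical, not the paper's EE model.
        import random

        N_TASKS, N_MACHINES = 20, 4
        random.seed(1)
        # energy[t][m]: assumed energy cost of running task t on machine m
        energy = [[random.uniform(1.0, 10.0) for _ in range(N_MACHINES)]
                  for _ in range(N_TASKS)]

        def fitness(assign):
            # Lower total energy -> higher fitness
            return -sum(energy[t][m] for t, m in enumerate(assign))

        def crossover(a, b):
            cut = random.randrange(1, N_TASKS)
            return a[:cut] + b[cut:]

        def mutate(assign, rate=0.05):
            return [random.randrange(N_MACHINES) if random.random() < rate else m
                    for m in assign]

        pop = [[random.randrange(N_MACHINES) for _ in range(N_TASKS)]
               for _ in range(50)]
        for _ in range(200):
            pop.sort(key=fitness, reverse=True)
            elite = pop[:10]  # keep the best schedules of this generation
            pop = elite + [mutate(crossover(*random.sample(elite, 2)))
                           for _ in range(40)]
        print("lowest total energy found:", -fitness(max(pop, key=fitness)))

    Keeping an elite fraction each generation preserves the best schedules found so far while crossover and mutation keep exploring the assignment space.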

    On Understanding the Energy Impact of Speculative Execution in Hadoop

    Hadoop has emerged as an important system for large-scale data analysis. Speculative execution is a key feature in Hadoop that is extensively leveraged in clouds: it is used to mask slow tasks (i.e., stragglers), which result from resource contention and heterogeneity in clouds, by launching speculative task copies on other machines. However, speculative execution is not cost-free and may result in performance degradation and extra resource and energy consumption. While prior literature has been dedicated to improving straggler detection to cope with the inevitable heterogeneity in clouds, little work has focused on understanding the implications of speculative execution for the performance and energy consumption of Hadoop clusters. In this paper, we design a set of experiments to evaluate the impact of speculative execution on the performance and energy consumption of Hadoop in homogeneous and heterogeneous environments. Our studies reveal that speculative execution may sometimes reduce and sometimes increase the energy consumption of Hadoop clusters. This strongly depends on the reduction in the execution time of MapReduce applications and on the extra power consumption introduced by speculative execution. Moreover, we show that the extra power consumption varies between applications and is driven by three main factors: the duration of speculative tasks, the idle time, and the allocation of speculative tasks. To the best of our knowledge, our work provides the first deep look into the energy efficiency of speculative execution in Hadoop.
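    The trade-off the authors describe can be captured in a back-of-envelope model: speculation saves energy only when the cluster-level energy saved by finishing earlier outweighs the extra draw of the speculative copies and the idle time they introduce. A minimal sketch with purely hypothetical numbers:

        # Back-of-envelope model of the trade-off above; all figures are
        # invented for illustration, not measurements from the paper.
        def net_energy_delta(p_cluster_w, t_no_spec_s, t_spec_s,
                             p_spec_extra_w, t_spec_tasks_s, p_idle_w, t_idle_s):
            # Energy saved by finishing earlier at cluster-level power
            saved = p_cluster_w * (t_no_spec_s - t_spec_s)
            # Extra energy: running speculative copies plus idling
            extra = p_spec_extra_w * t_spec_tasks_s + p_idle_w * t_idle_s
            return extra - saved  # positive -> speculation costs more energy

        # Here speculation pays off: 60 s saved at 400 W outweighs the extra draw.
        print(net_energy_delta(400, 600, 540, 80, 120, 50, 30))  # -12900 J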

    A Game Theory Approach to Fair and Efficient Resource Allocation in Cloud Computing

    On-demand resource management is a key characteristic of cloud computing. Cloud providers should support computational resource sharing in a fair way, so that no user gets much better resources than others. Another goal is to improve resource utilization by minimizing resource fragmentation when mapping virtual machines to physical servers. The focus of this paper is a game-theoretic resource allocation algorithm that considers both fairness among users and resource utilization. Experiments with a FUGA implementation on an 8-node server cluster show the optimality of the algorithm in maintaining fairness, in comparison with an evaluation of the Hadoop scheduler. Simulations based on a Google workload trace demonstrate that the algorithm reduces resource wastage and achieves a better resource utilization rate than other allocation mechanisms.
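    The abstract does not describe FUGA in enough detail to reproduce it, but the fairness goal it targets can be illustrated with the standard max-min fairness baseline, under which no user's allocation can be increased without decreasing that of a user with an equal or smaller allocation. A generic sketch:

        # Generic max-min fair allocation via progressive filling; a standard
        # fairness baseline, not a reconstruction of the FUGA algorithm.
        def max_min_fair(capacity, demands):
            alloc = {u: 0.0 for u in demands}
            active = set(demands)
            remaining = capacity
            while active and remaining > 1e-9:
                share = remaining / len(active)  # equal share of what is left
                for u in sorted(active, key=lambda u: demands[u]):
                    give = min(share, demands[u] - alloc[u])
                    alloc[u] += give
                    remaining -= give
                    if alloc[u] >= demands[u] - 1e-9:
                        active.discard(u)  # demand satisfied
            return alloc

        print(max_min_fair(10.0, {"a": 2.0, "b": 4.0, "c": 9.0}))
        # a gets 2, b gets 4, c gets the remaining 4: no user is starved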

    Energy Efficiency and I/O Performance of Low-Power Architectures

    This paper presents an energy efficiency and I/O performance analysis of low-power architectures compared to conventional architectures, with the goal of studying their viability as storage servers. Our results show that, although the power demand of the storage device accounts for only a small fraction of the power demand of the whole system, significant increases in power demand are observed when accessing the storage device. We investigate the impact of the access pattern on power demand, looking at the whole system and at the storage device by itself, and compare all tested configurations regarding energy efficiency. We then extrapolate the conclusions from this research into guidelines for deciding when to replace traditional storage servers with low-power alternatives. We show that the choice depends on the expected workload, estimates of the power demand of the systems, and the factors limiting performance. These guidelines can also be applied to architectures other than the ones used in this work.
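    The comparison above hinges on an efficiency metric of work done per unit of energy. The sketch below contrasts two invented configurations by throughput per watt, which is equivalent to data moved per joule; the numbers are illustrative, not the paper's measurements.

        # Hypothetical storage-server configurations, not the paper's data.
        configs = {
            # name: (sequential throughput in MB/s, average power demand in W)
            "conventional server": (180.0, 95.0),
            "low-power board":     (35.0, 6.5),
        }
        for name, (mbps, watts) in configs.items():
            # MB/s divided by W (= J/s) gives MB per joule
            print(f"{name}: {mbps / watts:.2f} MB per joule")

    Under these made-up numbers the low-power board wins on efficiency, but, as the paper argues, the choice still depends on whether its absolute performance meets the expected workload.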

    Scheduling in MapReduce Clusters

    MapReduce is a framework proposed by Google for processing huge amounts of data in a distributed environment. The simplicity of the programming model and the fault-tolerance feature of the framework make it very popular for Big Data processing. As MapReduce clusters gain popularity, their scheduling becomes increasingly important. On the one hand, many MapReduce applications have high performance requirements, for example, on response time and/or throughput. On the other hand, with the increasing size of MapReduce clusters, energy-efficient scheduling becomes essential. These scheduling challenges, however, have not been systematically studied. The objective of this dissertation is to provide MapReduce applications with low cost and energy consumption through the development of scheduling theory and algorithms, energy models, and energy-aware resource management. In particular, we investigate energy-efficient scheduling in hybrid CPU-GPU MapReduce clusters. This research is expected to produce a breakthrough in Big Data processing, particularly in providing green computing to Big Data applications such as social network analysis, medical care data mining, and financial fraud detection. The tools we propose to develop are expected to increase utilization and reduce energy consumption for MapReduce clusters. In this PhD dissertation, we propose to address the aforementioned challenges by investigating and developing 1) a matchmaking scheduling algorithm for improving the data locality of MapReduce applications, 2) a real-time scheduling algorithm for heterogeneous MapReduce clusters, and 3) an energy-efficient scheduler for hybrid CPU-GPU MapReduce clusters. Advisers: Ying Lu and David Swanson.
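    As a small illustration of the data-locality idea behind a matchmaking-style scheduler (the dissertation's actual algorithm is not reproduced here), a node requesting work is preferentially handed a task whose input block it already stores:

        # Locality-aware task pickup: prefer a task with a local input block.
        # The block map and node names are hypothetical.
        def pick_task(node, pending_tasks, block_locations):
            for task in pending_tasks:
                if node in block_locations[task]:
                    return task, "local"  # input is on this node
            # Fallback: any remaining task, read over the network
            return (pending_tasks[0], "remote") if pending_tasks else (None, None)

        block_locations = {"t1": {"n1", "n2"}, "t2": {"n3"}, "t3": {"n2"}}
        print(pick_task("n2", ["t1", "t2", "t3"], block_locations))  # ('t1', 'local')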

    Performance optimization and energy efficiency of big-data computing workflows

    Next-generation e-science is producing colossal amounts of data, now frequently termed Big Data, on the order of terabytes at present and petabytes or even exabytes in the foreseeable future. These scientific applications typically feature data-intensive workflows composed of moldable parallel computing jobs, such as MapReduce, with intricate inter-job dependencies. The granularity of task partitioning in each moldable job of such big data workflows has a significant impact on workflow completion time, energy consumption, and financial cost if executed in clouds, and remains largely unexplored. This dissertation conducts an in-depth investigation into the properties of moldable jobs and provides an experiment-based validation of the performance model, in which the total workload of a moldable job increases with the degree of parallelism. Furthermore, it conducts rigorous research on workflow execution dynamics in resource-sharing environments and explores the interactions between workflow mapping and task scheduling on various computing platforms. A workflow optimization architecture is developed to seamlessly integrate three interrelated technical components: resource allocation, job mapping, and task scheduling.

    Cloud computing provides a cost-effective computing platform for big data workflows, where moldable parallel computing models are widely applied to meet stringent performance requirements. Based on the moldable parallel computing performance model, a big-data workflow mapping model is constructed and a workflow mapping problem is formulated to minimize workflow makespan under a budget constraint in public clouds. This dissertation shows the problem to be strongly NP-complete and designs i) a fully polynomial-time approximation scheme for a special case with a pipeline-structured workflow executed on virtual machines of a single class, and ii) a heuristic for the generalized problem with an arbitrary directed acyclic graph-structured workflow executed on virtual machines of multiple classes. The performance superiority of the proposed solution is illustrated by extensive simulation-based results in Hadoop/YARN in comparison with existing workflow mapping models and algorithms.

    Considering that large-scale workflows for big data analytics have become a main consumer of energy in data centers, this dissertation also delves into the problem of static workflow mapping to minimize the dynamic energy consumption of a workflow request under a deadline constraint in Hadoop clusters, which is shown to be strongly NP-hard. A fully polynomial-time approximation scheme is designed for a special case with a pipeline-structured workflow on a homogeneous cluster, and a heuristic is designed for the generalized problem with an arbitrary directed acyclic graph-structured workflow on a heterogeneous cluster. The problem is further extended to a dynamic version with deadline-constrained MapReduce workflows to minimize dynamic energy consumption in Hadoop clusters. The dissertation proposes a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and develops the corresponding system modules for algorithm implementation in the Hadoop ecosystem. The performance superiority of the proposed solutions in terms of dynamic energy saving and deadline miss rate is illustrated by extensive simulation results in comparison with existing algorithms, and further validated through real-life workflow implementation and experiments using the Oozie workflow engine in Hadoop/YARN systems.
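    To make the budget-constrained mapping problem concrete, the toy sketch below maps a pipeline-structured workflow by starting every stage on the cheapest VM class and greedily upgrading the current bottleneck stage while the budget allows. It mirrors only the problem setting; it is not the dissertation's approximation scheme or heuristic, and all speeds, costs, and the budget are invented.

        # Toy budget-constrained mapping of a 3-stage pipeline workflow.
        # vm_classes: (speed factor, cost per stage); all values hypothetical.
        vm_classes = [(1.0, 1.0), (2.0, 2.5), (4.0, 6.0)]
        base_work = [8.0, 20.0, 12.0]  # work units per stage
        budget = 12.0

        choice = [0] * len(base_work)  # VM class index chosen for each stage
        def cost():  return sum(vm_classes[c][1] for c in choice)
        def times(): return [w / vm_classes[c][0] for w, c in zip(base_work, choice)]

        while True:
            stage = max(range(len(base_work)), key=lambda i: times()[i])  # bottleneck
            c = choice[stage]
            if c + 1 >= len(vm_classes):
                break  # bottleneck already on the fastest class
            if cost() + vm_classes[c + 1][1] - vm_classes[c][1] > budget:
                break  # upgrade would exceed the budget
            choice[stage] = c + 1
        print("stage times:", times(), "makespan:", sum(times()), "cost:", cost())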

    Energy Saving in QoS Fog-supported Data Centers

    One of the most important challenges that cloud providers face amid the explosive growth of data is reducing the energy consumption of their modern data centers. The majority of current research focuses on energy-efficient resource management in the Infrastructure-as-a-Service (IaaS) model through resource virtualization, that is, the consolidation of virtual machines onto physical machines. However, today's virtualized data centers do not support communication- and computing-intensive real-time applications such as big data stream computing (e.g., info-mobility applications and real-time video co-decoding). Indeed, imposing hard limits on the overall per-job computing-plus-communication delay forces the overall networked computing infrastructure to quickly adapt its resource utilization to the (possibly unpredictable and abrupt) time fluctuations of the offered workload. Recently, Fog Computing centers have emerged as promising commodities of the Internet virtual computing platform, but they raise energy consumption into a critical issue on such platforms. Green solutions (i.e., energy-aware provisioning) that cover fog-supported delay-sensitive web applications are therefore needed. Moreover, traffic-engineering-based methods can dynamically adjust the number of active servers to match the current workload. It is thus desirable to develop a flexible, reliable technological paradigm and a resource allocation algorithm that pay attention to the consumed energy. Such algorithms should automatically adapt themselves to time-varying workloads and support joint reconfiguration and orchestration of the virtualized computing-plus-communication resources available at the computing nodes, while allowing Internet-of-Things devices to operate under real-time constraints on the allowed computing-plus-communication delay and service latency.

    The purpose of this thesis is: i) to propose a novel technological paradigm, the Fog of Everything (FoE) paradigm, detailing the main building blocks and services of the corresponding technological platform and protocol stack; ii) to propose a dynamic and adaptive energy-aware algorithm that models and manages virtualized networked data center Fog Nodes (FNs), so as to minimize the resulting networking-plus-computing average energy consumption; and iii) to propose a novel Software-as-a-Service (SaaS) Fog Computing platform to integrate user applications over the FoE. The emerging use of SaaS Fog Computing centers as an Internet virtual computing commodity is meant to support delay-sensitive applications. The main blocks of the virtualized Fog node operate at the Middleware layer of the underlying protocol stack and comprise: i) admission control of the offered input traffic; ii) balanced control and dispatching of the admitted workload; iii) dynamic reconfiguration and consolidation of the Dynamic Voltage and Frequency Scaling (DVFS)-enabled Virtual Machines (VMs) instantiated on the parallel computing platform; and iv) rate control of the traffic injected into the TCP/IP connection.

    The salient features of this algorithm are that: i) it is adaptive and admits a distributed, scalable implementation; ii) it provides hard QoS guarantees, in terms of the minimum/maximum instantaneous rate of the traffic delivered to the client, instantaneous goodput, and total processing delay; and iii) it explicitly accounts for the dynamic interaction between computing and networking resources in order to maximize the resulting energy efficiency. The actual performance of the proposed scheduler in the presence of: i) client mobility; ii) wireless fading; iii) reconfiguration and two-threshold consolidation costs of the underlying networked computing platform; and iv) abrupt changes in the transport quality of the available TCP/IP mobile connection is numerically tested and compared against state-of-the-art static schedulers, under both synthetically generated and measured real-world workload traces.
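    One building block mentioned above, two-threshold consolidation, can be sketched as a hysteresis rule on utilization: scale out above a high threshold, consolidate below a low one, and hold steady in between so the active-server count does not oscillate. The thresholds and loads are hypothetical:

        # Two-threshold (hysteresis) consolidation sketch; thresholds assumed.
        def adjust_active_servers(active, total, utilization, low=0.3, high=0.8):
            if utilization > high and active < total:
                return active + 1  # scale out before QoS degrades
            if utilization < low and active > 1:
                return active - 1  # consolidate to save idle energy
            return active          # inside the hysteresis band: hold

        servers = 4
        for load in [0.85, 0.90, 0.50, 0.25, 0.20]:
            servers = adjust_active_servers(servers, 10, load)
            print(f"load={load:.2f} -> active servers={servers}")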