
    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters, and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to consider when evaluating a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials.

    A greedy heuristic approach for the project scheduling with labour allocation problem

    Responding to the growing need for robust project schedules, in this article we present a greedy algorithm to generate the project baseline schedule. The robustness is achieved by integrating two dimensions of human resource flexibility. The first is the operators' polyvalence, i.e. each operator has one or more secondary skills besides his principal one, his mastery level being characterized by a factor we call “efficiency”. The second refers to working time modulation, i.e. the workers have a flexible timetable that may vary on a daily or weekly basis while respecting an annualized working-hours strategy. Moreover, the activity processing time is a non-increasing function of the number of workers allocated to it and of their heterogeneous working efficiencies. This modelling approach leads to a nonlinear optimization model with mixed variables. We present the problem under study, the greedy algorithm used to solve it, and results compared with those obtained with genetic algorithms.
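
    A minimal illustrative sketch (not the authors' algorithm) of a serial greedy rule in this spirit, assuming each activity needs a single skill and its duration shrinks with the summed efficiency of the assigned crew; working-time modulation and resource contention over time are ignored here, and all names and numbers are invented:

        from dataclasses import dataclass, field

        @dataclass
        class Worker:
            name: str
            efficiency: dict  # per-skill efficiency; a secondary skill has efficiency < 1.0

        @dataclass
        class Activity:
            name: str
            skill: str
            base_work: float                       # person-days at efficiency 1.0
            predecessors: list = field(default_factory=list)

        def greedy_schedule(activities, workers, crew_size=2):
            """Serial greedy heuristic: schedule activities in precedence order,
            picking the crew with the highest total efficiency for the needed skill.
            Duration = base_work / summed crew efficiency, i.e. a non-increasing
            function of the number (and efficiency) of assigned workers."""
            finish, schedule, remaining = {}, [], list(activities)
            while remaining:
                # pick any activity whose predecessors are all finished
                act = next(a for a in remaining if all(p in finish for p in a.predecessors))
                remaining.remove(act)
                earliest = max((finish[p] for p in act.predecessors), default=0.0)
                crew = sorted(workers, key=lambda w: w.efficiency.get(act.skill, 0.0),
                              reverse=True)[:crew_size]
                total_eff = sum(w.efficiency.get(act.skill, 0.0) for w in crew) or 1e-9
                finish[act.name] = earliest + act.base_work / total_eff
                schedule.append((act.name, [w.name for w in crew], earliest, finish[act.name]))
            return schedule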

    Proposed Energy Aware Scheduling Algorithm in Data Center by using Map Reduce

    The majority of large-scale data intensive applications executed by data centers are based on MapReduce or its open-source implementation, Hadoop. Such applications are executed on large clusters requiring large amounts of energy, making the energy costs a considerable fraction of the data center's overall costs. Therefore, minimizing the energy consumption when executing each MapReduce job is a critical concern for data centers. We propose a framework for improving the energy efficiency of MapReduce applications, while satisfying the service level agreement (SLA). We first model the problem of energy-aware scheduling of a single MapReduce job as an Integer Program. We then propose two heuristic algorithms, called Energy-aware MapReduce Scheduling Algorithms (EMRSA-I and EMRSA-II), that find the assignments of map and reduce tasks to the machine slots in order to minimize the energy consumed when executing the application. We perform extensive experiments on a Hadoop cluster to determine the energy consumption and execution time for several workloads from the HiBench benchmark suite including TeraSort, PageRank, and K-means Clustering, and then use this data in an extensive simulation study to analyze the performance of the proposed algorithms. The results show that EMRSA-I and EMRSA-II are able to find near-optimal job schedules consuming approximately 40% less energy on average than the schedules obtained by a common-practice scheduler that minimizes the makespan.
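
    A toy integer program in the spirit of energy-aware map/reduce task placement (not EMRSA itself): assign each task to exactly one machine slot, bound each slot's load as a crude SLA proxy, and minimize total energy. All energy and runtime numbers below are made up, and PuLP is used only as a convenient solver interface:

        from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

        tasks = ["m1", "m2", "m3", "r1"]                      # map and reduce tasks
        slots = ["s1", "s2"]                                  # machine slots
        energy = {("m1","s1"): 3, ("m1","s2"): 5, ("m2","s1"): 4, ("m2","s2"): 2,
                  ("m3","s1"): 6, ("m3","s2"): 4, ("r1","s1"): 8, ("r1","s2"): 7}
        runtime = {k: v * 1.5 for k, v in energy.items()}     # fake runtime estimates
        deadline = 20.0

        prob = LpProblem("energy_aware_placement", LpMinimize)
        x = {(t, s): LpVariable(f"x_{t}_{s}", cat=LpBinary) for t in tasks for s in slots}

        prob += lpSum(energy[t, s] * x[t, s] for t in tasks for s in slots)  # total energy
        for t in tasks:                                   # each task placed on exactly one slot
            prob += lpSum(x[t, s] for s in slots) == 1
        for s in slots:                                   # per-slot load bound (SLA proxy)
            prob += lpSum(runtime[t, s] * x[t, s] for t in tasks) <= deadline

        prob.solve()
        placement = {t: s for (t, s), var in x.items() if var.value() == 1}
        print(placement)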

    Contributions to Edge Computing

    Efforts related to Internet of Things (IoT), Cyber-Physical Systems (CPS), Machine to Machine (M2M) technologies, Industrial Internet, and Smart Cities aim to improve society through the coordination of distributed devices and analysis of resulting data. By the year 2020, there will be an estimated 50 billion network-connected devices globally and 43 trillion gigabytes of electronic data. Current practices of moving data directly from end-devices to remote and potentially distant cloud computing services will not be sufficient to manage future device and data growth. Edge Computing is the migration of computational functionality to sources of data generation. The importance of edge computing increases with the size and complexity of devices and resulting data. In addition, the coordination of global edge-to-edge communications, shared resources, high-level application scheduling, monitoring, measurement, and Quality of Service (QoS) enforcement will be critical to address the rapid growth of connected devices and associated data. We present a new distributed agent-based framework designed to address the challenges of edge computing. This actor-model framework implementation is designed to manage large numbers of geographically distributed services, composed of heterogeneous resources and communication protocols, in support of low-latency real-time streaming applications. As part of this framework, an application description language was developed and implemented. Using the application description language, a number of high-order management modules were implemented, including solutions for resource and workload comparison, performance observation, scheduling, and provisioning. A number of hypothetical and real-world use cases are described to support the framework implementation.
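
    As a purely hypothetical illustration of the kind of declarative application description and latency-aware placement such a framework manages (the actual description language is not reproduced here; all stage, node, and resource names are invented):

        # Hypothetical, simplified application description for an edge pipeline:
        # each stage names the resources it needs and a latency bound; a placement
        # routine greedily maps stages to the closest node that satisfies the needs.
        pipeline = {
            "name": "camera-analytics",
            "stages": [
                {"id": "capture",   "needs": {"camera": True},   "max_latency_ms": 10},
                {"id": "detect",    "needs": {"gpu": True},      "max_latency_ms": 50},
                {"id": "aggregate", "needs": {"cpu_cores": 2},   "max_latency_ms": 500},
            ],
        }
        nodes = [
            {"id": "edge-gw-1",  "latency_ms": 5,   "offers": {"camera": True, "cpu_cores": 4}},
            {"id": "edge-gpu-1", "latency_ms": 20,  "offers": {"gpu": True, "cpu_cores": 8}},
            {"id": "cloud-1",    "latency_ms": 120, "offers": {"gpu": True, "cpu_cores": 64}},
        ]

        def satisfies(offers, needs):
            for key, want in needs.items():
                have = offers.get(key, 0)
                if want is True and not have:
                    return False
                if isinstance(want, (int, float)) and have < want:
                    return False
            return True

        def place(pipeline, nodes):
            placement = {}
            for stage in pipeline["stages"]:
                candidates = [n for n in nodes
                              if satisfies(n["offers"], stage["needs"])
                              and n["latency_ms"] <= stage["max_latency_ms"]]
                if not candidates:
                    raise RuntimeError(f"no node satisfies stage {stage['id']}")
                placement[stage["id"]] = min(candidates, key=lambda n: n["latency_ms"])["id"]
            return placement

        print(place(pipeline, nodes))   # {'capture': 'edge-gw-1', 'detect': 'edge-gpu-1', ...}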

    Scheduling Periodical Multi-Stage Jobs With Fuzziness to Elastic Cloud Resources

    We investigate a workflow scheduling problem with stochastic task arrival times and fuzzy task processing times and due dates. The problem is common in many real-time and workflow-based applications, where tasks with a fixed number of stages and linear dependencies are executed on scalable cloud resources with multiple price options. The challenges lie in proposing effective, stable, and robust algorithms under stochastic and fuzzy tasks. A triangular fuzzy number-based model is formulated, and two metrics are explored: the cost and the degree of satisfaction. An iterated heuristic framework is proposed to periodically schedule tasks, consisting of a task collection phase and a fuzzy task scheduling phase. Two task collection strategies are presented and two task prioritization strategies are employed. In order to achieve a high satisfaction degree, deadline constraints are defined at both the job and task levels. Carefully designed experiments and statistical analysis show that the proposed algorithm is more effective and robust than the two existing methods. Zhu, J.; Li, X.; Ruiz García, R.; Li, W.; Huang, H.; Zomaya, A. Y. (2020). Scheduling Periodical Multi-Stage Jobs With Fuzziness to Elastic Cloud Resources. IEEE Transactions on Parallel and Distributed Systems, 31(12), 2819-2833. https://doi.org/10.1109/TPDS.2020.3004134
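
    As a rough illustration (under simplifying assumptions, not the paper's formulation), triangular fuzzy processing times can be summed along a job's linear stages and compared against a due date to obtain a crude satisfaction degree:

        from dataclasses import dataclass

        @dataclass
        class TFN:
            """Triangular fuzzy number (a, b, c): optimistic, most likely, pessimistic."""
            a: float
            b: float
            c: float

            def __add__(self, other):
                # addition of triangular fuzzy numbers is component-wise
                return TFN(self.a + other.a, self.b + other.b, self.c + other.c)

        def completion_time(stage_times):
            """Fuzzy completion time of a linear multi-stage job: sum of its stage times."""
            total = TFN(0.0, 0.0, 0.0)
            for t in stage_times:
                total = total + t
            return total

        def satisfaction_degree(finish: TFN, due_date: float) -> float:
            """Crude satisfaction proxy: fraction of the fuzzy finish-time support
            lying at or before the due date (1.0 = surely on time, 0.0 = surely late)."""
            if due_date >= finish.c:
                return 1.0
            if due_date <= finish.a:
                return 0.0
            return (due_date - finish.a) / (finish.c - finish.a)

        job = [TFN(2, 3, 5), TFN(1, 2, 4), TFN(3, 4, 6)]    # three stages, fuzzy durations
        finish = completion_time(job)                        # TFN(6, 9, 15)
        print(satisfaction_degree(finish, due_date=12.0))    # ~0.67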

    A Graph-Partition-Based Scheduling Policy for Heterogeneous Architectures

    In order to improve system performance efficiently, a number of systems are equipped with multi-core and many-core processors (such as GPUs). Due to their discrete memories, these heterogeneous architectures comprise a distributed system within a computer. A data-flow programming model is attractive in this setting for its ease of expressing concurrency: programmers only need to define task dependencies without considering how to schedule them on the hardware. However, mapping the resulting task graph onto hardware efficiently remains a challenge. In this paper, we propose a graph-partition scheduling policy for mapping data-flow workloads to heterogeneous hardware. According to our experiments, our graph-partition-based scheduling achieves performance comparable to conventional queue-based approaches. Comment: Presented at the DATE Friday Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015) (arXiv:1502.07241).
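
    A toy sketch of the general idea (not the paper's policy): partition a data-flow task graph across a CPU and a GPU so that heavily-communicating tasks stay on the same device. Edge weights, task names, and the GPU bias below are invented for illustration:

        # Greedy two-way partition of a data-flow graph onto {"cpu", "gpu"}:
        # each task goes to the device that minimizes communication across the cut,
        # given where its already-placed neighbours live.
        edges = {                       # (producer, consumer) -> data volume exchanged
            ("load", "decode"): 10,
            ("decode", "filter"): 50,
            ("decode", "detect"): 50,
            ("filter", "merge"): 20,
            ("detect", "merge"): 20,
            ("merge", "store"): 5,
        }
        tasks = ["load", "decode", "filter", "detect", "merge", "store"]
        gpu_friendly = {"filter", "detect"}          # tasks with a GPU implementation

        def partition(tasks, edges):
            placement = {}
            for t in tasks:                          # tasks visited in topological order
                cost = {"cpu": 0, "gpu": 0}
                for (u, v), w in edges.items():
                    other = v if u == t else u if v == t else None
                    if other in placement:
                        # placing t on a different device than a neighbour costs w
                        for dev in cost:
                            if placement[other] != dev:
                                cost[dev] += w
                if t in gpu_friendly:
                    cost["cpu"] += 100               # bias GPU-friendly kernels to the GPU
                placement[t] = min(cost, key=cost.get)
            return placement

        print(partition(tasks, edges))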

    Challenges in real-time virtualization and predictable cloud computing

    Cloud computing and virtualization technology have revolutionized general-purpose computing applications in the past decade. The cloud paradigm offers advantages through reduction of operation costs, server consolidation, flexible system configuration and elastic resource provisioning. However, despite the success of cloud computing for general-purpose computing, existing cloud computing and virtualization technology face tremendous challenges in supporting emerging soft real-time applications such as online video streaming, cloud-based gaming, and telecommunication management. These applications demand real-time performance in open, shared and virtualized computing environments. This paper identifies the technical challenges in supporting real-time applications in the cloud, surveys recent advances in real-time virtualization and cloud computing technology, and offers research directions to enable cloud-based real-time applications in the future.

    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.
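
    A minimal sketch of the underlying idea, far simpler than Decima (which uses graph neural networks over job DAGs and handles streaming arrivals): a softmax policy over hand-picked job features, trained with REINFORCE on a toy single-server queue to reduce average completion time. All features and hyperparameters are illustrative assumptions:

        import numpy as np

        rng = np.random.default_rng(0)

        def episode(theta, n_jobs=8):
            """One toy episode: jobs with random sizes wait in a queue; the policy
            repeatedly picks which job to run next on a single server. Returns the
            per-step (features, action, probs) and the average completion time."""
            sizes = rng.uniform(1.0, 10.0, n_jobs)
            remaining = list(range(n_jobs))
            t, completions, trajectory = 0.0, [], []
            while remaining:
                feats = np.array([[sizes[j], len(remaining)] for j in remaining])
                logits = feats @ theta
                probs = np.exp(logits - logits.max()); probs /= probs.sum()
                k = rng.choice(len(remaining), p=probs)
                trajectory.append((feats, k, probs))
                t += sizes[remaining.pop(k)]
                completions.append(t)
            return trajectory, float(np.mean(completions))

        def train(steps=2000, lr=1e-3):
            theta, baseline = np.zeros(2), None
            for _ in range(steps):
                traj, avg_ct = episode(theta)
                baseline = avg_ct if baseline is None else 0.95 * baseline + 0.05 * avg_ct
                advantage = baseline - avg_ct        # lower completion time => positive advantage
                for feats, k, probs in traj:
                    grad = feats[k] - probs @ feats  # gradient of log-softmax w.r.t. theta
                    theta += lr * advantage * grad   # REINFORCE update
            return theta

        theta = train()
        print(theta)   # a negative weight on job size recovers shortest-job-first behaviour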