41 research outputs found

    Weighted Scheduling of Time-Sensitive Coflows

    Full text link
    Datacenter networks commonly facilitate the transmission of data in distributed computing frameworks through coflows, which are collections of parallel flows associated with a common task. Most of the existing research has concentrated on scheduling coflows to minimize the time required for their completion, i.e., to optimize the average dispatch rate of coflows in the network fabric. Nevertheless, modern applications often produce coflows that are specifically intended for online services and mission-crucial computational tasks, necessitating adherence to specific deadlines for their completion. In this paper, we introduce \wdcoflow,~ a new algorithm to maximize the weighted number of coflows that complete before their deadline. By combining a dynamic programming algorithm along with parallel inequalities, our heuristic solution performs at once coflow admission control and coflow prioritization, imposing a σ\sigma-order on the set of coflows. With extensive simulation, we demonstrate the effectiveness of our algorithm in improving up to 3×3\times more coflows that meet their deadline in comparison the best SoA solution, namely CS-MHA\mathtt{CS\text{-}MHA}. Furthermore, when weights are used to differentiate coflow classes, \wdcoflow~ is able to improve the admission per class up to 4×4\times, while increasing the average weighted coflow admission rate.Comment: Submitted to IEEE Transactions on Cloud Computing. Parts of this work have been presented at IFIP Networking 202

    Flow-time Optimization for Concurrent Open-Shop and Precedence Constrained Scheduling Models

    Get PDF
    Scheduling a set of jobs over a collection of machines is a fundamental problem that needs to be solved millions of times a day in various computing platforms: in operating systems, in large data clusters, and in data centers. Along with makespan, flow-time, which measures the length of time a job spends in a system before it completes, is arguably the most important metric to measure the performance of a scheduling algorithm. In recent years, there has been a remarkable progress in understanding flow-time based objective functions in diverse settings such as unrelated machines scheduling, broadcast scheduling, multi-dimensional scheduling, to name a few. Yet, our understanding of the flow-time objective is limited mostly to the scenarios where jobs have no dependencies. On the other hand, in almost all real world applications, think of MapReduce settings for example, jobs have dependencies that need to be respected while making scheduling decisions. In this paper, we take first steps towards understanding this complex problem. In particular, we consider two classical scheduling problems that capture dependencies across jobs: 1) concurrent open-shop scheduling (COSSP) and 2) precedence constrained scheduling. Our main motivation to study these problems specifically comes from their relevance to two scheduling problems that have gained importance in the context of data centers: co-flow scheduling and DAG scheduling. We design almost optimal approximation algorithms for COSSP and PCSP, and show hardness results

    Fair Coflow Scheduling via Controlled Slowdown

    Get PDF
    The average coflow completion time (CCT) is the standard performance metric in coflow scheduling. However, standard CCT minimization may introduce unfairness between the data transfer phase of different computing jobs. Thus, while progress guarantees have been introduced in the literature to mitigate this fairness issue, the trade-off between fairness and efficiency of data transfer is hard to control. This paper introduces a fairness framework for coflow scheduling based on the concept of slowdown, i.e., the performance loss of a coflow compared to isolation. By controlling the slowdown it is possible to enforce a target coflow progress while minimizing the average CCT. In the proposed framework, the minimum slowdown for a batch of coflows can be determined in polynomial time. By showing the equivalence with Gaussian elimination, slowdown constraints are introduced into primal-dual iterations of the CoFair algorithm. The algorithm extends the class of the σ-order schedulers to solve the fair coflow scheduling problem in polynomial time. It provides a 4-approximation of the average CCT w.r.t. an optimal scheduler. Extensive numerical results demonstrate that this approach can trade off average CCT for slowdown more efficiently than existing state of the art schedulers

    Network flow optimization for distributed clouds

    Get PDF
    Internet applications, which rely on large-scale networked environments such as data centers for their back-end support, are often geo-distributed and typically have stringent performance constraints. The interconnecting networks, within and across data centers, are critical in determining these applications' performance. Data centers can be viewed as composed of three layers: physical infrastructure consisting of servers, switches, and links, control platforms that manage the underlying resources, and applications that run on the infrastructure. This dissertation shows that network flow optimization can improve performance of distributed applications in the cloud by designing high-throughput schemes spanning all three layers. At the physical infrastructure layer, we devise a framework for measuring and understanding throughput of network topologies. We develop a heuristic for estimating the worst-case performance of any topology and propose a systematic methodology for comparing performance of networks built with different equipment. At the control layer, we put forward a source-routed data center fabric which can achieve near-optimal throughput performance by leveraging a large number of available paths while using limited memory in switches. At the application layer, we show that current Application Network Interfaces (ANIs), abstractions that translate an application's performance goals to actionable network objectives, fail to capture the requirements of many emerging applications. We put forward a novel ANI that can capture application intent more effectively and quantify performance gains achievable with it. We also tackle resource optimization in the inter-data center context of cellular providers. In this emerging environment, a large amount of resources are geographically fragmented across thousands of micro data centers, each with a limited share of resources, necessitating cross-application optimization to satisfy diverse performance requirements and improve network and server utilization. Our solution, Patronus, employs hierarchical optimization for handling multiple performance requirements and temporally partitioned scheduling for scalability

    Saba: Rethinking Datacenter Network Allocation from Application’s Perspective

    Get PDF

    Conserve and Protect Resources in Software-Defined Networking via the Traffic Engineering Approach

    Get PDF
    Software Defined Networking (SDN) is revolutionizing the architecture and operation of computer networks and promises a more agile and cost-efficient network management. SDN centralizes the network control logic and separates the control plane from the data plane, thus enabling flexible management of networks. A network based on SDN consists of a data plane and a control plane. To assist management of devices and data flows, a network also has an independent monitoring plane. These coexisting network planes have various types of resources, such as bandwidth utilized to transmit monitoring data, energy spent to power data forwarding devices and computational resources to control a network. Unwise management, even abusive utilization of these resources lead to the degradation of the network performance and increase the Operating Expenditure (Opex) of the network owner. Conserving and protecting limited network resources is thus among the key requirements for efficient networking. However, the heterogeneity of the network hardware and network traffic workloads expands the configuration space of SDN, making it a challenging task to operate a network efficiently. Furthermore, the existing approaches usually lack the capability to automatically adapt network configurations to handle network dynamics and diverse optimization requirements. Addtionally, a centralized SDN controller has to run in a protected environment against certain attacks. This thesis builds upon the centralized management capability of SDN, and uses cross-layer network optimizations to perform joint traffic engineering, e.g., routing, hardware and software configurations. The overall goal is to overcome the management complexities in conserving and protecting resources in multiple functional planes in SDN when facing network heterogeneities and system dynamics. This thesis presents four contributions: (1) resource-efficient network monitoring, (2) resource-efficient data forwarding, (3) using self-adaptive algorithms to improve network resource efficiency, and (4) mitigating abusive usage of resources for network controlling. The first contribution of this thesis is a resource-efficient network monitoring solution. In this thesis, we consider one specific type of virtual network management function: flow packet inspection. This type of the network monitoring application requires to duplicate packets of target flows and send them to packet monitors for in-depth analysis. To avoid the competition for resources between the original data and duplicated data, the network operators can transmit the data flows through physically (e.g., different communication mediums) or virtually (e.g., distinguished network slices) separated channels having different resource consumption properties. We propose the REMO solution, namely Resource Efficient distributed Monitoring, to reduce the overall network resource consumption incurred by both types of data, via jointly considering the locations of the packet monitors, the selection of devices forking the data packets, and flow path scheduling strategies. In the second contribution of this thesis, we investigate the resource efficiency problem in hybrid, server-centric data center networks equipped with both traditional wired connections (e.g., InfiniBand or Ethernet) and advanced high-data-rate wireless links (e.g., directional 60GHz wireless technology). The configuration space of hybrid SDN equipped with both wired and wireless communication technologies is massively large due to the complexity brought by the device heterogeneity. To tackle this problem, we present the ECAS framework to reduce the power consumption and maintain the network performance. The approaches based on the optimization models and heuristic algorithms are considered as the traditional way to reduce the operation and facility resource consumption in SDN. These approaches are either difficult to directly solve or specific for a particular problem space. As the third contribution of this thesis, we investigates the approach of using Deep Reinforcement Learning (DRL) to improve the adaptivity of the management modules for network resource and data flow scheduling. The goal of the DRL agent in the SDN network is to reduce the power consumption of SDN networks without severely degrading the network performance. The fourth contribution of this thesis is a protection mechanism based upon flow rate limiting to mitigate abusive usage of the SDN control plane resource. Due to the centralized architecture of SDN and its handling mechanism for new data flows, the network controller can be the failure point due to the crafted cyber-attacks, especially the Control-Plane- Saturation (CPS) attack. We proposes an In-Network Flow mAnagement Scheme (INFAS) to effectively reduce the generation of malicious control packets depending on the parameters configured for the proposed mitigation algorithm. In summary, the contributions of this thesis address various unique challenges to construct resource-efficient and secure SDN. This is achieved by designing and implementing novel and intelligent models and algorithms to configure networks and perform network traffic engineering, in the protected centralized network controller
    corecore