
    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected by a large network and is usually managed by a single operator. To provide quality access to the variety of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters, and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to consider when evaluating traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems.
    Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
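
    To make the traffic-shaping and prioritization ideas concrete, the sketch below (not taken from the paper) combines a token-bucket shaper with strict-priority queues in Python; the class names, the three priority levels, and the rate/burst parameters are illustrative assumptions rather than anything the survey specifies.

        import time
        from collections import deque

        class TokenBucket:
            # Admit a packet only if enough tokens have accumulated at the configured rate.
            def __init__(self, rate_bps, burst_bytes):
                self.rate = rate_bps / 8.0          # bytes per second
                self.capacity = burst_bytes
                self.tokens = burst_bytes
                self.last = time.monotonic()

            def allow(self, pkt_bytes):
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= pkt_bytes:
                    self.tokens -= pkt_bytes
                    return True
                return False

        class PriorityShaper:
            # Strict-priority queues (0 = highest) drained through one shared shaper.
            def __init__(self, shaper, levels=3):
                self.shaper = shaper
                self.queues = [deque() for _ in range(levels)]

            def enqueue(self, priority, pkt_bytes):
                self.queues[priority].append(pkt_bytes)

            def dequeue(self):
                for q in self.queues:               # highest-priority class first
                    if q and self.shaper.allow(q[0]):
                        return q.popleft()
                return None                         # nothing eligible to send right now

        # Usage: drain a mixed set of traffic classes through a 1 Gb/s shaper.
        shaper = PriorityShaper(TokenBucket(rate_bps=1_000_000_000, burst_bytes=64_000))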

    Live Media Production: Multicast Optimization and Visibility for Clos Fabric in Media Data Centers

    Media production data centers are undergoing a major architectural shift to introduce digitization concepts into media creation and media processing workflows. Content companies such as NBC Universal, CBS/Viacom, and Disney are modernizing their workflows to take advantage of the flexibility of IP and virtualization. In these new environments, multicast is used to provide point-to-multipoint communications. To build point-to-multipoint trees, multicast relies on an established set of control protocols such as IGMP and PIM. Existing multicast protocols do not optimize multicast tree formation for maximizing network throughput, which leads to decreased fabric utilization and a decreased total number of admitted flows. In addition, existing multicast protocols are not bandwidth-aware and can cause links to become oversubscribed, leading to packet loss and lower video quality. TV production traffic patterns are unique due to ultra-high bandwidth requirements and high sensitivity to packet loss, which leads to video impairments. In such environments, operators need monitoring tools that can proactively monitor video flows and provide actionable alerts. Existing network monitoring tools are inadequate because they are reactive by design and perform generic monitoring of flows with no insight into the video domain. The first part of this dissertation presents the design and implementation of a novel Intelligent Rendezvous Point algorithm, iRP, for bandwidth-aware multicast routing in media data center fabrics. iRP uses a controller-based architecture to optimize multicast tree formation and increase bandwidth availability in the fabric. The system offers up to a 50% increase in fabric capacity for multicast flows passing through the fabric. The second part of this dissertation presents the DiRP algorithm, which takes a distributed decision-making approach to multicast tree capacity optimization while maintaining low multicast tree setup time. DiRP is tested using commercially available data center switches and offers substantially lower path setup time than centralized systems while remaining bandwidth-aware when setting up the fabric. The third part of this dissertation studies the use of machine learning to improve multicast efficiency in the fabric. This work includes the implementation and testing of the LiRP algorithm, which increases iRP's fabric efficiency by applying k-fold cross-validation to predict future multicast group memberships through time-series analysis. Testing confirms that LiRP increases the efficiency of iRP by up to 40% by predicting multicast group memberships with online arrival. The fourth part of this dissertation studies the problem of live video monitoring. MediaFlow is a robust system for active network monitoring and reporting of video quality for thousands of flows simultaneously, at a fraction of the cost of traditional monitoring solutions. MediaFlow can detect and report on the integrity of video flows at a granularity of 100 ms at line rate for thousands of flows. The system increases video monitoring scale a thousand-fold compared to edge monitoring solutions.
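
    As a rough illustration of what bandwidth-aware multicast admission on a leaf-spine fabric can look like (a simplified sketch, not the iRP algorithm itself), the Python fragment below picks, for each new flow, the spine switch that still has enough residual capacity toward the source leaf and every receiver leaf; the Spine class, leaf names, and link capacities are hypothetical.

        class Spine:
            # Hypothetical spine switch with residual capacity (Gb/s) toward each leaf.
            def __init__(self, name, leaves, link_capacity):
                self.name = name
                self.residual = {leaf: link_capacity for leaf in leaves}

        def pick_spine(spines, flow_bw, src_leaf, dst_leaves):
            # Return the spine with the most headroom on the source hop and every
            # receiver hop, or None if the flow cannot be admitted without
            # oversubscribing a link.
            best, best_headroom = None, -1.0
            for spine in spines:
                headroom = min([spine.residual[src_leaf]] +
                               [spine.residual[d] for d in dst_leaves])
                if headroom >= flow_bw and headroom > best_headroom:
                    best, best_headroom = spine, headroom
            return best

        # Usage: admit a 3 Gb/s video flow from leaf L1 to receivers on L2 and L3.
        spines = [Spine(f"S{i}", ["L1", "L2", "L3"], link_capacity=10.0) for i in range(2)]
        chosen = pick_spine(spines, flow_bw=3.0, src_leaf="L1", dst_leaves=["L2", "L3"])
        if chosen is not None:
            for leaf in ["L1", "L2", "L3"]:
                chosen.residual[leaf] -= 3.0        # reserve bandwidth on the chosen tree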

    FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short

    We introduce FatPaths: a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC supercomputers and cloud data centers and clusters. FatPaths exposes and exploits the rich ("fat") diversity of both minimal and non-minimal paths for high-performance multipathing. Moreover, FatPaths uses a redesigned "purified" transport layer that removes virtually all TCP performance issues (e.g., slow start), and incorporates flowlet switching, a technique that prevents packet reordering in TCP networks, to enable very simple and effective load balancing. Our design enables recent low-diameter topologies to outperform powerful Clos designs, achieving 15% higher net throughput at 2x lower latency for comparable cost. FatPaths will significantly accelerate Ethernet clusters, which form more than 50% of the Top500 list, and may become a standard routing scheme for modern topologies.
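
    The flowlet-switching idea that FatPaths builds on can be sketched in a few lines: a flow is re-assigned to a (possibly non-minimal) path only when a sufficiently long gap separates its packets, so packets inside a burst stay on one path and are not reordered. The Python below is an illustrative approximation, not FatPaths' implementation; the 100-microsecond gap threshold and the random path choice are assumptions.

        import random
        import time

        class FlowletBalancer:
            # Re-draw a flow's path only when a gap longer than gap_us separates
            # consecutive packets (a new flowlet), so packets inside a burst keep
            # one path and arrive in order.
            def __init__(self, paths, gap_us=100):
                self.paths = paths
                self.gap = gap_us / 1e6
                self.state = {}                     # flow id -> (last packet time, path)

            def path_for(self, flow_id, now=None):
                now = time.monotonic() if now is None else now
                last_seen, path = self.state.get(flow_id, (None, None))
                if last_seen is None or now - last_seen > self.gap:
                    path = random.choice(self.paths)    # minimal or non-minimal path
                self.state[flow_id] = (now, path)
                return path

        # Usage: three candidate paths, packets of one flow arriving back to back.
        lb = FlowletBalancer(paths=["p0", "p1", "p2"])
        first = lb.path_for("flow-42")
        second = lb.path_for("flow-42")             # same flowlet -> same path as `first`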

    Space Shuffle: A Scalable, Flexible, and High-Bandwidth Data Center Network

    Data center applications require the network to be scalable and bandwidth-rich. Current data center network architectures often use rigid topologies to increase network bandwidth. A major limitation is that they can hardly support incremental network growth. Recent work proposes using random interconnects to provide growth flexibility. However, routing on a random topology suffers from control- and data-plane scalability problems, because routing decisions require global information and forwarding state cannot be aggregated. In this paper we design a novel flexible data center network architecture, Space Shuffle (S2), which applies greedy routing on multiple ring spaces to achieve high throughput, scalability, and flexibility. The proposed greedy routing protocol of S2 effectively exploits the path diversity of densely connected topologies and enables key-based routing. Extensive experimental studies show that S2 provides high bisection bandwidth and throughput, near-optimal routing path lengths, extremely small forwarding state, fairness among concurrent data flows, and resiliency to network failures.
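
    A minimal sketch of greedy routing over multiple ring spaces follows (an approximation of the idea, not S2's actual protocol): every switch holds one coordinate per ring space, and a packet is forwarded to the neighbor whose coordinates come circularly closest to the destination in some space. The neighbor names and coordinates below are made up for illustration.

        def ring_dist(a, b):
            # Circular distance between two coordinates in [0, 1).
            d = abs(a - b)
            return min(d, 1.0 - d)

        def greedy_next_hop(neighbors, dst_coords):
            # Forward to the neighbor closest to the destination in some ring
            # space (smallest circular distance across spaces wins).
            best, best_d = None, float("inf")
            for nbr, coords in neighbors.items():
                d = min(ring_dist(c, t) for c, t in zip(coords, dst_coords))
                if d < best_d:
                    best, best_d = nbr, d
            return best

        # Usage with two ring spaces: made-up neighbor and destination coordinates.
        neighbors = {"A": (0.12, 0.80), "B": (0.55, 0.31), "C": (0.90, 0.05)}
        print(greedy_next_hop(neighbors, dst_coords=(0.10, 0.40)))   # -> "A"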

    Power-Aware Datacenter Networking and Optimization

    Present-day datacenter networks (DCNs) are designed to achieve full bisection bandwidth in order to provide high network throughput and server agility. However, the average utilization of typical DCN infrastructure is below 10% for significant time intervals. As a result, energy is wasted during these periods. In this thesis we analyze the traffic behavior of datacenter networks using traces as well as simulated models. Based on the insight developed, we present techniques to reduce energy waste by making energy use scale linearly with load. The solutions developed are analyzed via simulations, formal analysis, and prototyping. The impact of our work is significant because the energy savings we obtain for the networking infrastructure of DCNs are near optimal. A key finding of our traffic analysis is that network switch ports within the DCN are grossly under-utilized. Therefore, the first solution we study is to modify routing within the network to force most traffic through the smallest switches. This increases the hop count for the traffic but enables many switch ports to be powered off. The exact extent of the energy savings is derived and validated using simulations. An alternative strategy we explore in this context is to replace about half the switches with fewer switches that have higher port density. This enables even greater traffic consolidation, allowing even more ports to sleep. Finally, we explore a third approach in which we begin with end-to-end traffic models and incrementally build a DCN topology that is optimized for that model. In other words, the network topology is optimized for the intended use of the datacenter. This approach makes sense because, as other researchers have observed, the traffic in a datacenter is heavily dependent on the primary use of the datacenter. A second line of research we undertake is to merge traffic in the analog domain before feeding it to switches. This is accomplished using a passive device we call a merge network. Using a merge network lets us attain linear scaling of energy use with load regardless of the datacenter traffic model. The challenge in using such a device is that layer 2 and layer 3 protocols require a one-to-one mapping of hardware addresses to IP (Internet Protocol) addresses. We overcome this problem by building a software shim layer that hides the fact that traffic is being merged. To validate the idea of a merge network, we build a simple merge network for gigabit optical interfaces and demonstrate correct operation of layer 2 and layer 3 protocols at line speed. We also conduct measurements to study how traffic is mixed in the merge network before being fed to the switch. We show that the merge network uses only a fraction of a watt of power, which makes it a very attractive solution for energy efficiency. In this research we have developed solutions that enable linear scaling of energy with load in datacenter networks. The different techniques developed have been analyzed via modeling and simulations as well as prototyping. We believe that these solutions can be incorporated into future DCNs with little effort.
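
    The traffic-consolidation idea behind the first solution can be illustrated with a small first-fit packing sketch (an illustration of the general approach, not the thesis's exact algorithm): flow demands are packed onto as few ports as possible so the remaining ports can be put to sleep. The port names, capacities, and flow demands below are hypothetical.

        def consolidate(flows, ports, port_capacity):
            # First-fit decreasing: pack flow demands onto as few ports as possible
            # so the remaining ports can be powered down.
            load = {p: 0.0 for p in ports}
            placement = {}
            for flow_id, demand in sorted(flows.items(), key=lambda kv: -kv[1]):
                for p in ports:                     # ports tried in a fixed order
                    if load[p] + demand <= port_capacity:
                        load[p] += demand
                        placement[flow_id] = p
                        break
            active = [p for p in ports if load[p] > 0]
            sleeping = [p for p in ports if load[p] == 0]
            return placement, active, sleeping

        # Usage: all four flows fit on one 10 Gb/s port, so three ports can sleep.
        flows = {"f1": 4.0, "f2": 3.0, "f3": 2.0, "f4": 1.0}
        placement, active, sleeping = consolidate(flows, ["p0", "p1", "p2", "p3"], 10.0)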