
    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters, and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to give readers a broad view of the options and factors to weigh when considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters; these networks have been receiving increasing attention recently and pose interesting and novel research problems. (Accepted for publication in IEEE Communications Surveys and Tutorials.)
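
    As a minimal illustration of one mechanism the survey covers, traffic shaping, the sketch below implements a token bucket in Python; the rate, burst size, and naming are illustrative assumptions, not anything prescribed by the paper.

        import time

        class TokenBucket:
            def __init__(self, rate_bps, burst_bits):
                self.rate = rate_bps        # token refill rate (bits/s)
                self.burst = burst_bits     # bucket capacity (bits)
                self.tokens = burst_bits
                self.last = time.monotonic()

            def allow(self, packet_bits):
                # refill tokens for the elapsed time, capped at the burst size
                now = time.monotonic()
                self.tokens = min(self.burst,
                                  self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= packet_bits:
                    self.tokens -= packet_bits
                    return True             # conforms to the profile: send now
                return False                # exceeds the profile: queue or drop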

    Every timestamp counts: accurate tracking of network latencies using reconcilable difference aggregator

    User-facing services deployed in data centers must respond quickly to user actions, so the measurement of network latencies is of paramount importance. Recently, a new family of compact data structures has been proposed to estimate one-way latencies. To achieve scalability, these methods rely on timestamp aggregation. Unfortunately, this approach suffers from serious accuracy problems in the presence of packet loss and reordering, since a single lost or out-of-order packet may invalidate a huge number of aggregated samples. In this paper, we unify the detection of lost and reordered packets within the set reconciliation framework. Although the set reconciliation approach and the data structures for aggregating packet timestamps are previously known, the combination of these two principles is novel. We present a space-efficient synopsis called the reconcilable difference aggregator (RDA). RDA maximizes the percentage of packets usable for latency measurement by mapping packets to multiple banks and repairing aggregated samples that have been damaged by lost and reordered packets. RDA simultaneously obtains the average and the standard deviation of the latency. We provide a formal performance guarantee and derive optimized parameters. We further design and implement a user-space passive latency measurement system that addresses practical issues of integrating RDA into the network stack. Our extensive evaluation shows that, compared with existing methods, our approach improves the relative error of the average latency estimate by 10-15 orders of magnitude, and the relative error of the standard deviation by 0.5-6 orders of magnitude.
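
    To make the accuracy problem concrete, here is a minimal sketch of LDA-style timestamp aggregation, the prior approach that RDA builds on and repairs; the bank count, hash mapping, and names are assumptions for illustration.

        def bank_of(packet_id, num_banks=4):
            # both endpoints map each packet to the same bank deterministically
            return hash(packet_id) % num_banks

        def average_latency(send_banks, recv_banks):
            # each bank is a (timestamp_sum, packet_count) pair kept at one end
            delay_sum, count = 0.0, 0
            for (s_sum, s_cnt), (r_sum, r_cnt) in zip(send_banks, recv_banks):
                if s_cnt == r_cnt:          # bank undamaged by loss/reordering
                    delay_sum += r_sum - s_sum
                    count += s_cnt
                # otherwise a single lost or reordered packet invalidates every
                # sample in the bank -- the waste RDA's reconciliation repairs
            return delay_sum / count if count else None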

    PBE-CC: Congestion Control via Endpoint-Centric, Physical-Layer Bandwidth Measurements

    Wireless networks are becoming ever more sophisticated and overcrowded, and now impose the greatest delay, jitter, and throughput degradation on end-to-end network flows in today's Internet. We therefore argue for fine-grained, mobile endpoint-based wireless measurements that inform a precise congestion control algorithm through a well-defined API to the mobile's wireless physical layer. Our proposed congestion control algorithm is based on Physical-Layer Bandwidth measurements taken at the Endpoint (PBE-CC), and captures the latest 5G New Radio innovations that increase wireless capacity yet create abrupt rises and falls in available capacity, to which the PBE-CC sender can react precisely and rapidly. We implement a proof-of-concept prototype of the PBE measurement module on software-defined radios, and the PBE sender and receiver in C. An extensive performance evaluation compares PBE-CC head to head against the leading cellular-aware and wireless-oblivious congestion control protocols proposed in the research community and in deployment, in mobile and static scenarios, and over busy and quiet networks. Results show 6.3% higher average throughput than BBR, while simultaneously reducing 95th percentile delay by 1.8x.
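
    A hedged sketch of the core idea follows: the sender paces at a rate derived directly from an endpoint physical-layer capacity estimate rather than inferring capacity from loss or delay. The function name, headroom factor, and fair-share rule are assumptions, not PBE-CC's actual control law.

        def update_pacing_rate(phy_capacity_bps, competing_flows, headroom=0.95):
            # split the measured cell capacity among concurrent flows and keep
            # a small safety margin to absorb abrupt capacity drops
            fair_share = phy_capacity_bps / max(1, competing_flows)
            return headroom * fair_share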

    Traffic and task allocation in networks and the cloud

    Communication services such as telephony, broadband and TV are increasingly migrating into Internet Protocol (IP) based networks because of the consolidation of telephone and data networks. Meanwhile, the increasingly wide adoption of Cloud Computing enables the accommodation of tens of thousands of applications from the general public or enterprise users, which make use of Cloud services on demand through IP networks such as the Internet. Real-Time services over IP (RTIP) have also become increasingly significant due to the convergence of network services, and the real-time needs of the Internet of Things (IoT) will strengthen this trend. Such real-time applications have strict Quality of Service (QoS) constraints, posing a major challenge for IP networks. The Cognitive Packet Network (CPN) has been designed as a QoS-driven protocol that addresses user-oriented QoS demands by adaptively routing packets based on online sensing and measurement. Thus in this thesis we first describe our design for a novel "Real-Time (RT) traffic over CPN" protocol which uses QoS goals that match the needs of voice packet delivery in the presence of other background traffic under varied traffic conditions; we present its experimental evaluation via measurements of key QoS metrics such as packet delay, delay variation (jitter) and packet loss ratio.

    Pursuing our investigation of packet routing in the Internet, we then propose a novel Big Data and Machine Learning approach for real-time, Internet-scale route optimisation based on Quality of Service, using an overlay network, and evaluate its performance. Based on data sampled every 22 minutes over a large number of source-destination pairs, we observe that intercontinental IP paths are far from optimal with respect to metrics such as end-to-end round-trip delay. Our machine learning based overlay routing scheme, on the other hand, exploits large-scale data collected from communicating node pairs to select overlay paths, while using IP between neighbouring overlay nodes. Measurements from a week-long experiment with several million data points show substantially better end-to-end QoS than is observed with pure IP routing.

    Pursuing the machine learning approach, we then address the challenging problem of dispatching incoming tasks to servers in Cloud systems so as to offer the best QoS and reliable job execution. An experimental system (the Task Allocation Platform) that we have developed is presented and used to compare several task allocation schemes, including a model-driven algorithm, a reinforcement learning based scheme, and a "sensible" allocation algorithm that assigns tasks to the sub-systems observed to provide lower response time. These schemes are compared via measurements both among themselves and against a standard round-robin scheduler, with two architectures (with homogeneous and heterogeneous hosts having different processing capacities), and the conditions under which the different schemes offer better QoS are discussed. Since Cloud systems include both locally based servers at user premises and remote servers and multiple Clouds reachable over the Internet, we also describe a smart distributed system that combines local and remote Cloud facilities, allocating tasks dynamically to the service that offers the best overall QoS; it includes a routing overlay which minimizes network delay for data transfer between Clouds. Internet-scale experiments that we report demonstrate the effectiveness of our approach in adaptively distributing workload across multiple Clouds.
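
    As a rough illustration of the "sensible" allocation idea, the sketch below dispatches each task to the sub-system with the lowest recently observed response time, smoothed with an exponentially weighted average; the smoothing factor and class structure are assumptions, not the thesis's exact algorithm.

        class SensibleDispatcher:
            def __init__(self, hosts, alpha=0.2):
                self.avg_rt = {h: 0.0 for h in hosts}  # smoothed response times
                self.alpha = alpha

            def pick_host(self):
                # send the next task where response times have been lowest
                return min(self.avg_rt, key=self.avg_rt.get)

            def record(self, host, response_time):
                # fold each measured response time into the running average
                self.avg_rt[host] = ((1 - self.alpha) * self.avg_rt[host]
                                     + self.alpha * response_time)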

    On the Interaction between TCP and the Wireless Channel in CDMA2000 Networks

    In this work, we conducted extensive active measurements on a large nationwide CDMA2000 1xRTT network in order to characterize the impact of both the Radio Link Protocol and, more importantly, the wireless scheduler on TCP. Our measurements include standard TCP/UDP logs, as well as detailed RF-layer statistics that allow observability into RF dynamics. With the help of a robust correlation measure, normalized mutual information, we were able to quantify the impact of these two RF factors on TCP performance metrics such as round-trip time, packet loss rate, and instantaneous throughput. We show that the variable channel rate has the larger impact on TCP behavior when compared to the Radio Link Protocol. Furthermore, we expose and rank the factors that influence the assigned channel rate itself and, in particular, demonstrate the sensitivity of the wireless scheduler to the data sending rate. Thus, TCP adapts its rate to match the available network capacity, while the rate allocated by the wireless scheduler is influenced by the sender's behavior. Such a system is best described as a closed-loop system with two feedback controllers, the TCP controller and the wireless scheduler, each one affecting the other's decisions. In this work, we take the first steps in characterizing such a system in a realistic environment.
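
    For readers unfamiliar with the correlation measure used here, the sketch below computes normalized mutual information between two discretized measurement series; the binning scheme and the square-root normalization are common choices assumed for illustration, and the paper's exact estimator may differ.

        import math
        from collections import Counter

        def nmi(xs, ys, bins=10):
            def discretize(vs):
                lo, hi = min(vs), max(vs)
                width = (hi - lo) / bins or 1.0
                return [min(bins - 1, int((v - lo) / width)) for v in vs]

            a, b = discretize(xs), discretize(ys)
            n = len(a)
            pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))

            def entropy(counts):
                return -sum((c / n) * math.log(c / n) for c in counts.values())

            h_a, h_b = entropy(pa), entropy(pb)
            # mutual information from the empirical joint and marginal counts
            mi = sum((c / n) * math.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
                     for (x, y), c in pab.items())
            return mi / math.sqrt(h_a * h_b) if h_a > 0 and h_b > 0 else 0.0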

    TCP over CDMA2000 Networks: A Cross-Layer Measurement Study

    Modern cellular channels in 3G networks incorporate sophisticated power control and dynamic rate adaptation which can have a significant impact on adaptive transport layer protocols such as TCP. Though studies exist that have evaluated the performance of TCP over such networks, they are based solely on observations at the transport layer and hence have no visibility into the impact of lower-layer dynamics, which are a key characteristic of these networks. In this work, we present a detailed characterization of TCP behavior based on cross-layer measurement of transport layer as well as RF and MAC layer parameters. In particular, through a series of active TCP/UDP experiments and measurement of the relevant variables at all three layers, we characterize both the wireless scheduler and the radio link protocol in a commercial CDMA2000 network and assess their impact on TCP dynamics. Somewhat surprisingly, our findings indicate that the wireless scheduler is mostly insensitive to channel quality and sector load over short timescales and is mainly affected by the transport layer data rate. Furthermore, with the help of a robust correlation measure, normalized mutual information, we were able to quantify the impact of the wireless scheduler and the radio link protocol on various TCP parameters such as round-trip time, throughput and packet loss rate.

    Performance-Driven Internet Path Selection

    Internet routing can often be sub-optimal, with the chosen routes providing worse performance than other available policy-compliant routes. This stems from the lack of visibility into route performance at the network layer. While this is an old problem, we argue that recent advances in programmable hardware finally open up the possibility of performance-aware routing in a deployable, BGP-compatible manner. We introduce ROUTESCOUT, a hybrid hardware/software system supporting performance-based routing at ISP scale. In the data plane, ROUTESCOUT leverages P4-enabled hardware to monitor performance across policy-compliant route choices for each destination, at line rate and with a small memory footprint. ROUTESCOUT's control plane then asynchronously pulls aggregated performance metrics to synthesize a performance-aware forwarding policy. We show that ROUTESCOUT can monitor performance across most of an ISP's traffic using only 4 MB of memory. Further, its control plane can flexibly satisfy a variety of operator objectives, with sub-second operating times.
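
    A small sketch of what the control-plane step might look like, under assumed metric and field names (the paper's actual policy synthesis is richer): pull per-next-hop aggregates from the data plane and install the best policy-compliant choice per destination prefix.

        def synthesize_policy(metrics):
            # metrics: {prefix: {next_hop: (avg_rtt_ms, loss_rate)}}, pulled
            # asynchronously from the P4 data plane; returns the preferred
            # policy-compliant next hop for each prefix
            policy = {}
            for prefix, choices in metrics.items():
                policy[prefix] = min(choices,
                                     key=lambda nh: (choices[nh][1],   # loss first
                                                     choices[nh][0]))  # then RTT
            return policy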

    An edge-queued datagram service for all datacenter traffic

    Modern datacenters support a wide range of protocols and in-network switch enhancements aimed at improving performance. Unfortunately, the resulting protocols often do not coexist gracefully because they inevitably interact via queuing in the network. In this paper we describe EQDS, a new datagram service for datacenters that moves almost all of the queuing out of the core network and into the sending host. This enables it to support multiple (conflicting) higher-layer protocols, while sending packets into the network only according to a receiver-driven credit scheme. EQDS can transparently speed up legacy TCP and RDMA stacks, and enables transport protocol evolution, while benefiting from future switch enhancements without needing to modify higher-layer stacks. We show through simulation and multiple implementations that EQDS can reduce the flow completion time (FCT) of legacy TCP by 2x, improve NVMeOF-RDMA throughput by 30%, and safely run TCP alongside RDMA on the same network.
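
    The following is a minimal sketch of a receiver-driven credit scheme of the kind EQDS relies on, with naming and granularity as assumptions: packets wait in a queue at the sending host and enter the network only against credit granted by the receiver, keeping queuing at the edge rather than in the core.

        from collections import deque

        class EdgeQueue:
            def __init__(self):
                self.queue = deque()   # packets held at the sending host
                self.credit = 0        # bytes the receiver has authorized

            def enqueue(self, packet: bytes):
                self.queue.append(packet)

            def grant(self, credit_bytes: int):
                self.credit += credit_bytes   # from receiver credit messages

            def transmit(self, send):
                # release packets only up to the credit granted so far
                while self.queue and self.credit >= len(self.queue[0]):
                    pkt = self.queue.popleft()
                    self.credit -= len(pkt)
                    send(pkt)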

    FineComb: Measuring Microscopic Latency and Loss in the Presence of Reordering

    Modern stock trading and cluster applications require microsecond latencies and almost no losses in data centers. This paper introduces an algorithm called FineComb that can obtain fine-grained end-to-end loss and latency measurements between edge routers in these networks. Such a mechanism can allow managers to distinguish between latency and loss singularities caused by servers and those caused by the network. Compared to prior work, such as the Lossy Difference Aggregator (LDA), that focused on switch-level latency measurements, the requirement of end-to-end latency measurements introduces the challenge of reordering, which occurs commonly in IP networks due to churn. The problem is even more acute across data center networks whose switches employ multipath routing algorithms to exploit the inherent path diversity. Without proper care, a loss estimation algorithm can confound loss and reordering; further, any attempt to aggregate delay estimates in the presence of reordering results in severe errors. FineComb deals with these problems using order-agnostic packet digests and a simple new idea we call stash recovery. Our evaluation demonstrates that FineComb is orders of magnitude more accurate than LDA in loss and delay estimates in the presence of reordering.
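
    To illustrate why an order-agnostic digest sidesteps reordering, here is a minimal sketch: XORing per-packet hashes is commutative, so sender and receiver digests match (with high probability) whenever they saw the same set of packets, in any order. The hash function and digest width are assumptions; FineComb's actual digests also support its stash-recovery step.

        import hashlib

        def packet_hash(pkt: bytes) -> int:
            return int.from_bytes(hashlib.blake2b(pkt, digest_size=8).digest(),
                                  "big")

        class OrderAgnosticDigest:
            def __init__(self):
                self.xor = 0
                self.count = 0

            def add(self, pkt: bytes):
                self.xor ^= packet_hash(pkt)   # XOR commutes: order-insensitive
                self.count += 1

            def matches(self, other) -> bool:
                # equal digest and count => same packet set, despite reordering
                return self.xor == other.xor and self.count == other.count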

    Challenges Using the Linux Network Stack for Real-Time Communication

    Starting in the early 2000s, human-in-the-loop (HITL) simulation groups at NASA and the Air Force Research Lab began using the Linux network stack for some real-time communication. More recently, SpaceX has adopted Ethernet as the primary bus technology for its Falcon launch vehicles and Dragon capsules. As the Linux network stack makes its way from ground facilities to flight-critical systems, it is necessary to recognize that the network stack is optimized for communication over the open Internet, which cannot provide latency guarantees. The Internet protocols and their implementation in the Linux network stack contain numerous design decisions that favor throughput over determinism and latency. These decisions often require workarounds in the application, or customization of the stack, to maintain a high probability of low latency on closed networks, especially if the network must be fault-tolerant to single-event upsets.
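
    As an example of the kind of application-level workaround meant here, the sketch below configures a TCP socket for lower latency on Linux; the option values are assumptions, and the availability of SO_PRIORITY is platform-dependent.

        import socket

        def make_low_latency_tcp_socket() -> socket.socket:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # disable Nagle's algorithm: send small writes immediately rather
            # than coalescing them for throughput
            s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            # keep the send buffer small so the stack cannot build a deep backlog
            s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 16 * 1024)
            # raise queuing priority (Linux-specific socket option)
            if hasattr(socket, "SO_PRIORITY"):
                s.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, 6)
            return s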