
    Micro load balancing with delayed queue lengths

    DRILL is a micro load balancing algorithm designed to efficiently utilize the path redundancy in modern data centers. It uses egress-port queue lengths to make fast packet-routing decisions that reduce upstream congestion and queueing delays. However, high-performance switches with multiple forwarding engines that make routing decisions in parallel do not have direct access to these queue lengths. We explore and evaluate different ways of obtaining this information in data center settings, specifically piggybacking it on incoming traffic and on specially generated update packets. We find that the staleness of this data has only a modest impact on flow completion times relative to DRILL (a 6% increase) while still retaining a considerable advantage over ECMP (a 28% decrease).
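    For intuition, here is a minimal Python sketch of the kind of queue-length-based port selection this line of work performs, assuming a stale local view of egress queues that is refreshed only by piggybacked updates; the class and method names are illustrative, not taken from the paper.

```python
import random

# Minimal sketch of DRILL-style port selection under stale queue data.
# All names are illustrative; real forwarding-engine interfaces differ.

class ForwardingEngine:
    def __init__(self, num_ports, d=2):
        self.d = d                      # random candidates sampled per decision
        self.best_port = 0              # memory: best port from the last decision
        # Stale local view of egress queue lengths, refreshed by piggybacked
        # updates rather than by direct access to the shared queues.
        self.queue_estimate = [0] * num_ports

    def on_update(self, port, queue_len):
        # Queue-length info carried by incoming traffic or update packets.
        self.queue_estimate[port] = queue_len

    def pick_port(self):
        # Sample d random ports, include the remembered best port, and pick
        # the candidate with the smallest (possibly stale) queue estimate.
        candidates = random.sample(range(len(self.queue_estimate)), self.d)
        candidates.append(self.best_port)
        self.best_port = min(candidates, key=lambda p: self.queue_estimate[p])
        return self.best_port
```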

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected by a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters, and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, and the general challenges and objectives of traffic control in datacenters. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to consider when evaluating traffic control mechanisms. We discuss various characteristics of datacenter traffic control, including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems.
    Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
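    As one concrete instance of the traffic shaping techniques such surveys cover, here is a minimal token-bucket shaper in Python; the interface and parameter choices are assumptions made for illustration, not taken from the paper.

```python
import time

# Illustrative token-bucket traffic shaper. A flow may burst up to
# `burst_bytes` and is otherwise limited to `rate_bps` on average.

class TokenBucket:
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0      # fill rate in bytes per second
        self.capacity = burst_bytes     # maximum burst size in bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True                 # packet conforms, send it now
        return False                    # non-conforming, delay or drop
```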

    Congestion-Aware Multistage Packet-Switch Architecture for Data Center Networks

    Data Center Networks (DCNs) have gone through major evolutionary changes over the past decades, yet it is still difficult to predict load fluctuations and congestion spikes in the network switching fabric. Conventional multistage switches/routers used in data center fabrics barely deal with load balancing, and congestion management is often handled at the edge modules: neither the architecture of switches/routers nor their internal routing algorithms tend to consider traffic balancing and congestion management. In this paper, we propose a flexible design for a scalable multistage switch with cross-connected UniDirectional Network-on-Chip based central blocks (UDNs). We also introduce a congestion-aware routing scheme to forward packets adaptively. We compare the proposed switch architecture to state-of-the-art multistage switches under different traffic types. Simulations of various switch settings show that the proposed architecture maintains high throughput and low latency.
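    A rough Python sketch of congestion-aware next-stage selection of the general kind this abstract describes is shown below; the paper's UDN-based design is hardware, so this software model, its names, and the threshold value are purely illustrative assumptions.

```python
# Pick the least-congested admissible link toward the next switch stage.

def select_next_stage(candidate_links, occupancy, threshold=0.8):
    """candidate_links: link ids that can reach the packet's destination.
    occupancy: dict mapping link id -> buffer occupancy in [0, 1].
    threshold: occupancy above which a link is considered congested.
    """
    # Prefer links below the congestion threshold; if every candidate is
    # congested, fall back to the globally least-loaded one.
    admissible = [l for l in candidate_links if occupancy[l] < threshold]
    pool = admissible or candidate_links
    return min(pool, key=lambda l: occupancy[l])

# Example: link 2 is chosen because it is the least occupied candidate.
print(select_next_stage([0, 1, 2], {0: 0.9, 1: 0.85, 2: 0.4}))
```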

    Howdah: Load Profiling via In-Band Flow Classification and P4

    The challenges of managing datacenter traffic grow with the complexity and variety of new Internet and Web applications. Efficient network management systems are often required to avoid delays and minimize failures. In this regard, it is helpful to identify in advance the different classes of flows that (co)exist in the network, characterizing them into types according to their latency/bandwidth requirements. In this paper, we propose Howdah, a traffic identification and profiling mechanism that uses machine learning and a congestion-aware forwarding strategy to adapt to different traffic classes with the support of programmable data planes. With Howdah, sender and gateway elements inject in-band traffic information obtained using supervised learning. When a switch or router receives a packet, it exploits this host-based traffic classification to adapt to a desirable traffic profile, for example by balancing the load. We compare our solution against recent traffic engineering approaches and show the efficacy of cooperation between host traffic classification and P4-based switch forwarding policies, reducing packet transmission time in datacenter scenarios.
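    A hypothetical host-side sketch of this idea follows: a supervised model classifies a flow, and the result is encoded into an in-band field that a P4 switch could match on to choose a forwarding policy. The features, classes, toy training data, and DSCP encoding are all assumptions, not Howdah's actual design.

```python
from sklearn.tree import DecisionTreeClassifier

# Assumed features and classes for illustration only.
FEATURES = ["pkt_size_avg", "inter_arrival_ms", "bytes_sent"]
CLASSES = {0: "latency_sensitive", 1: "bandwidth_hungry"}

# Toy training data: (avg pkt size, inter-arrival ms, bytes sent) -> class.
X = [[120, 0.2, 4_000], [1400, 0.01, 9_000_000],
     [90, 0.5, 1_200], [1450, 0.02, 50_000_000]]
y = [0, 1, 0, 1]

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

def tag_for_flow(flow_features):
    # Map the predicted class to a DSCP codepoint carried in-band; a
    # programmable switch can then match on this field when forwarding.
    # DSCP 46 (EF) for latency-sensitive flows is an illustrative choice.
    label = int(clf.predict([flow_features])[0])
    return {"class": CLASSES[label], "dscp": 46 if label == 0 else 10}

print(tag_for_flow([110, 0.3, 2_000]))  # -> latency-sensitive, EF codepoint
```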

    Squeezing the most benefit from network parallelism in datacenters

    One big non-blocking switch is one of the most powerful and pervasive abstractions in datacenter networking. As Moore's law begins to wane, scaling processing units out, rather than up, is becoming exceedingly popular, and the one-big-switch abstraction is typically implemented by leveraging massive degrees of parallelism behind the scenes. In particular, in today's highly multi-pathed datacenters, each logical path between a communicating pair in the one-big-switch abstraction is mapped to a set of paths that can carry traffic in parallel. Similarly, each one-big-switch function, such as a firewall, is mapped to a set of distributed hardware and software switches. Efficiently deploying this pool of network connectivity and preserving the functional correctness of network functions, in spite of the parallelism, are both challenging. Efficiently balancing load among multiple paths is hard because microbursts, which are responsible for the majority of packet loss in datacenters today, usually last only a few microseconds; even the fastest traffic engineering schemes today have control loops that are several orders of magnitude slower (a few milliseconds to a few seconds) and are therefore ineffective in controlling microbursts. Correctly implementing network functions in the face of parallelism is hard because the distributed set of elements that implement a one-big-switch abstraction in parallel can have inconsistent states that cause them to behave differently than one physical switch would.
    The first part of this thesis presents DRILL, a datacenter fabric for Clos networks that performs micro load balancing to distribute load as evenly as possible on microsecond timescales. To achieve this, DRILL makes packet-level decisions at each switch based on local queue occupancies and randomized algorithms. Despite making per-packet forwarding decisions, DRILL keeps the degree of packet reordering low by enforcing tight control on queue occupancies, and it adapts to topological asymmetry (e.g., failures) in Clos networks by decomposing the network into symmetric components. Using a detailed switch hardware model, we simulate DRILL and show that it outperforms recent edge-based load balancers, particularly in tail latency under heavy load: for example, under 80% load it reduces the 99.99th percentile of flow completion times of Presto and CONGA by 32% and 35%, respectively. Finally, we analyze DRILL's stability and throughput-efficiency.
    In the second part, we focus on the correctness of the one-big-switch abstraction's implementation. We first show that naively using parallelism to scale out networking elements can cause incorrect behavior; for example, an IDS that operates correctly as a single network element can erroneously and permanently block hosts when it is replicated. We then provide COCONUT, a system for seamless scale-out of network forwarding elements: an SDN application programmer can program what functionally appears to be a single forwarding element, which may be replicated behind the scenes. To do this, we identify the key property for seamless scale-out, weak causality, and guarantee it through a practical and scalable implementation of vector clocks in the data plane (sketched below). We build a prototype of COCONUT and experimentally demonstrate its correct behavior, and we show that its abstraction enables a more efficient implementation of seamless scale-out than a naive baseline.
    Finally, reasoning about network behavior requires a model that lets us distinguish between observable and unobservable events. In the last part, we therefore present the Input/Output Automaton (IOA) model and use it to formalize network behavior. Using this framework, we prove that COCONUT enables seamless scale-out of networking elements; that is, the user-perceived behavior of any COCONUT element implemented by a distributed set of concurrent replicas is provably indistinguishable from its singleton implementation.
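    A minimal Python sketch of the vector-clock mechanics behind the weak-causality idea is given below; the replica interface and merge/compare API are illustrative assumptions, not COCONUT's actual data-plane implementation.

```python
# Vector clocks: each replica tracks one counter per replica, so causal
# ordering of updates can be checked without a global clock.

class VectorClock:
    def __init__(self, replica_id, num_replicas):
        self.i = replica_id
        self.clock = [0] * num_replicas

    def tick(self):
        # Local event at this replica (e.g., a rule update applied).
        self.clock[self.i] += 1

    def merge(self, other_clock):
        # On receiving state from another replica, take the element-wise
        # max so causally earlier updates are never observed as newer.
        self.clock = [max(a, b) for a, b in zip(self.clock, other_clock)]
        self.tick()

    def happened_before(self, other_clock):
        # True if this clock is causally earlier than `other_clock`.
        return (all(a <= b for a, b in zip(self.clock, other_clock))
                and self.clock != list(other_clock))
```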