Understanding PCIe performance for end host networking
In recent years, spurred on by the development and availability of programmable NICs, end hosts have increasingly become the enforcement point for core network functions such as load balancing, congestion control, and application-specific network offloads. However, implementing custom designs on programmable NICs is not easy: many potential bottlenecks can impact performance.
This paper focuses on the performance implications of PCIe, the de facto I/O interconnect in contemporary servers, when interacting with the host architecture and device drivers. We present a theoretical model for PCIe and pcie-bench, an open-source suite that allows developers to gain an accurate and deep understanding of the PCIe substrate. Using pcie-bench, we characterize the PCIe subsystem in modern servers. We highlight surprising differences between PCIe implementations, evaluate the undesirable impact of PCIe features such as IOMMUs, and show the practical limits for common network cards operating at 40Gb/s and beyond. Furthermore, through pcie-bench we gained insights that guided software and future hardware architectures for both commercial and research-oriented network cards and DMA engines.
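The paper's central observation, that per-TLP overheads cap DMA goodput well below the link's raw rate, can be sketched with back-of-the-envelope numbers. The figures below (24 bytes of per-TLP overhead, a 256-byte MaxPayloadSize, Gen3 x8) are illustrative assumptions for the sketch, not values taken from the paper:

```python
import math

def pcie_effective_gbps(transfer_bytes, lanes=8, gts_per_lane=8.0,
                        mps=256, tlp_overhead=24):
    """Rough PCIe Gen3 goodput model (illustrative numbers, not measured).

    Each DMA transfer is split into TLPs of at most `mps` payload bytes;
    every TLP carries roughly `tlp_overhead` bytes of header, framing,
    sequence number, and CRC. Gen3 uses 128b/130b line coding.
    """
    raw_gbps = lanes * gts_per_lane * 128 / 130   # usable line rate
    tlps = math.ceil(transfer_bytes / mps)        # TLPs per transfer
    wire_bytes = transfer_bytes + tlps * tlp_overhead
    return raw_gbps * transfer_bytes / wire_bytes
```

Under these assumptions a 1500-byte DMA write achieves roughly 57 Gb/s of goodput on a nominally 64 Gb/s link, while a 64-byte descriptor write achieves under 46 Gb/s, which is why small-transaction overhead dominates at 40Gb/s and beyond.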
A hybrid network architecture for modular data centers
The emergence of the mega data center has changed the basic building block of ever larger data centers from a rack comprising tens of servers to a self-contained modular shipping container that holds up to a thousand servers. These self-contained modular blocks include networking, power, and cooling equipment in addition to servers. However, provisioning bandwidth between these containers at large scale is still a significant challenge. Traditional approaches to provisioning bandwidth use electrical packet switches with a scale-up architecture and are often highly oversubscribed. More recent proposals, such as those using Clos networks, promise full bisection bandwidth between servers, albeit at high cost and power consumption. We present Helios, a hybrid architecture for modular data centers that combines electrical packet switching and optical circuit switching in a single network and dynamically provisions bandwidth between the modular containers on demand. We investigate this design from an architectural standpoint by building a fully functional prototype and explore its implications for data center networks and the challenges it introduces. Our prototype shows the feasibility of building such a system and achieving high performance at considerably lower cost. Additionally, it uncovers several issues that pose new challenges for designing large data center networks, and problems that arise when circuits are rapidly reconfigured.
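The core scheduling decision in such a hybrid design, choosing which container pairs receive an optical circuit, can be illustrated with a toy greedy assignment over an inter-container demand matrix. A real scheduler would compute a max-weight matching; the greedy rule and the container names below are simplifications for illustration:

```python
def greedy_circuits(demands):
    """Greedily give each optical circuit to the largest remaining
    inter-container demand whose endpoints are still free.

    `demands` maps (container_a, container_b) pairs to demand estimates.
    This is a stand-in for the max-weight matching a production
    scheduler would solve; it only illustrates the shape of the problem.
    """
    used = set()
    circuits = []
    for (a, b), _ in sorted(demands.items(), key=lambda kv: -kv[1]):
        if a not in used and b not in used:   # each container has one port
            circuits.append((a, b))
            used.update((a, b))
    return circuits
```

For example, with demands {(A,B): 10, (A,C): 7, (B,C): 5, (C,D): 4}, the greedy pass assigns circuits A-B and C-D; the remaining demand must fall back to the electrical packet-switched network.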
Network Performance Improvements for Web Services: An End-to-End View
Modern web services are complex systems with several components that impose stringent performance requirements on the network. The networking subsystem in turn consists of several pieces, such as the wide area and data center networks, different devices, and protocols involved in a user's interaction with a web service. In this dissertation we take a holistic view of the network and improve efficiency and functionality across the stack. We identify three important networking challenges faced by web services in the wide area network, the data center network, and the host network stack, and present solutions. First, web services are dominated by short TCP flows that terminate in as few as 2-3 round trips. Thus, an additional round trip for TCP's connection handshake adds a significant latency overhead. We present TCP Fast Open, a transport protocol enhancement that enables safe data exchange during TCP's initial handshake, thereby reducing application network latency by a full round trip time. TCP Fast Open uses a security token to verify client IP address ownership, and mitigates the security considerations that arise from allowing data exchange during the handshake. TCP Fast Open is widely deployed and has been available in the Linux kernel since version 3.6. Second, provisioning network bandwidth for hundreds of thousands of servers in the data center is expensive. Traditional shortest-path routing protocols are unable to effectively utilize the underlying topology's capacity to maximize network utilization. We present Dahu, a commodity switch design targeted at data centers, that avoids congestion hot-spots by dynamically spreading traffic uniformly across links, and actively leverages non-shortest paths for traffic forwarding. Third, scalable rate limiting is an important primitive for managing server network resources in the data center. Unfortunately, software-based rate limiting suffers from limited accuracy and high CPU overhead at high link speeds, whereas current NICs support only a few tens of hardware rate limiters. We present SENIC, a NIC design that natively supports tens of thousands of rate limiters -- 100x to 1000x the number available in NICs today -- to meet the needs of network performance isolation and congestion control in data centers.
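The security token mentioned for TCP Fast Open is a server-issued cookie bound to the client's IP address. RFC 7413 leaves the cookie construction to the server; binding it to the source address with a keyed MAC is one natural choice (the Linux implementation actually uses a block cipher over the IP). A minimal illustrative sketch, with an assumed secret and helper names:

```python
import hashlib
import hmac
import socket

SECRET = b"rotate-me-regularly"   # server-side key (illustrative value)

def tfo_cookie(client_ip):
    """Derive an 8-byte Fast Open cookie bound to the client's IPv4 address.

    A client that presents a valid cookie proves it previously completed a
    handshake from this address, so the server can accept data in the SYN.
    """
    ip_bytes = socket.inet_aton(client_ip)
    return hmac.new(SECRET, ip_bytes, hashlib.sha256).digest()[:8]

def cookie_valid(client_ip, cookie):
    # Constant-time comparison to avoid leaking the cookie via timing.
    return hmac.compare_digest(tfo_cookie(client_ip), cookie)
```

A spoofed SYN carrying data from a different source address fails validation, which is how the design mitigates the amplification and resource-exhaustion concerns of data-in-handshake.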
NetShare: Virtualizing Data Center Networks across Services
Data centers lower costs by sharing the physical infrastructure
among multiple services. However, the data center network should also ideally
provide bandwidth guarantees to each service in a tunable manner while
maintaining high utilization. We describe NetShare, a new statistical
multiplexing mechanism for data center networks that does this without
requiring changes to existing routers. NetShare allows the bisection bandwidth
of the network to be allocated across services based on simple weights
specified by a manager. Bandwidth unused by a service is shared
proportionately by other services. More precisely, NetShare provides weighted
hierarchical max-min fair sharing, a generalization of hierarchical fair
queuing of individual links. We present three mechanisms to implement
NetShare including one that leverages TCP flows and requires no changes to
routers or servers. We show experiments using multiple Hadoop instances and a
network of Fulcrum switches and show that the instances can interfere without
NetShare and yet complete faster with NetShare when compared to the alternative
of static reservation.
Pre-2018 CSE ID: CS2010-095
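On a single link, the weighted max-min fair sharing that NetShare generalizes reduces to classic weighted water-filling: each service receives bandwidth in proportion to its weight, and anything a service cannot use is redistributed to the others. A minimal flat, single-link sketch (the hierarchy and the TCP-based mechanism are beyond this toy):

```python
def weighted_max_min(capacity, weights, demands):
    """Weighted max-min fair allocation of one link's capacity.

    Repeatedly offer each active service its weight-proportional share;
    services whose remaining demand fits inside their share are satisfied
    and removed, and their unused bandwidth is redistributed.
    """
    alloc = {s: 0.0 for s in weights}
    active = set(weights)
    cap = capacity
    while active and cap > 1e-9:
        wsum = sum(weights[s] for s in active)
        bottlenecked = {s for s in active
                        if demands[s] - alloc[s] <= cap * weights[s] / wsum}
        if not bottlenecked:
            # Everyone can use their full share: split the rest and stop.
            for s in active:
                alloc[s] += cap * weights[s] / wsum
            cap = 0.0
        else:
            for s in bottlenecked:
                cap -= demands[s] - alloc[s]
                alloc[s] = demands[s]
                active.discard(s)
    return alloc
```

For instance, on a 10 Gb/s link with weights A:1, B:1, C:2 and A demanding only 2 Gb/s, A gets its 2 Gb/s and the remaining 8 Gb/s splits 1:2 between B and C, exactly the "bandwidth unused by a service is shared proportionately" behavior described above.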
Dahu: Improved Data Center Multipath Forwarding
Solving "Big Data" problems requires bridging massive quantities of
compute, memory, and storage, which requires significant amounts of bisection
bandwidth. Topologies like Fat-tree, VL2, and HyperX achieve a scale-out design
by leveraging multiple paths from source to destination. However, traditional
routing protocols are not able to effectively utilize these links while also
harnessing spare network capacity to better statistically multiplex network
resources. In this work we present Dahu, a switch mechanism for efficiently
load balancing traffic in multipath networks. Dahu avoids congestion hot-spots
by dynamically spreading traffic uniformly across links, and forwarding traffic
over non-minimal paths where possible. By performing load balancing primarily
using local information, Dahu can act more quickly than centralized approaches,
and responds to failure gracefully. Our evaluation shows that Dahu delivers up
to 50% higher throughput relative to ECMP in an 8,192 server Fat-tree network
and up to 500% improvement in throughput in large scale HyperX networks with
over 130,000 servers.
Pre-2018 CSE ID: CS2013-099
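Dahu's reliance on local information can be caricatured as a per-switch egress-port choice: prefer the least-loaded shortest-path port, but detour onto a non-minimal port when it is markedly less congested. The decision rule and the penalty factor below are illustrative stand-ins, not Dahu's actual heuristic:

```python
def pick_port(shortest_ports, all_ports, queue_len, detour_penalty=2):
    """Choose an egress port from local queue occupancy alone.

    Non-minimal ports consume extra network capacity, so a detour is
    taken only when the alternative queue, scaled by `detour_penalty`,
    is still shorter than the best shortest-path queue.
    """
    best_min = min(shortest_ports, key=lambda p: queue_len[p])
    others = [p for p in all_ports if p not in shortest_ports]
    if others:
        best_alt = min(others, key=lambda p: queue_len[p])
        if queue_len[best_alt] * detour_penalty < queue_len[best_min]:
            return best_alt
    return best_min
```

Because the decision uses only queue lengths the switch already observes, it reacts within a round trip rather than waiting on a central controller, which is the locality argument the abstract makes.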
Hedera: Dynamic flow scheduling for data center networks
Today’s data centers offer tremendous aggregate bandwidth to clusters of tens of thousands of machines. However, because of limited port densities in even the highest-end switches, data center topologies typically consist of multi-rooted trees with many equal-cost paths between any given pair of hosts. Existing IP multipathing protocols usually rely on per-flow static hashing and can cause substantial bandwidth losses due to long-term collisions. In this paper, we present Hedera, a scalable, dynamic flow scheduling system that adaptively schedules a multi-stage switching fabric to efficiently utilize aggregate network resources. We describe our implementation using commodity switches and unmodified hosts, and show that for a simulated 8,192-host data center, Hedera delivers bisection bandwidth that is 96% of optimal and up to 113% better than static load-balancing methods.
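The collision problem with per-flow static hashing is easy to reproduce: with more flows than equal-cost paths, the pigeonhole principle guarantees some path carries at least two flows, and because the hash is static the collision persists for the flows' lifetimes. A small sketch, where CRC32 stands in for whatever hash function a real switch ASIC uses:

```python
import zlib

def ecmp_path(flow, n_paths=4):
    """Static per-flow ECMP hashing: the 5-tuple is hashed once, so a
    flow sticks to the same path for its entire lifetime."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_paths

# Eight synthetic 5-tuples (src, dst, proto, sport, dport) over 4 paths.
flows = [("10.0.0.%d" % i, "10.0.1.%d" % (i % 3), 6, 40000 + i, 80)
         for i in range(8)]
placement = {}
for f in flows:
    placement.setdefault(ecmp_path(f), []).append(f)
```

If two of the colliding flows happen to be long-lived elephants, the shared path saturates while sibling paths sit idle; Hedera's contribution is detecting such flows and rescheduling them onto the underutilized paths.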
Genetic Algorithms for Level Control in a Real Time Process
Measurement of level, temperature, pressure, and flow parameters is vital in all process industries. A combination of a few transducers with a controller, forming a closed-loop system, leads to a stable and effective process. Level control of a spherical tank is a complex issue because of the non-linear nature of the tank. A model for this real-time process is identified and validated. The need for improved performance of the process has led to the development of optimal controllers, and the Genetic Algorithm (GA), an evolutionary algorithm, is proposed for this purpose. Determination, or tuning, of the Proportional-Integral (PI) parameters remains important, as these parameters have a great influence on the stability and performance of the control system. The methodology and efficiency of the proposed method are compared with those of Internal Model Control (IMC), and the proposed method proves to be better.
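The GA-based PI tuning described above can be sketched as a search over the gain pair (Kp, Ki). The cost surface below is a surrogate stand-in (a real implementation would evaluate, for example, the integral squared error of the identified tank model under each candidate controller), and the gain ranges, optimum location, and GA parameters are all assumed for illustration:

```python
import random

def pi_cost(kp, ki):
    # Surrogate performance index with an assumed optimum at
    # kp = 2.0, ki = 0.5; a real run would simulate the tank loop.
    return (kp - 2.0) ** 2 + 4.0 * (ki - 0.5) ** 2

def tune_pi(generations=40, pop_size=30, seed=1):
    """Tune (Kp, Ki) with a minimal GA: truncation selection,
    blend crossover, Gaussian mutation, and elitism."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0, 5), rng.uniform(0, 2)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: pi_cost(*g))
        parents = pop[: pop_size // 2]            # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            w = rng.random()                       # blend crossover
            kp = w * a[0] + (1 - w) * b[0] + rng.gauss(0, 0.1)
            ki = w * a[1] + (1 - w) * b[1] + rng.gauss(0, 0.1)
            # Clip mutated gains back into the assumed search ranges.
            children.append((min(max(kp, 0), 5), min(max(ki, 0), 2)))
        pop = parents + children                   # elitism: best survive
    return min(pop, key=lambda g: pi_cost(*g))
```

Elitism guarantees the best candidate never regresses between generations, which is why GA tuning gives the monotone performance improvement the abstract compares against IMC.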