1,054 research outputs found

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Get PDF
    Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it deems necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters which have been receiving increasing attention recently and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial

    Reducing Internet Latency : A Survey of Techniques and their Merit

    Get PDF
    Bob Briscoe, Anna Brunstrom, Andreas Petlund, David Hayes, David Ros, Ing-Jyh Tsang, Stein Gjessing, Gorry Fairhurst, Carsten Griwodz, Michael WelzlPeer reviewedPreprin

    Experience-driven Control For Networking And Computing

    Get PDF
    Modern networking and computing systems have become very complicated and highly dynamic, which makes them hard to model, predict and control. In this thesis, we aim to study system control problems from a whole new perspective by leveraging emerging Deep Reinforcement Learning (DRL), to develop experience-driven model-free approaches, which enable a network or a device to learn the best way to control itself from its own experience (e.g., runtime statistics data) rather than from accurate mathematical models, just as a human learns a new skill (e.g., driving, swimming, etc). To demonstrate the feasibility and superiority of this experience-driven control design philosophy, we present the design, implementation, and evaluation of multiple DRL-based control frameworks on two fundamental networking problems, Traffic Engineering (TE) and Multi-Path TCP (MPTCP) congestion control, as well as one cutting-edge application, resource co-scheduling for Deep Neural Network (DNN) models on mobile and edge devices with heterogeneous hardware. We first propose DRL-TE, a DRL-based framework that enables experience-driven networking for TE. DRL-TE maximizes a widely-used utility function by jointly learning network environment and its dynamics, and making decisions under the guidance of powerful DNNs. We propose two new techniques, TE-aware exploration and actor-critic-based prioritized experience replay, to optimize the general DRL framework particularly for TE. Furthermore, we propose an Actor-Critic-based Transfer learning framework for TE, ACT-TE, which solves a practical problem in experience-driven networking: when network configurations are changed, how to train a new DRL agent to effectively and quickly adapt to the new environment. In the new network environment, ACT-TE leverages policy distillation to rapidly learn a new control policy from both old knowledge (i.e., distilled from the existing agent) and new experience (i.e., newly collected samples). In addition, we propose DRL-CC to enable experience-driven congestion control for MPTCP. DRL-CC utilizes a single (instead of multiple independent) DRL agent to dynamically and jointly perform congestion control for all active MPTCP flows on an end host with the objective of maximizing the overall utility. The novelty of our design is to utilize a flexible recurrent neural network, LSTM, under a DRL framework for learning a representation for all active flows and dealing with their dynamics. Moreover, we integrate the above LSTM-based representation network into an actor-critic framework for continuous congestion control, which applies the deterministic policy gradient method to train actor, critic, and LSTM networks in an end-to-end manner. With the emergence of more and more powerful chipsets and hardware and the rise of Artificial Intelligence of Things (AIoT), there is a growing trend for bringing DNN models to empower mobile and edge devices with intelligence such that they can support attractive AI applications on the edge in a real-time or near real-time manner. To leverage heterogeneous computational resources (such as CPU, GPU, DSP, etc) to effectively and efficiently support concurrent inference of multiple DNN models on a mobile or edge device, in the last part of this thesis, we propose a novel experience-driven control framework for resource co-scheduling, which we call COSREL. COSREL has the following desirable features: 1) it achieves significant speedup over commonly-used methods by efficiently utilizing all the computational resources on heterogeneous hardware; 2) it leverages DRL to make dynamic and wise online scheduling decisions based on system runtime state; 3) it is capable of making a good tradeoff among inference latency, throughput and energy efficiency; and 4) it makes no changes to given DNN models, thus preserves their accuracies. To validate and evaluate the proposed frameworks, we conduct extensive experiments on packet-level simulation (for TE), testbed with modified Linux kernel (for MPTCP), and off-the-shelf Android devices (for resource co-scheduling). The results well justify the effectiveness of these frameworks, as well as their superiority over several baseline methods

    Enhancing QUIC over Satellite Networks

    Get PDF
    The use of Satellite Communication (SATCOM) networks for broadband connectivity has recently seen an increase in popularity due to, among other factors, the rise of the latest generations of cellular networks (5G/6G) and the deployment of high-throughput satellites. In parallel, major advances have been witnessed in the context of the transport layer: first, the standardization and early deployment of QUIC, a new-generation and general-purpose transport protocol; and second, modern congestion control proposals such as the Bottleneck Bandwidth and Round-trip propagation time (BBR) algorithm. Even though satellite links introduce several challenges for transport layer mechanisms, mainly due to their long propagation delay, satellite Internet providers have relied on TCP connection-splitting solutions implemented by Performance-Enhancing Proxies (PEPs) to greatly overcome many of these challenges. However, due to QUIC's fully encrypted nature, these performance-boosting solutions become nearly impossible for QUIC traffic, leaving it in great disadvantage when competing against TCP-PEP. In this context, IETF QUIC WG contributors are currently investigating this matter and suggesting new solutions that can help improve QUIC's performance over SATCOM. This thesis aims to study some of these proposals and evaluate them through experimentation using a real network testbed and an emulated satellite link

    Enhancing Networks via Virtualized Network Functions

    Get PDF
    University of Minnesota Ph.D. dissertation. May 2019. Major: Computer Science. Advisor: Zhi-Li Zhang. 1 computer file (PDF); xii, 116 pages.In an era of ubiquitous connectivity, various new applications, network protocols, and online services (e.g., cloud services, distributed machine learning, cryptocurrency) have been constantly creating, underpinning many of our daily activities. Emerging demands for networks have led to growing traffic volume and complexity of modern networks, which heavily rely on a wide spectrum of specialized network functions (e.g., Firewall, Load Balancer) for performance, security, etc. Although (virtual) network functions (VNFs) are widely deployed in networks, they are instantiated in an uncoordinated manner failing to meet growing demands of evolving networks. In this dissertation, we argue that networks equipped with VNFs can be designed in a fashion similar to how computer software is today programmed. By following the blueprint of joint design over VNFs, networks can be made more effective and efficient. We begin by presenting Durga, a system fusing wide area network (WAN) virtualization on gateway with local area network (LAN) virtualization technology. It seamlessly aggregates multiple WAN links into a (virtual) big pipe for better utilizing WAN links and also provides fast fail-over thus minimizing application performance degradation under WAN link failures. Without the support from LAN virtualization technology, existing solutions fail to provide high reliability and performance required by today’s enterprise applications. We then study a newly standardized protocol, Multipath TCP (MPTCP), adopted in Durga, showing the challenge of associating MPTCP subflows in network for the purpose of boosting throughput and enhancing security. Instead of designing a customized solution in every VNF to conquer this common challenge (making VNFs aware of MPTCP), we implement an online service named SAMPO to be readily integrated into VNFs. Following the same principle, we make an attempt to take consensus as a service in software-defined networks. We illustrate new network failure scenarios that are not explicitly handled by existing consensus algorithms such as Raft, thereby severely affecting their correct or efficient operations. Finally, we re-consider VNFs deployed in a network from the perspective of network administrators. A global view of deployed VNFs brings new opportunities for performance optimization over the network, and thus we explore parallelism in service function chains composing a sequence of VNFs that are typically traversed in-order by data flows

    Livenet: A low-latency video transport network for large-scale live streaming

    Get PDF
    Low-latency live streaming has imposed stringent latency requirements on video transport networks. In this paper we report on the design and operation of the Alibaba low-latency video transport network, LiveNet. LiveNet builds on a flat CDN overlay with a centralized controller for global optimization. As part of this, we present our design of the global routing computation and path assignment, as well as our fast data transmission architecture with fine-grained control of video frames. The performance results obtained from three years of operation demonstrate the effectiveness of LiveNet in improving CDN performance and QoE metrics. Compared with our prior state-of-The-Art hierarchical CDN deployment, LiveNet halves the CDN delay and ensures 98% of views do not experience stalls and that 95% can start playback within 1 second. We further report our experiences of running LiveNet over the last 3 years

    Systems and Methods for Measuring and Improving End-User Application Performance on Mobile Devices

    Full text link
    In today's rapidly growing smartphone society, the time users are spending on their smartphones is continuing to grow and mobile applications are becoming the primary medium for providing services and content to users. With such fast paced growth in smart-phone usage, cellular carriers and internet service providers continuously upgrade their infrastructure to the latest technologies and expand their capacities to improve the performance and reliability of their network and to satisfy exploding user demand for mobile data. On the other side of the spectrum, content providers and e-commerce companies adopt the latest protocols and techniques to provide smooth and feature-rich user experiences on their applications. To ensure a good quality of experience, monitoring how applications perform on users' devices is necessary. Often, network and content providers lack such visibility into the end-user application performance. In this dissertation, we demonstrate that having visibility into the end-user perceived performance, through system design for efficient and coordinated active and passive measurements of end-user application and network performance, is crucial for detecting, diagnosing, and addressing performance problems on mobile devices. My dissertation consists of three projects to support this statement. First, to provide such continuous monitoring on smartphones with constrained resources that operate in such a highly dynamic mobile environment, we devise efficient, adaptive, and coordinated systems, as a platform, for active and passive measurements of end-user performance. Second, using this platform and other passive data collection techniques, we conduct an in-depth user trial of mobile multipath to understand how Multipath TCP (MPTCP) performs in practice. Our measurement study reveals several limitations of MPTCP. Based on the insights gained from our measurement study, we propose two different schemes to address the identified limitations of MPTCP. Last, we show how to provide visibility into the end- user application performance for internet providers and in particular home WiFi routers by passively monitoring users' traffic and utilizing per-app models mapping various network quality of service (QoS) metrics to the application performance.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/146014/1/ashnik_1.pd
    corecore