Search CORE

1,601 research outputs found

Enabling Work-conserving Bandwidth Guarantees for Multi-tenant Datacenters via Dynamic Tenant-Queue Binding

Author: Chen Kai
Hu Shuihai
Hu Yih-Chun
Liu Zhuotao
Wang Yi
Wu Haitao
Zhang Gong
Publication venue
Publication date: 07/01/2018
Field of study

Today's cloud networks are shared among many tenants. Bandwidth guarantees and work conservation are two key properties to ensure predictable performance for tenant applications and high network utilization for providers. Despite significant efforts, very little prior work can really achieve both properties simultaneously even some of them claimed so. In this paper, we present QShare, an in-network based solution to achieve bandwidth guarantees and work conservation simultaneously. QShare leverages weighted fair queuing on commodity switches to slice network bandwidth for tenants, and solves the challenge of queue scarcity through balanced tenant placement and dynamic tenant-queue binding. QShare is readily implementable with existing switching chips. We have implemented a QShare prototype and evaluated it via both testbed experiments and simulations. Our results show that QShare ensures bandwidth guarantees while driving network utilization to over 91% even under unpredictable traffic demands.Comment: The initial work is published in IEEE INFOCOM 201

arXiv.org e-Print Archive

Crossref

End-to-end informed VM selection in compute clouds

Author: Bestavros Azer
Teixeira Mario
Publication venue: Computer Science Department, Boston University
Publication date: 10/11/2014
Field of study

The selection of resources, particularly VMs, in current public IaaS clouds is usually done in a blind fashion, as cloud users do not have much information about resource consumption by co-tenant third-party tasks. In particular, communication patterns can play a significant part in cloud application performance and responsiveness, specially in the case of novel latencysensitive applications, increasingly common in today’s clouds. Thus, herein we propose an end-to-end approach to the VM allocation problem using policies based uniquely on round-trip time measurements between VMs. Those become part of a userlevel ‘Recommender Service’ that receives VM allocation requests with certain network-related demands and matches them to a suitable subset of VMs available to the user within the cloud. We propose and implement end-to-end algorithms for VM selection that cover desirable profiles of communications between VMs in distributed applications in a cloud setting, such as profiles with prevailing pair-wise, hub-and-spokes, or clustered communication patterns between constituent VMs. We quantify the expected benefits from deploying our Recommender Service by comparing our informed VM allocation approaches to conventional, random allocation methods, based on real measurements of latencies between Amazon EC2 instances. We also show that our approach is completely independent from cloud architecture details, is adaptable to different types of applications and workloads, and is lightweight and transparent to cloud providers.This work is supported in part by the National Science Foundation under grant CNS-0963974

Boston University Institutional Repository (OpenBU)

Quantifying the latency benefits of near-edge and in-network FPGA acceleration

Author: Andrew Putnam
Fahmy Suhaib A
Kwon Hyoukjun
Neugebauer Rolf
Noa
Shreejith Shanker
Vipin Kizheppatt
Wenxiao
Yi Shanhe
Zhao
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/04/2020
Field of study

Transmitting data to cloud datacenters in distributed IoT applications introduces significant communication latency, but is often the only feasible solution when source nodes are computationally limited. To address latency concerns, cloudlets, in-network computing, and more capable edge nodes are all being explored as a way of moving processing capability towards the edge of the network. Hardware acceleration using Field Programmable Gate Arrays (FPGAs) is also seeing increased interest due to reduced computation latency and improved efficiency. This paper evaluates the the implications of these offloading approaches using a case study neural network based image classification application, quantifying both the computation and communication latency resulting from different platform choices. We consider communication latency including the ingestion of packets for processing on the target platform, showing that this varies significantly with the choice of platform. We demonstrate that emerging in-network accelerator approaches offer much improved and predictable performance as well as better scaling to support multiple data sources

Crossref

Warwick Research Archives Portal Repository