Search CORE

1,932 research outputs found

Enabling Work-conserving Bandwidth Guarantees for Multi-tenant Datacenters via Dynamic Tenant-Queue Binding

Author: Chen Kai
Hu Shuihai
Hu Yih-Chun
Liu Zhuotao
Wang Yi
Wu Haitao
Zhang Gong
Publication venue
Publication date: 07/01/2018
Field of study

Today's cloud networks are shared among many tenants. Bandwidth guarantees and work conservation are two key properties to ensure predictable performance for tenant applications and high network utilization for providers. Despite significant efforts, very little prior work can really achieve both properties simultaneously even some of them claimed so. In this paper, we present QShare, an in-network based solution to achieve bandwidth guarantees and work conservation simultaneously. QShare leverages weighted fair queuing on commodity switches to slice network bandwidth for tenants, and solves the challenge of queue scarcity through balanced tenant placement and dynamic tenant-queue binding. QShare is readily implementable with existing switching chips. We have implemented a QShare prototype and evaluated it via both testbed experiments and simulations. Our results show that QShare ensures bandwidth guarantees while driving network utilization to over 91% even under unpredictable traffic demands.Comment: The initial work is published in IEEE INFOCOM 201

arXiv.org e-Print Archive

Crossref

CASPR: Judiciously Using the Cloud for Wide-Area Packet Recovery

Author: Byers John W.
Dogar Fahad R.
Doucette Cody
Haq Osama
Publication venue
Publication date: 01/01/2019
Field of study

We revisit a classic networking problem -- how to recover from lost packets in the best-effort Internet. We propose CASPR, a system that judiciously leverages the cloud to recover from lost or delayed packets. CASPR supplements and protects best-effort connections by sending a small number of coded packets along the highly reliable but expensive cloud paths. When receivers detect packet loss, they recover packets with the help of the nearby data center, not the sender, thus providing quick and reliable packet recovery for latency-sensitive applications. Using a prototype implementation and its deployment on the public cloud and the PlanetLab testbed, we quantify the benefits of CASPR in providing fast, cost effective packet recovery. Using controlled experiments, we also explore how these benefits translate into improvements up and down the network stack

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Fog-supported delay-constrained energy-saving live migration of VMs over multiPath TCP/IP 5G connections

Author: Baccarelli Enzo
Momenzadeh Alireza
Scarpiniti Michele
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

The incoming era of the fifth-generation fog computing-supported radio access networks (shortly, 5G FOGRANs) aims at exploiting computing/networking resource virtualization, in order to augment the limited resources of wireless devices through the seamless live migration of virtual machines (VMs) toward nearby fog data centers. For this purpose, the bandwidths of the multiple wireless network interface cards of the wireless devices may be aggregated under the control of the emerging MultiPathTCP (MPTCP) protocol. However, due to the fading and mobility-induced phenomena, the energy consumptions of the current state-of-the-art VM migration techniques may still offset their expected benefits. Motivated by these considerations, in this paper, we analytically characterize and implement in software and numerically test the optimal minimum-energy settable-complexity bandwidth manager (SCBM) for the live migration of VMs over 5G FOGRAN MPTCP connections. The key features of the proposed SCBM are that: 1) its implementation complexity is settable on-line on the basis of the target energy consumption versus implementation complexity tradeoff; 2) it minimizes the network energy consumed by the wireless device for sustaining the migration process under hard constraints on the tolerated migration times and downtimes; and 3) by leveraging a suitably designed adaptive mechanism, it is capable to quickly react to (possibly, unpredicted) fading and/or mobility-induced abrupt changes of the wireless environment without requiring forecasting. The actual effectiveness of the proposed SCBM is supported by extensive energy versus delay performance comparisons that cover: 1) a number of heterogeneous 3G/4G/WiFi FOGRAN scenarios; 2) synthetic and real-world workloads; and, 3) MPTCP and wireless connections

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Cloud-efficient modelling and simulation of magnetic nano materials

Author: Ivanović Pavle
Publication venue
Publication date: 01/01/2021
Field of study

Scientific simulations are rarely attempted in a cloud due to the substantial performance costs of virtualization. Considerable communication overheads, intolerable latencies, and inefficient hardware emulation are the main reasons why this emerging technology has not been fully exploited. On the other hand, the progress of computing infrastructure nowadays is strongly dependent on perspective storage medium development, where efficient micromagnetic simulations play a vital role in future memory design. This thesis addresses both these topics by merging micromagnetic simulations with the latest OpenStack cloud implementation while providing a time and costeffective alternative to expensive computing centers. However, many challenges have to be addressed before a high-performance cloud platform emerges as a solution for problems in micromagnetic research communities. First, the best solver candidate has to be selected and further improved, particularly in the parallelization and process communication domain. Second, a 3-level cloud communication hierarchy needs to be recognized and each segment adequately addressed. The required steps include breaking the VMisolation for the host’s shared memory activation, cloud network-stack tuning, optimization, and efficient communication hardware integration. The project work concludes with practical measurements and confirmation of successfully implemented simulation into an open-source cloud environment. It is achieved that the renewed Magpar solver runs for the first time in the OpenStack cloud by using ivshmem for shared memory communication. Also, extensive measurements proved the effectiveness of our solutions, yielding from sixty percent to over ten times better results than those achieved in the standard cloud.Aufgrund der erheblichen Leistungskosten der Virtualisierung werden wissenschaftliche Simulationen in einer Cloud selten versucht. Beträchtlicher Kommunikationsaufwand, erhebliche Latenzen und ineffiziente Hardwareemulation sind die Hauptgründe, warum diese aufkommende Technologie nicht vollständig genutzt wurde. Andererseits hängt der Fortschritt der Computertechnologie heutzutage stark von der Entwicklung perspektivischer Speichermedien ab, bei denen effiziente mikromagnetische Simulationen eine wichtige Rolle für die zukünftige Speichertechnologie spielen. Diese Arbeit befasst sich mit diesen beiden Themen, indem mikromagnetische Simulationen mit der neuesten OpenStack Cloud-Implementierung zusammengeführt werden, um eine zeit- und kostengünstige Alternative zu teuren Rechenzentren bereitzustellen. Viele Herausforderungen müssen jedoch angegangen werden, bevor eine leistungsstarke Cloud-Plattform als Lösung für Probleme in mikromagnetischen Forschungsgemeinschaften entsteht. Zunächst muss der beste Kandidat für die Lösung ausgewählt und weiter verbessert werden, insbesondere im Bereich der Parallelisierung und Prozesskommunikation. Zweitens muss eine 3-stufige CloudKommunikationshierarchie erkannt und jedes Segment angemessen adressiert werden. Die erforderlichen Schritte umfassen das Aufheben der VM-Isolation, um den gemeinsam genutzten Speicher zwischen Cloud-Instanzen zu aktivieren, die Optimierung des Cloud-Netzwerkstapels und die effiziente Integration von Kommunikationshardware. Die praktische Arbeit endet mit Messungen und der Bestätigung einer erfolgreich implementierten Simulation in einer Open-Source Cloud-Umgebung. Als Ergebnis haben wir erreicht, dass der neu erstellte Magpar-Solver zum ersten Mal in der OpenStack Cloud ausgeführt wird, indem ivshmem für die Shared-Memory Kommunikation verwendet wird. Umfangreiche Messungen haben auch die Wirksamkeit unserer Lösungen bewiesen und von sechzig Prozent bis zu zehnmal besseren Ergebnissen als in der Standard Cloud geführt

Publikationsserver der Technischen Universität Clausthal

Investigating Emerging Security Threats in Clouds and Data Centers

Author: Gao Xing
Publication venue: W&M ScholarWorks
Publication date: 09/07/2018
Field of study

Data centers have been growing rapidly in recent years to meet the surging demand of cloud services. However, the expanding scale of a data center also brings new security threats. This dissertation studies emerging security issues in clouds and data centers from different aspects, including low-level cooling infrastructures and different virtualization techniques such as container and virtual machine (VM). We first unveil a new vulnerability called reduced cooling redundancy that might be exploited to launch thermal attacks, resulting in severely worsened thermal conditions in a data center. Such a vulnerability is caused by the wide adoption of aggressive cooling energy saving policies. We conduct thermal measurements and uncover effective thermal attack vectors at the server, rack, and data center levels. We also present damage assessments of thermal attacks. Our results demonstrate that thermal attacks can negatively impact the thermal conditions and reliability of victim servers, significantly raise the cooling cost, and even lead to cooling failures. Finally, we propose effective defenses to mitigate thermal attacks. We then perform a systematic study to understand the security implications of the information leakage in multi-tenancy container cloud services. Due to the incomplete implementation of system resource isolation mechanisms in the Linux kernel, a spectrum of system-wide host information is exposed to the containers, including host-system state information and individual process execution information. By exploiting such leaked host information, malicious adversaries can easily launch advanced attacks that can seriously affect the reliability of cloud services. Additionally, we discuss the root causes of the containers\u27 information leakage and propose a two-stage defense approach. The experimental results show that our defense is effective and incurs trivial performance overhead. Finally, we investigate security issues in the existing VM live migration approaches, especially the post-copy approach. While the entire live migration process relies upon reliable TCP connectivity for the transfer of the VM state, we demonstrate that the loss of TCP reliability leads to VM live migration failure. By intentionally aborting the TCP connection, attackers can cause unrecoverable memory inconsistency for post-copy, significantly increase service downtime, and degrade the running VM\u27s performance. From the offensive side, we present detailed techniques to reset the migration connection under heavy networking traffic. From the defensive side, we also propose effective protection to secure the live migration procedure

College of William & Mary: W&M Publish

PABO: Mitigating Congestion via Packet Bounce in Data Center Networks

Author: Liu Zhiyong
Mühlhäuser Max
Shi Xiang
Wang Lin
Zhang Fa
Zheng Kai
Publication venue
Publication date: 03/08/2018
Field of study

In today's data center, a diverse mix of throughput-sensitive long flows and delay-sensitive short flows are commonly presented in shallow-buffered switches. Long flows could potentially block the transmission of delay-sensitive short flows, leading to degraded performance. Congestion can also be caused by the synchronization of multiple TCP connections for short flows, as typically seen in the partition/aggregate traffic pattern. While multiple end-to-end transport-layer solutions have been proposed, none of them have tackled the real challenge: reliable transmission in the network. In this paper, we fill this gap by presenting PABO -- a novel link-layer design that can mitigate congestion by temporarily bouncing packets to upstream switches. PABO's design fulfills the following goals: i) providing per-flow based flow control on the link layer, ii) handling transient congestion without the intervention of end devices, and iii) gradually back propagating the congestion signal to the source when the network is not capable to handle the congestion.Experiment results show that PABO can provide prominent advantage of mitigating transient congestions and can achieve significant gain on end-to-end delay

arXiv.org e-Print Archive

VU Research Portal

Optimizing the cloud? Don't train models. Build oracles!

Author: Ameli Siavash
Bang Tiemo
Crooks Natacha
Hellerstein Joseph M.
Power Conor
Publication venue
Publication date: 13/08/2023
Field of study

We propose cloud oracles, an alternative to machine learning for online optimization of cloud configurations. Our cloud oracle approach guarantees complete accuracy and explainability of decisions for problems that can be formulated as parametric convex optimizations. We give experimental evidence of this technique's efficacy and share a vision of research directions for expanding its applicability.Comment: Initial conference submission limited to 6 page

arXiv.org e-Print Archive