Enabling Work-conserving Bandwidth Guarantees for Multi-tenant Datacenters via Dynamic Tenant-Queue Binding
Today's cloud networks are shared among many tenants. Bandwidth guarantees
and work conservation are two key properties to ensure predictable performance
for tenant applications and high network utilization for providers. Despite
significant efforts, very little prior work achieves both properties
simultaneously, even though some of it claims to.
In this paper, we present QShare, an in-network solution to achieve
bandwidth guarantees and work conservation simultaneously. QShare leverages
weighted fair queuing on commodity switches to slice network bandwidth for
tenants, and solves the challenge of queue scarcity through balanced tenant
placement and dynamic tenant-queue binding. QShare is readily implementable
with existing switching chips. We have implemented a QShare prototype and
evaluated it via both testbed experiments and simulations. Our results show
that QShare ensures bandwidth guarantees while driving network utilization to
over 91% even under unpredictable traffic demands.
Comment: The initial work is published in IEEE INFOCOM 201
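The two properties named in this abstract can be illustrated with a small sketch. It is not QShare's mechanism (the abstract does not give the algorithm); it is a generic weighted max-min allocation on a single link, assuming each tenant's weight stands for its guarantee. The function name `weighted_share` and all inputs are hypothetical.

```python
def weighted_share(capacity, weights, demands):
    """Weighted max-min fair split of one link's capacity (illustrative only).

    Each tenant receives at least min(demand, its weighted share), which models
    a bandwidth guarantee; capacity left unused by low-demand tenants is handed
    to the others, which models work conservation.
    """
    alloc = {t: 0.0 for t in weights}
    active = {t for t in weights if demands[t] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        # Tenants whose residual demand fits inside their weighted share.
        satisfied = {
            t for t in active
            if demands[t] - alloc[t] <= remaining * weights[t] / total_w + 1e-12
        }
        if not satisfied:
            # Everyone is backlogged: split what is left by weight and stop.
            for t in active:
                alloc[t] += remaining * weights[t] / total_w
            break
        for t in satisfied:
            remaining -= demands[t] - alloc[t]
            alloc[t] = float(demands[t])
        active -= satisfied
    return alloc


if __name__ == "__main__":
    # Tenant A needs little, so B and C absorb its unused share by weight.
    print(weighted_share(10, {"A": 1, "B": 1, "C": 2}, {"A": 1, "B": 10, "C": 10}))
```

In the example, tenant A uses only 1 unit of its 2.5-unit share, and the remaining 9 units are split 1:2 between B and C, so the link stays fully used.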
CASPR: Judiciously Using the Cloud for Wide-Area Packet Recovery
We revisit a classic networking problem -- how to recover from lost packets
in the best-effort Internet. We propose CASPR, a system that judiciously
leverages the cloud to recover from lost or delayed packets. CASPR supplements
and protects best-effort connections by sending a small number of coded packets
along the highly reliable but expensive cloud paths. When receivers detect
packet loss, they recover packets with the help of the nearby data center, not
the sender, thus providing quick and reliable packet recovery for
latency-sensitive applications. Using a prototype implementation and its
deployment on the public cloud and the PlanetLab testbed, we quantify the
benefits of CASPR in providing fast, cost-effective packet recovery. Using
controlled experiments, we also explore how these benefits translate into
improvements up and down the network stack
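The idea of protecting a connection with a small number of coded packets can be sketched with the simplest possible code: a single XOR parity packet over a group of equal-length packets. This is an assumption made for illustration; the abstract does not say which coding scheme CASPR uses, and the helper names below are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build one coded packet: the XOR of all packets in the group."""
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Recover the single missing packet (marked None) from the parity packet.

    Returns (index, payload). XOR parity can repair exactly one loss per group.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity recovers exactly one missing packet")
    present = [p for p in received if p is not None]
    return missing[0], reduce(xor_bytes, present, parity)


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(group)          # would travel over the cloud path
    arrived = [group[0], None, group[2]]  # packet 1 lost on the direct path
    print(recover(arrived, parity))
```

In this toy setting the parity packet is the only traffic sent over the expensive path, and the receiver repairs the loss without contacting the sender.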
Fog-supported delay-constrained energy-saving live migration of VMs over multiPath TCP/IP 5G connections
The incoming era of fifth-generation fog computing-supported radio access networks (5G FOGRANs, for short) aims to exploit computing/networking resource virtualization in order to augment the limited resources of wireless devices through the seamless live migration of virtual machines (VMs) toward nearby fog data centers. For this purpose, the bandwidths of the multiple wireless network interface cards of the wireless devices may be aggregated under the control of the emerging MultiPathTCP (MPTCP) protocol. However, due to fading and mobility-induced phenomena, the energy consumption of current state-of-the-art VM migration techniques may still offset their expected benefits. Motivated by these considerations, in this paper we analytically characterize, implement in software, and numerically test the optimal minimum-energy settable-complexity bandwidth manager (SCBM) for the live migration of VMs over 5G FOGRAN MPTCP connections. The key features of the proposed SCBM are that: 1) its implementation complexity is settable online on the basis of the target tradeoff between energy consumption and implementation complexity; 2) it minimizes the network energy consumed by the wireless device for sustaining the migration process under hard constraints on the tolerated migration times and downtimes; and 3) by leveraging a suitably designed adaptive mechanism, it can react quickly to (possibly unpredicted) fading and/or mobility-induced abrupt changes of the wireless environment without requiring forecasting. The effectiveness of the proposed SCBM is supported by extensive energy versus delay performance comparisons that cover: 1) a number of heterogeneous 3G/4G/WiFi FOGRAN scenarios; 2) synthetic and real-world workloads; and 3) MPTCP and wireless connections
Cloud-efficient modelling and simulation of magnetic nano materials
Scientific simulations are rarely attempted in a cloud due to the substantial
performance costs of virtualization. Considerable communication overheads,
intolerable latencies, and inefficient hardware emulation are the main reasons why
this emerging technology has not been fully exploited. On the other hand, the
progress of computing infrastructure nowadays is strongly dependent on
prospective storage medium development, where efficient micromagnetic
simulations play a vital role in future memory design.
This thesis addresses both these topics by merging micromagnetic simulations
with the latest OpenStack cloud implementation while providing a time- and cost-effective alternative to expensive computing centers.
However, many challenges have to be addressed before a high-performance cloud
platform emerges as a solution for problems in micromagnetic research
communities. First, the best solver candidate has to be selected and further
improved, particularly in the parallelization and process communication domain.
Second, a 3-level cloud communication hierarchy needs to be recognized and
each segment adequately addressed. The required steps include breaking VM isolation to activate the host’s shared memory, cloud network-stack tuning,
optimization, and efficient communication hardware integration.
The project work concludes with practical measurements confirming that the
simulation was successfully implemented in an open-source cloud environment. As a
result, the renewed Magpar solver runs for the first time in the OpenStack
cloud, using ivshmem for shared memory communication. In addition, extensive
measurements proved the effectiveness of our solutions, yielding from sixty
percent to over ten times better results than those achieved in the standard cloud.
Investigating Emerging Security Threats in Clouds and Data Centers
Data centers have been growing rapidly in recent years to meet the surging demand of cloud services. However, the expanding scale of a data center also brings new security threats. This dissertation studies emerging security issues in clouds and data centers from different aspects, including low-level cooling infrastructures and different virtualization techniques such as container and virtual machine (VM). We first unveil a new vulnerability called reduced cooling redundancy that might be exploited to launch thermal attacks, resulting in severely worsened thermal conditions in a data center. Such a vulnerability is caused by the wide adoption of aggressive cooling energy saving policies. We conduct thermal measurements and uncover effective thermal attack vectors at the server, rack, and data center levels. We also present damage assessments of thermal attacks. Our results demonstrate that thermal attacks can negatively impact the thermal conditions and reliability of victim servers, significantly raise the cooling cost, and even lead to cooling failures. Finally, we propose effective defenses to mitigate thermal attacks. We then perform a systematic study to understand the security implications of the information leakage in multi-tenancy container cloud services. Due to the incomplete implementation of system resource isolation mechanisms in the Linux kernel, a spectrum of system-wide host information is exposed to the containers, including host-system state information and individual process execution information. By exploiting such leaked host information, malicious adversaries can easily launch advanced attacks that can seriously affect the reliability of cloud services. Additionally, we discuss the root causes of the containers' information leakage and propose a two-stage defense approach. The experimental results show that our defense is effective and incurs trivial performance overhead.
Finally, we investigate security issues in the existing VM live migration approaches, especially the post-copy approach. While the entire live migration process relies upon reliable TCP connectivity for the transfer of the VM state, we demonstrate that the loss of TCP reliability leads to VM live migration failure. By intentionally aborting the TCP connection, attackers can cause unrecoverable memory inconsistency for post-copy, significantly increase service downtime, and degrade the running VM's performance. From the offensive side, we present detailed techniques to reset the migration connection under heavy networking traffic. From the defensive side, we also propose effective protection to secure the live migration procedure
PABO: Mitigating Congestion via Packet Bounce in Data Center Networks
In today's data centers, a diverse mix of throughput-sensitive long flows and
delay-sensitive short flows is commonly present in shallow-buffered
switches. Long flows could potentially block the transmission of
delay-sensitive short flows, leading to degraded performance. Congestion can
also be caused by the synchronization of multiple TCP connections for short
flows, as typically seen in the partition/aggregate traffic pattern. While
multiple end-to-end transport-layer solutions have been proposed, none of them
have tackled the real challenge: reliable transmission in the network. In this
paper, we fill this gap by presenting PABO -- a novel link-layer design that
can mitigate congestion by temporarily bouncing packets to upstream switches.
PABO's design fulfills the following goals: i) providing per-flow based flow
control on the link layer, ii) handling transient congestion without the
intervention of end devices, and iii) gradually back-propagating the congestion
signal to the source when the network is unable to handle the
congestion. Experimental results show that PABO offers a clear advantage in
mitigating transient congestion and achieves a significant improvement in end-to-end
delay
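The bounce-instead-of-drop idea described in this abstract can be pictured with a toy model. The sketch below is a loose illustration under strong assumptions (a fixed-size FIFO per switch, a deterministic bounce when the buffer is full, one upstream neighbour); it is not PABO's actual design, and the `Switch` class and its fields are hypothetical.

```python
from collections import deque


class Switch:
    """Toy switch with a bounded buffer that bounces overflow upstream."""

    def __init__(self, name, capacity, upstream=None):
        self.name = name
        self.capacity = capacity   # buffer size in packets
        self.upstream = upstream   # previous hop toward the source, if any
        self.queue = deque()
        self.bounced_in = 0        # packets returned to this switch by a downstream hop

    def receive(self, packet):
        """Buffer the packet, or bounce it upstream when the buffer is full.

        Returns the name of the switch that ends up holding the packet, or
        None when no hop back to the source has room, which stands for the
        congestion signal finally reaching the sender.
        """
        if len(self.queue) < self.capacity:
            self.queue.append(packet)
            return self.name
        if self.upstream is None:
            return None
        self.upstream.bounced_in += 1
        return self.upstream.receive(packet)


if __name__ == "__main__":
    edge = Switch("edge", capacity=1)
    core = Switch("core", capacity=2, upstream=edge)
    # A burst of four packets hits the core switch.
    print([core.receive(i) for i in range(4)])
```

In this model the first two packets are absorbed at the congested switch, the third is held one hop upstream, and only the fourth pushes the signal back to the source.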
Optimizing the cloud? Don't train models. Build oracles!
We propose cloud oracles, an alternative to machine learning for online
optimization of cloud configurations. Our cloud oracle approach guarantees
complete accuracy and explainability of decisions for problems that can be
formulated as parametric convex optimizations. We give experimental evidence of
this technique's efficacy and share a vision of research directions for
expanding its applicability.
Comment: Initial conference submission limited to 6 page