Search CORE

344 research outputs found

LazyCtrl: Scalable Network Control for Cloud Data Centers

Author: IEEE
Sun Y
Uhlig S
Wang L
Yang B
Zhang Y
Zheng K
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/07/2017
Field of study

Crossref

Queen Mary Research Online

LazyCtrl: A Scalable Hybrid Network Control Plane Design for Cloud Data Centers

Author: Sun Y
Uhlig S
Wang L
Yang B
Zheng K
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Crossref

Queen Mary Research Online

Multi-controller Based Software-Defined Networking: A Survey

Author: Baker T
Guo Z
Hu T
Lan J
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Software-Defined Networking (SDN) is a novel network paradigm that enables flexible management for networks. As the network size increases, the single centralized controller cannot meet the increasing demand for flow processing. Thus, the promising solution for SDN with large-scale networks is the multi-controller. In this paper, we present a compressive survey for multi-controller research in SDN. First, we introduce the overview of multi-controller, including the origin of multi-controller and its challenges. Then, we classify multi-controller research into four aspects (scalability, consistency, reliability, load balancing) depending on the process of implementing the multi-controller. Finally, we propose some relevant research issues to deal with in the future and conclude the multi-controller research

LJMU Research Online (Liverpool John Moores University)

Crossref

Adaptive and Scalable High Availability for Infrastructure Clouds

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

Crossref

Improving Cloud Middlebox Infrastructure for Online Services

Author: Gandhi Rohan S
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2016
Field of study

Middleboxes are an indispensable part of the datacenter networks that provide high availability, scalability and performance to the online services. Using load balancer as an example, this thesis shows that the prevalent scale-out middlebox designs using commodity servers are plagued with three fundamental problems: (1) The server-based layer-4 middleboxes are costly and inflate round-trip-time as much as 2x by processing the packets in software. (2) The middlebox instances cause traffic detouring en route from sources to destinations, which inflates network bandwidth usage by as much as 3.2x and can cause transient congestion. (3) Additionally, existing cloud providers do not support layer-7 middleboxes as a service, and third-party proxy-based layer-7 middlebox design exhibits poor availability as TCP state stored locally on middlebox instances are lost upon instance failure. This thesis examines the root causes of the above problems and proposes new cloud-scale middlebox design principles that systemically address all three problems. First, to address the performance problem, we make a key observation that existing commodity switches have resources available to implement key layer-4 middlebox functionalities such as load balancer, and by processing packets in hardware, switches offer low latency and high capacity benefits, at no additional cost as the switch resources are idle. Motivated by this observation, we propose the design principle of using idle switch resources to accelerate middlebox functionailites. To demonstrate the principle, we developed the complete L4 load balancer design that uses commodity switches for low cost and high performance, and carefully fuses a few software load balancer instances to provide for high availability. Second, to address the high network overhead problem from traffic detouring through middlebox instances, we propose to exploit the principles of locality and flexibility in placing the middlebox instances and servers to handle the traffic closer to the sources and reduce the overall traffic and link utilization in the network. Third, to provide high availability in a layer 7 middleboxes, we propose a novel middlebox design principle of decoupling the TCP state from middlebox instances and storing it in persistent key-value store so that any middlebox instance can seamlessly take over any TCP connection when middlebox instances fail. We demonstrate the effectiveness of the above cloud-scale middlebox design principles using load balancers as an example. Specifically, we have prototyped the three design principles in three cloud-scale load balancers: Duet, Rubik, and Yoda, respectively. Our evaluation using a datacenter testbed and large scale simulations show that Duet lowers the costs by 12x and latency overhead by 1000x, Rubik further lowers the datacenter network traffic overhead by 3x, and Yoda L7 Load balancer-as-a-service is practical; decoupling TCP state from load balancer instances has a negligible

Purdue E-Pubs

State-preserving container orchestration in failover scenarios

Author: Schmidt Henri
Publication venue
Publication date: 01/02/2023
Field of study

Containers have been widely adopted for deployment of high availability applications and services. This adoption is in part due to the native support of fault tolerance mechanisms in container orchestration frameworks such as Kubernetes. While Kubernetes provides service replication as a fault tolerance mechanism for stateless applications, service replication does not satisfy requirements for stateful applications. Currently this shortcoming is addressed by data replication in databases. This requires a tight coupling and modification of the stateful application to support high availability. Thus, this thesis proposes a new Checkpoint/Restore (C/R) Kubernetes operator to achieve fault tolerance for stateful applications without any modification of the application. The operator takes a checkpoint in a configurable interval. In case of a fault a new application container is created automatically from the most recent checkpoint. We compare the proposed approach with a more conventional approach in which we pull and restore the application state from the application through an API. We measure the overhead of both methods, the service interruption and the recovery time in case of faults. We find the C/R Operator has similar performance in recovery time as the traditional approach, but does not need any application modification. The results signify C/R as a promising technology for a fault tolerance mechanism for stateful applications

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung