Survey of Consistent Network Updates
Computer networks have become a critical infrastructure. Designing dependable computer networks, however, is challenging: such networks should not only meet strict requirements in terms of correctness, availability, and performance, but should also be flexible enough to support fast updates, e.g., due to a change in the security policy, an increasing traffic demand, or a failure. The advent of Software-Defined Networks (SDNs) promises to provide such flexibility, allowing networks to be updated in a fine-grained manner and enabling more online traffic engineering. In this paper, we present a structured survey of mechanisms and protocols to update computer networks in a fast and consistent manner. In particular, we identify and discuss the different desirable update consistency properties a network should provide, the algorithmic techniques needed to meet these consistency properties, and their implications for the speed and cost at which updates can be performed. We also discuss the relationship of consistent network update problems to classic algorithmic optimization problems. While our survey is mainly motivated by the advent of SDNs, the fundamental underlying problems are not new, and we also provide a historical perspective on the subject.
Software-Defined Networking: A Comprehensive Survey
Peer reviewed. The Internet has led to the creation of a digital society, where (almost) everything is connected and is accessible from anywhere. However, despite their widespread adoption, traditional IP networks are complex and very hard to manage. It is difficult both to configure the network according to predefined policies and to reconfigure it to respond to faults, load, and changes. To make matters even more difficult, current networks are also vertically integrated: the control and data planes are bundled together. Software-defined networking (SDN) is an emerging paradigm that promises to change this state of affairs, by breaking vertical integration, separating the network's control logic from the underlying routers and switches, promoting (logical) centralization of network control, and introducing the ability to program the network. The separation of concerns, introduced between the definition of network policies, their implementation in switching hardware, and the forwarding of traffic, is key to the desired flexibility: by breaking the network control problem into tractable pieces, SDN makes it easier to create and introduce new abstractions in networking, simplifying network management and facilitating network evolution. In this paper, we present a comprehensive survey on SDN. We start by introducing the motivation for SDN, explain its main concepts and how it differs from traditional networking, its roots, and the standardization activities regarding this novel paradigm. Next, we present the key building blocks of an SDN infrastructure using a bottom-up, layered approach. We provide an in-depth analysis of the hardware infrastructure, southbound and northbound application programming interfaces (APIs), network virtualization layers, network operating systems (SDN controllers), network programming languages, and network applications. We also look at cross-layer problems such as debugging and troubleshooting.
In an effort to anticipate the future evolution of this new paradigm, we discuss the main ongoing research efforts and challenges of SDN. In particular, we address the design of switches and control platforms (with a focus on aspects such as resiliency, scalability, performance, security, and dependability) as well as new opportunities for carrier transport networks and cloud providers. Last but not least, we analyze the position of SDN as a key enabler of a software-defined environment.
Abstractions and optimisations for model-checking software-defined networks
Software-Defined Networking introduces a new programmatic abstraction layer by shifting the distributed network functions (NFs) from silicon chips (ASICs) to a logically centralized (controller) program. And yet, controller programs are a common source of bugs that can cause performance degradation, security exploits and poor reliability in networks. Assuring that a controller program satisfies its specifications is thus highly desirable, yet the size of the network and the complexity of the controller make this a challenging effort.
This thesis presents a highly expressive, optimised SDN model, code-named MoCS, that can be reasoned about and verified formally in an acceptable timeframe. In it, we introduce reusable abstractions that (i) come with a rich semantics, for capturing subtle real-world bugs that are hard to track down, and (ii) are formally proved correct. In addition, MoCS deals with timeouts of flow table entries, thus supporting automatic state refresh (soft state) in the network. The optimisations are achieved by (1) contextually analysing the model for possible partial order reductions in view of the concrete control program, network topology and specification property in question, (2) pre-computing packet equivalence classes and (3) indexing packets and rules that exist in the model and bit-packing (compressing) them.
Each of these developments is demonstrated by a set of real-world controller programs that have been implemented in network topologies of varying size, and publicly released under an open-source license.
Resource Orchestration in Softwarized Networks
Network softwarization is an emerging research area that is envisioned to revolutionize the way network infrastructure is designed, operated, and managed today. Contemporary telecommunication networks are going through a major transformation, and softwarization is recognized as a crucial enabler of this transformation by both academia and industry. Softwarization promises to overcome the current ossified state of Internet network architecture and evolve towards a more open, agile, flexible, and programmable networking paradigm that will reduce both capital and operational expenditures, cut down the time-to-market of new services, and create new revenue streams. Software-Defined Networking (SDN) and Network Function Virtualization (NFV) are two complementary networking technologies that have established themselves as the cornerstones of network softwarization. SDN decouples the control and data planes to provide enhanced programmability and faster innovation of networking technologies. It facilitates simplified network control, scalability, availability, flexibility, security, cost-reduction, autonomic management, and fine-grained control of network traffic. NFV utilizes virtualization technology to reduce dependency on underlying hardware by moving packet processing activities from proprietary hardware middleboxes to virtualized entities that can run on commodity hardware. Together, SDN and NFV simplify network infrastructure by utilizing standardized and commodity hardware for both compute and networking, bringing the agility, economies of scale, and flexibility of data centers to networks.
Network softwarization provides the tools required to re-architect the current network infrastructure of the Internet. However, the effective application of these tools requires efficient utilization of networking resources in the softwarized environment. Innovative techniques and mechanisms are required for all aspects of network management and control. The overarching goal of this thesis is to address several key resource orchestration challenges in softwarized networks. The resource allocation and orchestration techniques presented in this thesis utilize the functionality provided by softwarization to reduce operational cost, improve resource utilization, ensure scalability, dynamically scale resource pools according to demand, and optimize energy utilization.
Efficient Resource Allocation for Throughput Maximization in Next-Generation Networks
Software-Defined Networking (SDN) and Network Function Virtualization (NFV) have emerged as the foundation of the next-generation network architecture by introducing great flexibility and network automation capabilities, including automatic response to faults and load changes and programmatic provision of network resources and connections. It has been envisioned that the SDN- and NFV-based next-generation network architecture will play a critical role in providing network services to users, where the desired network services, including data transfer and policy enforcement, are fulfilled by allocating network resources using virtualization technologies. However, the disparity between ever-growing user demands and scarce network resources makes resource allocation exceptionally central to the performance of a network service, because only by effectively allocating these scarce resources can a network service provider satisfy users and maximize the gain from running the service.
In this thesis, we study efficient resource allocation for network throughput maximization in next-generation networks, while meeting user resource demands and Quality of Service (QoS) requirements, subject to network resource capacities. This however poses great challenges, namely, (1) how to maximize network throughput, considering that both SDN-enabled switches and links are capacitated, (2) how to maximize the network throughput while taking into account network function and QoS requirements of users, (3) how to dynamically scale and readjust resource allocation for user requests, and (4) how to provision a network service that can satisfy user reliability requirements.
To address these challenges, we provide a thorough study of network throughput maximization problems in the context of the next-generation network architecture, by formulating the problems as optimization problems and developing novel optimization frameworks and algorithms for them. Specifically, this thesis makes the following contributions.
Firstly, we consider dynamic user request admissions where user requests arrive one by one and future request arrivals are not known a priori. We develop a novel cost model that accurately captures the usage costs of network resources and propose online algorithms with provable performance guarantees.
Secondly, we study the problem of realizing user requests with network function requirements, with the objective of maximizing network throughput, while meeting user QoS requirements, subject to resource capacity constraints. For this problem, we develop two algorithms that strive for the trade-off between the accuracy/quality of a solution and the running time of obtaining the solution.
Thirdly, we investigate maximization of network throughput by dynamically scaling network resources while minimizing the overall operational cost of a network. We propose a unified framework for two types of resource scaling: vertical scaling and horizontal scaling. Through non-trivial reductions of the problem to several classic problems, we propose an algorithm that has been empirically demonstrated to deliver near-optimal solutions.
Fourthly, we deal with the problem of reliability-aware provisioning of network resources for users, with the aim of maximizing network throughput. We devise an approximation algorithm with a logarithmic approximation ratio for the general case of this problem. We also develop a constant-factor approximation algorithm and an exact algorithm for two special cases of the problem, respectively. The formulated problem is a generalization of several classic optimization problems.
Finally, in addition to extensive theoretical analyses, we also evaluate the performance of the proposed algorithms empirically through experimental simulations based on real and synthetic datasets. Experimental results show that the proposed algorithms significantly outperform existing algorithms.
Software-defined datacenter network debugging
Software-defined Networking (SDN) enables flexible network management, but as networks
evolve to a large number of end-points with diverse network policies, higher
speed, and higher utilization, the abstraction of networks by SDN makes monitoring and
debugging network problems increasingly hard. While some problems
impact packet processing in the data plane (e.g., congestion), others cause policy
deployment failures (e.g., hardware bugs); both create inconsistency between operator
intent and actual network behavior. Existing debugging tools are not sufficient to
accurately detect, localize, and understand the root cause of problems observed in
large-scale networks; they either lack in-network resources (compute, memory, and/or
network bandwidth) or take a long time to debug network problems.
This thesis presents three debugging tools: PathDump, SwitchPointer, and Scout,
and a technique for tracing packet trajectories called CherryPick. We call for a different
approach to network monitoring and debugging: in contrast to implementing
debugging functionality entirely in-network, we should carefully partition the debugging
tasks between end-hosts and network elements. Towards this direction, we present
CherryPick, PathDump, and SwitchPointer. The core of CherryPick is to cherry-pick the
links that are key to representing the end-to-end path of a packet, and to embed the picked
link IDs into its header on its way to its destination.
PathDump is an end-host based network debugger based on tracing packet trajectories,
and exploits resources at the end-hosts to implement various monitoring and
debugging functionalities. PathDump currently runs over a real network consisting
only of commodity hardware, and yet can support a surprisingly large class of network
debugging problems with minimal in-network functionality.
The key contribution of SwitchPointer is to efficiently provide network visibility
to end-host based network debuggers like PathDump by using switch memory as a
"directory service": each switch, rather than storing the telemetry data necessary for
debugging functionalities, stores pointers to the end hosts where relevant telemetry data is
stored. The key design choice of treating switch memory as a directory service makes it
possible to solve performance problems that were hard or infeasible with existing designs.
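The directory-service idea can be illustrated with a minimal Python sketch. It is not taken from the thesis: the class names, the record layout, and the `bytes_per_flow` query are hypothetical and chosen only to show switches holding pointers while end hosts hold the telemetry.

```python
from collections import defaultdict


class EndHost:
    """Stores telemetry for packets it received (hypothetical record layout)."""

    def __init__(self, name):
        self.name = name
        self.records = []  # list of (flow_id, switch_id, byte_count)

    def record(self, flow_id, switch_ids, byte_count):
        # One record per switch the packet traversed.
        for switch_id in switch_ids:
            self.records.append((flow_id, switch_id, byte_count))

    def query(self, switch_id):
        return [r for r in self.records if r[1] == switch_id]


class Switch:
    """Keeps only pointers (destination host names), not the telemetry itself."""

    def __init__(self, switch_id):
        self.switch_id = switch_id
        self.pointers = set()

    def forward(self, dst_host):
        self.pointers.add(dst_host.name)


def send(flow_id, path, dst_host, byte_count):
    """Simulate one packet crossing `path` (a list of Switch objects)."""
    for switch in path:
        switch.forward(dst_host)
    dst_host.record(flow_id, [s.switch_id for s in path], byte_count)


def bytes_per_flow(switch, hosts_by_name):
    """Debugger query: follow the switch's pointers, then ask only those hosts."""
    totals = defaultdict(int)
    for host_name in switch.pointers:
        host = hosts_by_name[host_name]
        for flow_id, _, byte_count in host.query(switch.switch_id):
            totals[flow_id] += byte_count
    return dict(totals)
```

In this sketch the switch state grows with the number of destination hosts rather than with the volume of telemetry, which is the property the directory analogy is meant to convey.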
Finally, we present and solve a network policy fault localization problem that arises
in operating policy management frameworks for a production network. We develop
Scout, a fully automated system that localizes faults in a large-scale policy deployment
and further pinpoints the physical-level failures that are the most likely cause of the
observed faults.
A new approach to detecting failures in distributed systems
Fault-tolerant distributed systems often handle failures in two steps: first, detect the failure and, second, take some recovery action. A common approach to detecting failures is end-to-end timeouts, but using timeouts brings problems. First, timeouts are inaccurate: just because a process is unresponsive does not mean that process has failed. Second, choosing a timeout is hard: short timeouts can exacerbate the problem of inaccuracy, and long timeouts can make the system wait unnecessarily. In fact, a good timeout value, one that balances the choice between accuracy and speed, may not even exist, owing to the variance in a system's end-to-end delays. This dissertation posits a new approach to detecting failures in distributed systems: use information about failures that is local to each component, e.g., the contents of an OS's process table. We call such information inside information, and use it as the basis for the design and implementation of three failure reporting services for data center applications, which we call Falcon, Albatross, and Pigeon. Falcon deploys a network of software modules to gather inside information in the system, and it guarantees that it never reports a working process as crashed by sometimes terminating unresponsive components. This choice helps applications by making reports of failure reliable, meaning that applications can treat them as ground truth. Unfortunately, Falcon cannot handle network failures because guaranteeing that a process has crashed requires network communication; we address this problem in Albatross and Pigeon. Instead of killing, Albatross blocks suspected processes from using the network, allowing applications to make progress during network partitions. Pigeon renounces interference altogether, and reports inside information to applications directly and with more detail to help applications make better recovery decisions.
By using these services, applications can improve their recovery from failures both quantitatively and qualitatively. Quantitatively, these services reduce detection time by one to two orders of magnitude over the end-to-end timeouts commonly used by data center applications, thereby reducing the unavailability caused by failures. Qualitatively, these services provide more specific information about failures, which can reduce the logic required for recovery and can help applications better decide when recovery is not necessary.
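The contrast between end-to-end timeouts and inside information can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the design of Falcon, Albatross, or Pigeon: the function names are hypothetical, and the "process table" is simply a set of live process IDs.

```python
def timeout_detector(response_delay, timeout):
    """End-to-end timeout: suspect a crash if no reply arrives within `timeout`.

    `response_delay` is the reply latency in seconds, or None if no reply ever comes.
    """
    return response_delay is None or response_delay > timeout


def inside_detector(process_table, pid):
    """Inside information: consult local state (a toy process table of live pids)."""
    return pid not in process_table


# Toy scenario (all values are assumptions for illustration).
process_table = {7}   # pid 7 is alive but slow; pid 9 has crashed
delays = {7: 2.5, 9: None}
timeout = 1.0

for pid in (7, 9):
    print(
        pid,
        "timeout suspects crash:", timeout_detector(delays[pid], timeout),
        "| inside info reports crash:", inside_detector(process_table, pid),
    )
```

For the slow but live process, the timeout detector reports a crash while the local check does not, which is the inaccuracy the abstract describes. Raising the timeout would avoid that false report only at the cost of slower detection of the process that really crashed.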