320 research outputs found

    Runtime Verification of P4 Switches with Reinforcement Learning

    Get PDF
    © ACM 2019. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 2019 Workshop on Network Meets AI & ML - NetAI’19, http://dx.doi.org/10.1145/3341216.3342206.We present the design and early implementation of p4rl, a system that uses reinforcement learning-guided fuzz testing to execute the verification of P4 switches automatically at runtime. p4rl system uses our novel user-friendly query language, p4q to conveniently specify the intended properties in simple conditional statements (if-else) and check the actual runtime behavior of the P4 switch against such properties. In p4rl, user-specified p4q queries with the control plane configuration, Agent, and the Reward System guide the fuzzing process to trigger runtime bugs automatically during Agent training. To illustrate the strength of p4rl, we developed and evaluated an early prototype of p4rl system that executes runtime verification of a P4 network device, e.g., L3 (Layer-3) switch. Our initial results are promising and show that p4rl automatically detects diverse bugs while outperforming the baseline approach.EC/H2020/679158/EU/Resolving the Tussle in the Internet: Mapping, Architecture, and Policy Making/ResolutioNetBMBF, 01IS17052, Adaptiver, Virtueller Assistent zur LAwinenwarnung Nach CHarakter Eigenschaften (AVALANCHE

    Towards Runtime Verification of Programmable Switches

    No full text
    Is it possible to patch software bugs in P4 programs without human involvement? We show that this is partially possible in many cases due to advances in software testing and the structure of P4 programs. Our insight is that runtime verification can detect bugs, even those that are not detected at compile-time, with machine learning-guided fuzzing. This enables a more automated and real-time localization of bugs in P4 programs using software testing techniques like Tarantula. Once the bug in a P4 program is localized, the faulty code can be patched due to the programmable nature of P4. In addition, platform-dependent bugs can be detected. From P4_14 to P4_16 (latest version), our observation is that as the programmable blocks increase, the patchability of P4 programs increases accordingly. To this end, we design, develop, and evaluate P6 that (a) detects, (b) localizes, and (c) patches bugs in P4 programs with minimal human interaction. P6 tests P4 switch non-intrusively, i.e., requires no modification to the P4 program for detecting and localizing bugs. We used a P6 prototype to detect and patch seven existing bugs in eight publicly available P4 application programs deployed on two different switch platforms: behavioral model (bmv2) and Tofino. Our evaluation shows that P6 significantly outperforms bug detection baselines while generating fewer packets and patches bugs in P4 programs such as switch.p4 without triggering any regressions

    A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research

    Full text link
    With traditional networking, users can configure control plane protocols to match the specific network configuration, but without the ability to fundamentally change the underlying algorithms. With SDN, the users may provide their own control plane, that can control network devices through their data plane APIs. Programmable data planes allow users to define their own data plane algorithms for network devices including appropriate data plane APIs which may be leveraged by user-defined SDN control. Thus, programmable data planes and SDN offer great flexibility for network customization, be it for specialized, commercial appliances, e.g., in 5G or data center networks, or for rapid prototyping in industrial and academic research. Programming protocol-independent packet processors (P4) has emerged as the currently most widespread abstraction, programming language, and concept for data plane programming. It is developed and standardized by an open community and it is supported by various software and hardware platforms. In this paper, we survey the literature from 2015 to 2020 on data plane programming with P4. Our survey covers 497 references of which 367 are scientific publications. We organize our work into two parts. In the first part, we give an overview of data plane programming models, the programming language, architectures, compilers, targets, and data plane APIs. We also consider research efforts to advance P4 technology. In the second part, we analyze a large body of literature considering P4-based applied research. We categorize 241 research papers into different application domains, summarize their contributions, and extract prototypes, target platforms, and source code availability.Comment: Submitted to IEEE Communications Surveys and Tutorials (COMS) on 2021-01-2

    TEL: Low-Latency Failover Traffic Engineering in Data Plane

    Full text link
    Modern network applications demand low-latency traffic engineering in the presence of network failure while preserving the quality of service constraints like delay and capacity. Fast Re-Route (FRR) mechanisms are widely used for traffic re-routing purposes in failure scenarios. Control plane FRR typically computes the backup forwarding rules to detour the traffic in the data plane when the failure occurs. This mechanism could be computed in the data plane with the emergence of programmable data planes. In this paper, we propose a system (called TEL) that contains two FRR mechanisms, namely, TEL-C and TEL-D. The first one computes backup forwarding rules in the control plane, satisfying max-min fair allocation. The second mechanism provides FRR in the data plane. Both algorithms require minimal memory on programmable data planes and are well-suited with modern line rate match-action forwarding architectures (e.g., PISA). We implement both mechanisms on P4 programmable software switches (e.g., BMv2 and Tofino) and measure their performance on various topologies. The obtained results from a datacenter topology show that our FRR mechanism can improve the flow completion time up to 4.6x-7.3x (i.e., small flows) and 3.1x-12x (i.e., large flows) compared to recirculation-based mechanisms, such as F10, respectively

    VERMONT : an In-band telemetry-based approach for live network property verification

    Get PDF
    The verification of network properties is often an exhaustive and time-consuming effort. The number of configurations needed to be analyzed by static verification increases as the networks grow larger, and the processing time consumed becomes prohibitive. Equally important, existing approaches fall short of detecting violations in dynamic environments. While the field of static verification has received significant attention in the last few years, few research efforts have been made to verify networks in production time. Capitalizing on the emergence of programmable data planes, in this thesis, we propose VERMONT, an In-Band Network Telemetry verification approach that continuously verifies network properties as the state of the networks changes. The key contribution of our work is an in-network system capable of continuously collecting the metadata from the network to verify properties in real-time. By efficiently retrieving only the necessary information from the network, VERMONT can accurately and quickly reason whether a set of proper ties is being held or not at a given time within the network. We implemented VERMONT, evaluated its performance using realistic settings, and compared it with a state-of-the-art approach. The results show that the proposed solution is technically feasible and performs at least one order of magnitude faster than a static verification counterpart. We also pro vide evidence that VERMONT incurs a very low resource usage footprint considering its application in several real-world networks.A verificação de propriedades de rede geralmente representa um esforço exaustivo e de morado. O número de configurações que precisam ser analisadas pela verificação estática aumenta à medida que as redes crescem, e o tempo de processamento consumido torna se proibitivo. Igualmente importante, as abordagens existentes não conseguem detectar violações em ambientes dinâmicos. Embora o campo de verificação estática tenha rece bido atenção significativa nos últimos anos, poucos esforços de pesquisa foram feitos para verificar redes em tempo de “execução”. Aproveitando o surgimento de planos de dados programáveis, nesta dissertação propomos VERMONT, uma abordagem de verificação ba seada em telemetria de rede in-band que verifica continuamente propriedades à medida que o estado da rede muda. A principal contribuição do trabalho é um sistema capaz de coletar, continuamente, metadados da rede para verificar as propriedades em tempo real. Ao recuperar com eficiência apenas as informações necessárias, VERMONT pode “racio cinar” com precisão e rapidez se um conjunto de propriedades está sendo satisfeito ou não em um determinado momento. Implementamos VERMONT, avaliamos seu desempenho usando configurações realistas e a comparamos com uma abordagem de última geração. Os resultados mostram que a solução proposta é tecnicamente viável e executa pelo me nos uma ordem de grandeza mais rápido do que uma contraparte de verificação estática. Também fornecemos evidências de que VERMONT incorre em uma utilização de recursos muito baixa, considerando sua aplicação em várias redes do mundo real

    Component-based error detection of P4 programs

    Get PDF
    P4 is a domain-specific language to develop the packet processing of network devices. These programs can easily hide errors, therefore we give a solution to analyze them and detect predefined errors in them. This paper shows the idea, which works with the P4 code as a set of components and processes them one by one, while calculating their pre- and postconditions. This method does not only detect errors between the components and their connections, but it is capable to reveal errors, which are hidden in the middle of a component. The paper introduces the method and shows its calculation in an example

    NFV Platforms: Taxonomy, Design Choices and Future Challenges

    Get PDF
    Due to the intrinsically inefficient service provisioning in traditional networks, Network Function Virtualization (NFV) keeps gaining attention from both industry and academia. By replacing the purpose-built, expensive, proprietary network equipment with software network functions consolidated on commodity hardware, NFV envisions a shift towards a more agile and open service provisioning paradigm. During the last few years, a large number of NFV platforms have been implemented in production environments that typically face critical challenges, including the development, deployment, and management of Virtual Network Functions (VNFs). Nonetheless, just like any complex system, such platforms commonly consist of abounding software and hardware components and usually incorporate disparate design choices based on distinct motivations or use cases. This broad collection of convoluted alternatives makes it extremely arduous for network operators to make proper choices. Although numerous efforts have been devoted to investigating different aspects of NFV, none of them specifically focused on NFV platforms or attempted to explore their design space. In this paper, we present a comprehensive survey on the NFV platform design. Our study solely targets existing NFV platform implementations. We begin with a top-down architectural view of the standard reference NFV platform and present our taxonomy of existing NFV platforms based on what features they provide in terms of a typical network function life cycle. Then we thoroughly explore the design space and elaborate on the implementation choices each platform opts for. We also envision future challenges for NFV platform design in the incoming 5G era. We believe that our study gives a detailed guideline for network operators or service providers to choose the most appropriate NFV platform based on their respective requirements. Our work also provides guidelines for implementing new NFV platforms

    Online learning on the programmable dataplane

    Get PDF
    This thesis makes the case for managing computer networks with datadriven methods automated statistical inference and control based on measurement data and runtime observations—and argues for their tight integration with programmable dataplane hardware to make management decisions faster and from more precise data. Optimisation, defence, and measurement of networked infrastructure are each challenging tasks in their own right, which are currently dominated by the use of hand-crafted heuristic methods. These become harder to reason about and deploy as networks scale in rates and number of forwarding elements, but their design requires expert knowledge and care around unexpected protocol interactions. This makes tailored, per-deployment or -workload solutions infeasible to develop. Recent advances in machine learning offer capable function approximation and closed-loop control which suit many of these tasks. New, programmable dataplane hardware enables more agility in the network— runtime reprogrammability, precise traffic measurement, and low latency on-path processing. The synthesis of these two developments allows complex decisions to be made on previously unusable state, and made quicker by offloading inference to the network. To justify this argument, I advance the state of the art in data-driven defence of networks, novel dataplane-friendly online reinforcement learning algorithms, and in-network data reduction to allow classification of switchscale data. Each requires co-design aware of the network, and of the failure modes of systems and carried traffic. To make online learning possible in the dataplane, I use fixed-point arithmetic and modify classical (non-neural) approaches to take advantage of the SmartNIC compute model and make use of rich device local state. I show that data-driven solutions still require great care to correctly design, but with the right domain expertise they can improve on pathological cases in DDoS defence, such as protecting legitimate UDP traffic. In-network aggregation to histograms is shown to enable accurate classification from fine temporal effects, and allows hosts to scale such classification to far larger flow counts and traffic volume. Moving reinforcement learning to the dataplane is shown to offer substantial benefits to stateaction latency and online learning throughput versus host machines; allowing policies to react faster to fine-grained network events. The dataplane environment is key in making reactive online learning feasible—to port further algorithms and learnt functions, I collate and analyse the strengths of current and future hardware designs, as well as individual algorithms

    P4Testgen: An Extensible Test Oracle For P4

    Full text link
    We present P4Testgen, a test oracle for the P4-16 language that supports automatic generation of packet tests for any P4-programmable device. Given a P4 program and sufficient time, P4Testgen generates tests that cover every reachable statement in the input program. Each generated test consists of an input packet, control-plane configuration, and output packet(s), and can be executed in software or on hardware. Unlike prior work, P4Testgen is open source and extensible, making it a general resource for the community. P4Testgen not only covers the full P4-16 language specification, it also supports modeling the semantics of an entire packet-processing pipeline, including target-specific behaviors-i.e., whole-program semantics. Handling aspects of packet processing that lie outside of the official specification is critical for supporting real-world targets (e.g., switches, NICs, end host stacks). In addition, P4Testgen uses taint tracking and concolic execution to model complex externs (e.g., checksums and hash functions) that have been omitted by other tools, and ensures the generated tests are correct and deterministic. We have instantiated P4Testgen to build test oracles for the V1model, eBPF, and the Tofino (TNA and T2NA) architectures; each of these extensions only required effort commensurate with the complexity of the target. We validated the tests generated by P4Testgen by running them across the entire P4C program test suite as well as the Tofino programs supplied with Intel's P4 Studio. In just a few months using the tool, we discovered and confirmed 25 bugs in the mature, production toolchains for BMv2 and Tofino, and are conducting ongoing investigations into further faults uncovered by P4Testgen

    Advancing SDN from OpenFlow to P4: a survey

    Get PDF
    Software-defined Networking (SDN) marked the beginning of a new era in the field of networking by decoupling the control and forwarding processes through the OpenFlow protocol. The Next Generation SDN is defined by Open Interfaces and full programmability of the data plane. P4 is a domain-specific language that fulfills these requirements and has known wide adoption over recent years from Academia and Industry. This work is an extensive survey of the P4 language covering domains of application, a detailed overview of the language, and future directions
    corecore