136 research outputs found

    Lightweight Implementation of Per-packet Service Protection in eBPF/XDP

    Deterministic communication means reliable packet forwarding with close to zero packet loss and bounded latency. Packet loss or delay above a threshold caused by, e.g., equipment failure or malfunction could be catastrophic for applications that require deterministic communication. To meet loss-related targets, per-packet service protection has been introduced by deterministic communications standards; it is provided by Frame Replication and Elimination for Reliability (FRER) for Layer 2 Ethernet networks and by Packet Replication, Elimination, and Ordering Functions (PREOF) for Layer 3 IP/MPLS networks. We have implemented FRER with two conceptually different methods: (1) in eBPF/XDP as a lightweight software implementation; and (2) in userspace. We evaluate our XDP FRER via an experimental analysis and compare the two FRER implementations.
    Comment: Paper submission for the talk with the same title at the netdev 0x17 conference: https://netdevconf.info/0x17/sessions/talk/lightweight-implementation-of-per-packet-service-protection-in-ebpfxdp.htm
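The elimination half of per-packet service protection can be illustrated with a small duplicate filter. This is only a sketch under stated assumptions, not the authors' eBPF/XDP or userspace code: it assumes each replicated packet carries an integer sequence number, ignores sequence-number wraparound, and uses a bounded history window loosely in the spirit of the recovery functions in FRER. The class name `DuplicateEliminator` and the parameter `history_len` are hypothetical.

```python
class DuplicateEliminator:
    """Forward the first copy of each sequence number, drop later replicas.

    Simplified sketch: no wraparound handling, no timeout/reset logic.
    """

    def __init__(self, history_len=8):
        self.history_len = history_len  # how far behind the newest number we still track
        self.highest = None             # highest sequence number accepted so far
        self.seen = set()               # accepted sequence numbers inside the window

    def accept(self, seq):
        """Return True if the packet should be forwarded, False if dropped."""
        if self.highest is None or seq > self.highest:
            # New highest number: accept and slide the history window forward.
            self.highest = seq
            self.seen.add(seq)
            floor = self.highest - self.history_len
            self.seen = {s for s in self.seen if s > floor}
            return True
        if seq <= self.highest - self.history_len:
            return False  # older than the window: cannot be judged, so dropped
        if seq in self.seen:
            return False  # replica of a packet that was already forwarded
        self.seen.add(seq)  # late but first copy (out-of-order arrival)
        return True


# Two redundant paths deliver interleaved copies, some out of order.
eliminator = DuplicateEliminator(history_len=4)
arrivals = [0, 0, 1, 2, 1, 3, 2, 5, 4, 4, 0]
forwarded = [s for s in arrivals if eliminator.accept(s)]
```

With these arrivals, each sequence number is forwarded at most once, and the final `0` is dropped because it lies outside the history window.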

    Orchestrating Edge Computing Services with Efficient Data Planes

    The abstract is in the attachment.

    Design Automation of Complex Hydromechanical Transmissions


    UPIR: Toward the Design of Unified Parallel Intermediate Representation for Parallel Programming Models

    The complexity of heterogeneous computing architectures and the demand for productive and portable parallel application development have driven parallel programming models to become more comprehensive and complex than before. Making conventional compilation technologies and software infrastructure parallelism-aware has become one of the main goals of recent compiler development. In this paper, we propose the design of a unified parallel intermediate representation (UPIR) for multiple parallel programming models and for enabling unified compiler transformation for the models. UPIR specifies three commonly used parallelism patterns (SPMD, data and task parallelism), data attributes and explicit data movement and memory management, and synchronization operations used in parallel programming. We demonstrate UPIR via a prototype implementation in the ROSE compiler for unifying IR for both OpenMP and OpenACC and in both C/C++ and Fortran, for unifying the transformation that lowers both OpenMP and OpenACC code to LLVM runtime, and for exporting UPIR to LLVM MLIR dialect.
    Comment: Typos corrected. Format update

    A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research

    With traditional networking, users can configure control plane protocols to match the specific network configuration, but without the ability to fundamentally change the underlying algorithms. With SDN, users may provide their own control plane, which can control network devices through their data plane APIs. Programmable data planes allow users to define their own data plane algorithms for network devices, including appropriate data plane APIs which may be leveraged by user-defined SDN control. Thus, programmable data planes and SDN offer great flexibility for network customization, be it for specialized, commercial appliances, e.g., in 5G or data center networks, or for rapid prototyping in industrial and academic research. Programming protocol-independent packet processors (P4) has emerged as the currently most widespread abstraction, programming language, and concept for data plane programming. It is developed and standardized by an open community and it is supported by various software and hardware platforms. In this paper, we survey the literature from 2015 to 2020 on data plane programming with P4. Our survey covers 497 references of which 367 are scientific publications. We organize our work into two parts. In the first part, we give an overview of data plane programming models, the programming language, architectures, compilers, targets, and data plane APIs. We also consider research efforts to advance P4 technology. In the second part, we analyze a large body of literature considering P4-based applied research. We categorize 241 research papers into different application domains, summarize their contributions, and extract prototypes, target platforms, and source code availability.
    Comment: Submitted to IEEE Communications Surveys and Tutorials (COMS) on 2021-01-2

    Caladan: a distributed meta-OS for data center disaggregation

    Data center resource disaggregation promises cost savings by pooling compute, storage and memory resources into separate, networked nodes. The benefits of this model are clear, but a closer look shows that its full performance and efficiency potential cannot be easily realized. Existing systems use CPUs pervasively to interface arbitrary devices with the network and to orchestrate communication among them, reducing the benefits of disaggregation. In this paper we present Caladan, a novel system with a trusted universal resource fabric that interconnects all resources and efficiently offloads the system and application control planes to SmartNICs, freeing server CPUs to execute application logic. Caladan offers three core services: capability-driven distributed name space, virtual devices, and direct inter-device communications. These services are implemented in a trusted meta-kernel that executes in per-node SmartNICs. Low-level device drivers running on the commodity host OS are used for setting up accelerators and I/O devices, and exposing them to Caladan. Applications run in a distributed fashion across CPUs and multiple accelerators, which in turn can directly perform I/O, i.e., access files, other accelerators or host services. Our distributed dataflow runtime runs on top of this substrate. It orchestrates the distributed execution, connecting disaggregated resources using data transfers and inter-device communication, while eliminating the performance bottlenecks of the traditional CPU-centric design.

    Generating Permutations with Restricted Containers

    We investigate a generalization of stacks that we call $\mathcal{C}$-machines. We show how this viewpoint rapidly leads to functional equations for the classes of permutations that $\mathcal{C}$-machines generate, and how these systems of functional equations can frequently be solved by either the kernel method or, much more easily, by guessing and checking. General results about the rationality, algebraicity, and the existence of Wilfian formulas for some classes generated by $\mathcal{C}$-machines are given. We also draw attention to some relatively small permutation classes which, although we can generate thousands of terms of their enumerations, seem to not have D-finite generating functions.
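The simplest container of this kind is an ordinary stack, and it gives a feel for what "generating" a permutation means. The brute-force sketch below is not taken from the paper and does not model general C-machines: it assumes the entries 1..n are pushed in increasing order and may be popped to the output at any time, and it counts which permutations can be produced. The function names are hypothetical. The counts it yields are the Catalan numbers, a classical result for a single stack.

```python
from itertools import permutations


def stack_generable(perm):
    """Return True if pushing 1..n in order and popping at will can output `perm`."""
    n = len(perm)
    stack = []
    next_in = 1  # next input entry that has not been pushed yet
    for target in perm:
        # Push inputs until the wanted entry sits on top (or input runs out).
        while (not stack or stack[-1] != target) and next_in <= n:
            stack.append(next_in)
            next_in += 1
        if not stack or stack[-1] != target:
            return False  # wanted entry is buried under another entry
        stack.pop()
    return True


def count_generable(n):
    """Count the permutations of length n that a single stack can generate."""
    return sum(stack_generable(p) for p in permutations(range(1, n + 1)))


counts = [count_generable(n) for n in range(1, 7)]
```

Enumerating the first few terms this way is the kind of data one would feed into a guess-and-check approach to the generating function; exhaustive enumeration is only practical for small n.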

    Kernel- vs. User-Level Networking: A Ballad of Interrupts and How to Mitigate Them

    Networking performance has become especially important in the current age with growing demands on services over the Internet. Recent advances in network controllers have exposed bottlenecks in various parts of network processing. User-level networking, which bypasses the operating system's network stack and replaces it with one re-implemented in userspace, is often framed as a silver bullet to mitigate any performance issues arising in the kernel network stack. However, there is often no comprehensive study of where this performance increase ultimately comes from. This work aims to explore potential areas from which improvements in overall performance can arise. Most importantly, it identifies asynchronous interrupts and their handling as a major source of overhead associated with the kernel network stack. Several proposals are presented with the goal of reducing the need for interrupts in the kernel network stack, simulating the execution model of user-level networking. It is shown that a small kernel modification of around 30 lines of code results in a substantial performance increase without the need to replace the kernel network stack in its entirety.