Search CORE

67 research outputs found

Recommended from our members

System Design and Implementation for Hybrid Network Function Virtualization

Author: Zhang Xuzhi
Publication venue: ScholarWorks@UMass Amherst
Publication date: 18/12/2020
Field of study

With the application of virtualization technology in computer networks, many new research areas and techniques have been explored, such as network function virtualization (NFV). A significant benefit of virtualization is that it reduces the cost of a network system and increases its flexibility. Due to the increasing complexity of the network environment and constantly improving network scale and bandwidth, it is imperative to aim for higher performance, extensibility, and flexibility in the future network systems. In this dissertation, hybrid NFV platforms applying virtualization technology are proposed. We further explore the techniques used to improve the performance, scalability and resilience of these systems. In the first part of this dissertation, we describe a new heterogeneous hardware-software NFV platform that provides scalability and programmability while supporting significant hardware-level parallelism and reconfiguration. Our computing platform takes advantage of both field-programmable gate arrays (FPGAs) and microprocessors to implement numerous virtual network functions (VNFs) that can be dynamically customized to specific network flow needs. Traffic management and hardware reconfiguration functions are performed by a global coordinator which allows for the rapid sharing of network function states and continuous evaluation of network function needs. With the help of state sharing mechanism offered by the coordinator, customer-defined VNF instances can be easily migrated between heterogeneous middleboxes as the network environment changes. A resource allocation algorithm dynamically assesses resource deployments as network flows and conditions are updated. In the second part of this thesis document, we explore a new session-level approach for NFV that implements distributed agents in heterogeneous middleboxes to steer packets belonging to different sessions through session-specific service chains. Our session-level approach supports inter-domain service chaining with both FPGA- and processor-based middleboxes, dynamic reconfiguration of service chains for ongoing sessions, and the application of session-level approaches for UDP-based protocols. To demonstrate our approach, we establish inter-domain service chains for QUIC sessions, and reconfigure the service chains across a range of FPGA- and processor-based middleboxes. We show that our session-level approach can successfully reconfigure service chains for individual QUIC sessions. Compared with software implementations, the distributed agents implemented on FPGAs show better performance in various test scenarios

ScholarWorks@UMass Amherst

P4-enabled Smart NIC:Enabling Sliceable and Service-Driven Optical Data Centres

Author: Farhadi Beldachi Arash
Nejabati Reza
Simeonidou Dimitra
Yan Yan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2020
Field of study

Explore Bristol Research

Programmable Smart NIC

Author: Yan Yan
Publication venue
Publication date: 01/06/2024
Field of study

Explore Bristol Research

The future roadmap of in-vehicle network processing: a HW-centric (R-)evolution

Author: Fons Francesc
González Mariño Ángela
Moreno Aróstegui Juan Manuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/06/2022
Field of study

© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.The automotive industry is undergoing a deep revolution. With the race towards autonomous driving, the amount of technologies, sensors and actuators that need to be integrated in the vehicle increases exponentially. This imposes new great challenges in the vehicle electric/electronic (E/E) architecture and, especially, in the In-Vehicle Network (IVN). In this work, we analyze the evolution of IVNs, and focus on the main network processing platform integrated in them: the Gateway (GW). We derive the requirements of Network Processing Platforms that need to be fulfilled by future GW controllers focusing on two perspectives: functional requirements and structural requirements. Functional requirements refer to the functionalities that need to be delivered by these network processing platforms. Structural requirements refer to design aspects which ensure the feasibility, usability and future evolution of the design. By focusing on the Network Processing architecture, we review the available options in the state of the art, both in industry and academia. We evaluate the strengths and weaknesses of each architecture in terms of the coverage provided for the functional and structural requirements. In our analysis, we detect a gap in this area: there is currently no architecture fulfilling all the requirements of future automotive GW controllers. In light of the available network processing architectures and the current technology landscape, we identify Hardware (HW) accelerators and custom processor design as a key differentiation factor which boosts the devices performance. From our perspective, this points to a need - and a research opportunity - to explore network processing architectures with a strong HW focus, unleashing the potential of next-generation network processors and supporting the demanding requirements of future autonomous and connected vehicles.Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

High-performance software packet processing

Author: Fu Qiaobin
Publication venue
Publication date: 30/01/2021
Field of study

In today’s Internet, it is highly desirable to have fast and scalable software packet processing solutions for network applications that run on commodity hardware. The advent of cloud computing drives the continued rapid growth of Internet traffic. Moreover, the development of emerging networking techniques, such as Network Function Virtualization, significantly shapes the need for implementing the network functions in software. Finally, with the advancement of modern platforms as well as software frameworks for packet processing, network applications have potential to process 100+ Gbps network traffic on a single commodity server. Representative frameworks include the Click modular router, the RouteBricks scalable routing architecture, and BUFFALO, the software-based Ethernet switch. Beneath this general-purpose routing and switching functionality lie a broad set of network applications, many of which are handled with custom methods to provide cost-effectiveness and flexibility. This thesis considers two long-standing networking applications, IP lookup and distributed denial-of-service (DDoS) mitigation, and proposes efficient software-based methods drawing from this new perspective. In this thesis, we first introduce several optimization techniques to accelerate network applications by taking advantage of modern CPU features. Then, we explore the IP lookup problem to find the longest matching prefix of an IP address in a set of prefixes. An ideal IP lookup algorithm should achieve small constant IP lookup time, and on-chip memory usage. However, no prior IP lookup algorithm achieves both requirements at the same time. We propose SAIL, a splitting approach to IP lookup, and a suite of algorithms for IP lookup based on SAIL framework. We conducted extensive experiments to evaluate our algorithms, and experimental results show that our SAIL algorithms are much faster than well-known IP lookup algorithms. Next, we switch our focus to DDoS, an attempt to disrupt the legitimate traffic of a victim by sending a flood of Internet traffic from different sources. Our solution is Gatekeeper, the first open-source and deployable DDoS mitigation system. We present a series of optimization techniques, including use of modern platforms, group prefetching, coroutines, and hashing, to accelerate Gatekeeper. Experimental results show that these optimization techniques significantly improve its performance over alternative baseline solutions.2022-01-30T00:00:00

Boston University Institutional Repository (OpenBU)

Framework for Hardware Acceleration of 400Gb Networks

Author: Hummel Václav
Publication venue: Vysoké učení technické v Brně. Fakulta informačních technologií
Publication date: 01/01/2017
Field of study

Platforma NetCOPE již prokázala svou životaschopnost jako framework pro rychlý vývoj hardwarově akcelerovaných síťových aplikací. Tyto aplikace využívají virtualizované síťové funkce (NFV). Aby platforma v nejbližších letech nezastarala, tak se musí přizpůsobit požadavkům souvisejících s příchodem 400 Gigabitového ethernetu. Příchod 400 Gigabitového ethernetu sebou přináší velké množství výzev, které vyžadují nutnost kompletně změnit dosavadní myšlení. Během jediného hodinového cyklu musí být zpracováno několik síťových paketů, což vyžaduje nový koncept zpracování. Je použita pokročilá správa paměti, aby byla dosažena konstantní paměťová složitost vzhledem k počtu DMA kanálů. Díky tomu je možné realizovat se současnou technologií i více než 256 kompletně nezávislých DMA kanálů. Mnoho úsilí bylo kladeno na vytvoření co nejobecnějšího frameworku i pro vyšší rychlosti přesahující 400 Gb/s. Práce se zaměřuje na komunikaci mezi frameworkem a hostitelským počítačem skrze PCI Express rozhraní. Byla vzata do úvahy i varianta s více síťovými rozhraními. Navržený systém je připraven na nasazení na kartách rodiny COMBO, které jsou použity jako referenční platforma.The NetCOPE framework has proven itself as a viable framework for rapid development of hardware accelerated wire-speed network applications using Network Functions Virtualization (NFV). To meet the current and future requirements of such applications the NetCOPE platform has to catch up with upcoming 400 Gigabit Ethernet. Otherwise, it may become deprecated in following years. Catching up with 400 Gigabit Ethernet brings many challenges bringing necessity of completely different way of thinking. Multiple network packets have to be processed each clock cycle requiring a new concept of processing. Advanced memory management is used to ensure constant memory complexity with respect to the number of DMA channels without any impact on performance. Thanks to that, even more than 256 completely independent DMA channels are feasible with current technology. A lot of effort was made to create the framework as generic as possible allowing deployment of 400 Gigabit Ethernet and beyond. Emphasis is put on communication between the framework and host computer via PCI Express technology. Multiple Ethernet ports are also considered. The proposed system is prepared to be deployed on the family of COMBO cards, used as a reference platform.

Digital library of Brno University of Technology

National Repository of Grey Literature

A Quantitative Analysis and Guideline of Data Streaming Accelerator in Intel 4th Gen Xeon Scalable Processors

Author: Hu Jiayu
Jeong Ipoom
Kim Nam Sung
Kuper Reese
Ranganathan Narayan
Wang Ren
Yuan Yifan
Publication venue
Publication date: 31/05/2023
Field of study

As semiconductor power density is no longer constant with the technology process scaling down, modern CPUs are integrating capable data accelerators on chip, aiming to improve performance and efficiency for a wide range of applications and usages. One such accelerator is the Intel Data Streaming Accelerator (DSA) introduced in Intel 4th Generation Xeon Scalable CPUs (Sapphire Rapids). DSA targets data movement operations in memory that are common sources of overhead in datacenter workloads and infrastructure. In addition, it becomes much more versatile by supporting a wider range of operations on streaming data, such as CRC32 calculations, delta record creation/merging, and data integrity field (DIF) operations. This paper sets out to introduce the latest features supported by DSA, deep-dive into its versatility, and analyze its throughput benefits through a comprehensive evaluation. Along with the analysis of its characteristics, and the rich software ecosystem of DSA, we summarize several insights and guidelines for the programmer to make the most out of DSA, and use an in-depth case study of DPDK Vhost to demonstrate how these guidelines benefit a real application

arXiv.org e-Print Archive

Consensus protocols exploiting network programmability

Author: Dang Huynh Tu
Pedone Fernando
Soulé Robert
Publication venue
Publication date: 22/03/2019
Field of study

Services rely on replication mechanisms to be available at all time. The service demanding high availability is replicated on a set of machines called replicas. To maintain the consistency of replicas, a consensus protocol such as Paxos or Raft is used to synchronize the replicas' state. As a result, failures of a minority of replicas will not affect the service as other non-faulty replicas continue serving requests. A consensus protocol is a procedure to achieve an agreement among processors in a distributed system involving unreliable processors. Unfortunately, achieving such an agreement involves extra processing on every request, imposing a substantial performance degradation. Consequently, performance has long been a concern for consensus protocols. Although many efforts have been made to improve consensus performance, it continues to be an important problem for researchers. This dissertation presents a novel approach to improving consensus performance. Essentially, it exploits the programmability of a new breed of network devices to accelerate consensus protocols that traditionally run on commodity servers. The benefits of using programmable network devices to run consensus protocols are twofold: The network switches process packets faster than commodity servers and consensus messages travel fewer hops in the network. It means that the system throughput is increased and the latency of requests is reduced. The evaluation of our network-accelerated consensus approach shows promising results. Individual components of our FPGA- based and switch-based consensus implementations can process 10 million and 2.5 billion consensus messages per second, respectively. Our FPGA-based system as a whole delivers 4.3 times performance of a traditional software consensus implementation. The latency is also better for our system and is only one third of the latency of the software consensus implementation when both systems are under half of their maximum throughputs. In order to drive even higher performance, we apply a partition mechanism to our switch-based system, leading to 11 times better throughput and 5 times better latency. By dynamically switching between software-based and network-based implementations, our consensus systems not only improve performance but also use energy more efficiently. Encouraged by those benefits, we developed a fault-tolerant non-volatile memory system. A prototype using software memory controller demonstrated reasonable overhead over local memory access, showing great promise as scalable main memory. Our network-based consensus approach would have a great impact in data centers. It not only improves performance of replication mechanisms which relied on consensus, but also enhances performance of services built on top of those replication mechanisms. Our approach also motivates others to move new functionalities into the network, such as, key-value store and stream processing. We expect that in the near future, applications that typically run on traditional servers will be folded into networks for performance

RERO DOC Digital Library