
    Fast Packet Processing on High Performance Architectures

    The rapid growth of the Internet and the fast emergence of new network applications have brought great challenges and complex issues in deploying high-speed, QoS-guaranteed IP networks. For this reason, packet classification and network intrusion detection have assumed a key role in modern communication networks in order to provide QoS and security. In this thesis we describe a number of the most advanced solutions to these tasks. We introduce NetFPGA and Network Processors as reference platforms for both the design and the implementation of the solutions and algorithms described in this thesis. The rise in link capacity reduces the time available to network devices for packet processing. For this reason, we show different solutions which, either by heuristics and randomization or by smart construction of state machines, allow IP lookup, packet classification and deep packet inspection to be performed quickly in real devices based on high-speed platforms such as NetFPGA or Network Processors.

    A Multi-Layer Key-Value Cache Architecture Using In-NIC and In-Kernel Caches


    Data Structures and Algorithms for Scalable NDN Forwarding

    Named Data Networking (NDN) is a recently proposed general-purpose network architecture that aims to address the limitations of the Internet Protocol (IP), while maintaining its strengths. NDN takes an information-centric approach, focusing on named data rather than computer addresses. In NDN, the content is identified by its name, and each NDN packet has a name that specifies the content it is fetching or delivering. Since there are no source and destination addresses in an NDN packet, it is forwarded based on a lookup of its name in the forwarding plane, which consists of the Forwarding Information Base (FIB), Pending Interest Table (PIT), and Content Store (CS). In addition, as an in-network caching element, a scalable Repository (Repo) design is needed to provide large-scale long-term content storage in NDN networks. Scalable NDN forwarding is a challenge. Compared to the well-understood approaches to IP forwarding, NDN forwarding performs lookups on packet names, which have variable and unbounded lengths, increasing the lookup complexity. The lookup tables are larger than in IP, requiring more memory space. Moreover, NDN forwarding has a read-write data plane, requiring per-packet updates at line rates. Designing and evaluating a scalable NDN forwarding node architecture is a major effort within the overall NDN research agenda. The goal of this dissertation is to demonstrate that scalable NDN forwarding is feasible with the proposed data structures and algorithms. First, we propose a FIB lookup design based on the binary search of hash tables that provides a reliable longest name prefix lookup performance baseline for future NDN research. We have demonstrated 10 Gbps forwarding throughput with 256-byte packets and one billion synthetic forwarding rules, each containing up to seven name components. Second, we explore data structures and algorithms to optimize the FIB design based on the specific characteristics of real-world forwarding datasets. Third, we propose a fingerprint-only PIT design that reduces the memory requirements in the core routers. Lastly, we discuss the Content Store design issues and demonstrate that the NDN Repo implementation can leverage many of the existing databases and storage systems to improve performance.
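
    To make the first contribution concrete, the following is a minimal software sketch of longest name prefix lookup by binary search of hash tables, in the spirit of the FIB design described above; the marker/fallback handling and all identifiers are simplified assumptions for illustration, not the dissertation's actual data structures.

```python
# Sketch: FIB longest name prefix lookup via binary search of hash tables.
# One hash table per prefix length (number of name components); "marker"
# entries, each carrying a fallback next hop, steer the binary search.

_MISS = object()

class NameFib:
    def __init__(self, max_len=7):
        self.max_len = max_len
        self.entries = {}                      # prefix tuple -> next hop
        self.tables = None                     # filled in by build()

    def insert(self, prefix, next_hop):
        self.entries[tuple(prefix)] = next_hop

    def _best_real_match(self, prefix):
        for l in range(len(prefix), 0, -1):
            if prefix[:l] in self.entries:
                return self.entries[prefix[:l]]
        return None

    def build(self):
        self.tables = [dict() for _ in range(self.max_len + 1)]
        for prefix, nh in self.entries.items():
            self.tables[len(prefix)][prefix] = nh
        # Place markers along each entry's binary-search path; each marker
        # stores the best matching shorter prefix as a fallback answer.
        for prefix in self.entries:
            lo, hi, n = 1, self.max_len, len(prefix)
            while lo <= hi:
                mid = (lo + hi) // 2
                if mid < n:
                    self.tables[mid].setdefault(prefix[:mid],
                                                self._best_real_match(prefix[:mid]))
                    lo = mid + 1
                elif mid > n:
                    hi = mid - 1
                else:
                    break

    def longest_prefix_match(self, name):
        name, best = tuple(name), None
        lo, hi = 1, self.max_len
        while lo <= hi:
            mid = (lo + hi) // 2
            hit = self.tables[mid].get(name[:mid], _MISS) if mid <= len(name) else _MISS
            if hit is _MISS:
                hi = mid - 1                   # nothing this long: try shorter
            else:
                if hit is not None:
                    best = hit                 # real entry or marker fallback
                lo = mid + 1                   # a longer match may still exist
        return best
```

    For example, after insert(('ndn', 'edu', 'wustl'), 'if0') and build(), a lookup of ('ndn', 'edu', 'wustl', 'video') returns 'if0' using roughly log2(7) hash-table probes instead of one probe per possible component count.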

    Stateful Data Plane Abstractions for Software-Defined Networks and Their Applications

    Software-Defined Networking (SDN) enables programmability in the network. Unfortunately, current SDN limits programmability to the control plane. Operators cannot program data plane algorithms such as load balancing, congestion control, failure detection, etc. These capabilities are usually baked into the switch via dedicated hardware, as they need to run at line rate, i.e. 10-100 Gbit/s on 10-100 ports. In this work, we present two data plane abstractions for stateful packet processing, namely OpenState and OPP. These abstractions allow operators to program data plane tasks that involve stateful processing. OpenState is an extension to OpenFlow that permits the definition of forwarding rules as finite state machines. OPP is a more flexible abstraction that generalizes OpenState by adding computational capabilities, opening the way to the programming of more advanced data plane algorithms. Both OpenState and OPP are amenable to high-performance hardware implementations using commodity hardware switch components. However, both abstractions are based on a problematic design choice: the use of a feedback loop in the processing pipeline. This loop, if not adequately controlled, can harm the consistency of the state operations. Memory locking approaches can be used to prevent inconsistencies, at the expense of throughput. We present simulation results on real traffic traces showing that feedback loops of several clock cycles can be supported with little or no performance degradation, even with near-worst-case traffic workloads. To further prove the benefits of a stateful programmable data plane, we present two novel applications: Spider and FDPA. Spider detects and reacts to network failures at data plane timescales, i.e. micro/nanoseconds, even in the case of remote failures. By using OpenState, Spider provides functionality equivalent to legacy control plane protocols such as BFD and MPLS Fast Reroute, but without the need for a control plane. That is, both detection and rerouting happen entirely in the data plane. FDPA allows a switch to enforce approximate fair bandwidth sharing among many TCP-like senders. Most of the mechanisms proposed to solve this problem are based on complex scheduling algorithms, whose implementation becomes very expensive at today's line rates. FDPA, which is based on OPP, trades scheduling complexity for per-user state. FDPA works by dynamically assigning users to a few (3-4) priority queues, where a user's priority is chosen based on its sending rate history.
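
    As a rough illustration of the OpenState abstraction, the sketch below expresses a forwarding policy as a finite state machine over per-flow state, using the classic port-knocking example; the table contents and the pure-software form are assumptions for illustration, not the hardware pipeline.

```python
# Sketch of OpenState-style stateful forwarding: a state table keeps per-flow
# state, and an XFSM table maps (state, event) to (action, next_state).
# Example policy: knock on ports 1111, 2222, 3333 to unlock SSH (port 22).

DEFAULT = "DEFAULT"

# XFSM table: (current_state, tcp_dst_port) -> (action, next_state)
XFSM = {
    (DEFAULT, 1111):   ("drop",    "STAGE_1"),
    ("STAGE_1", 2222): ("drop",    "STAGE_2"),
    ("STAGE_2", 3333): ("drop",    "OPEN"),
    ("OPEN", 22):      ("forward", "OPEN"),      # SSH allowed after the knock
}

state_table = {}   # flow key (here: source IP) -> current state

def process_packet(src_ip, dst_port):
    state = state_table.get(src_ip, DEFAULT)
    # Any packet not matching the XFSM is dropped and resets the flow's state
    # (a simplification of the real table's default transitions).
    action, next_state = XFSM.get((state, dst_port), ("drop", DEFAULT))
    state_table[src_ip] = next_state             # per-packet state update
    return action
```

    In the abstraction itself, both tables live inside the switch pipeline and are read and written per packet at line rate, which is precisely where the feedback-loop consistency issue discussed above arises.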

    Hardware acceleration for power efficient deep packet inspection

    The rapid growth of the Internet has led to a massive spread of malicious attacks such as viruses and malware, making the safety of online activity a major concern. The use of Network Intrusion Detection Systems (NIDS) is an effective method to safeguard the Internet. One key procedure in NIDS is Deep Packet Inspection (DPI). DPI can examine the contents of a packet and take actions on the packet based on predefined rules. In this thesis, DPI is mainly discussed in the context of security applications; however, DPI can also be used for bandwidth management and network surveillance. DPI inspects the whole packet payload, and due to this and the complexity of the inspection rules, DPI algorithms consume significant amounts of resources, including time, memory and energy. The aim of this thesis is to design hardware-accelerated methods for memory- and energy-efficient high-speed DPI. The patterns in packet payloads, especially complex patterns, can be efficiently represented by regular expressions, which can be translated into Deterministic Finite Automata (DFA). DFA algorithms are fast but consume very large amounts of memory with certain kinds of regular expressions. In this thesis, memory-efficient algorithms are proposed based on transition compression of the DFAs. In this work, Bloom filters are used to implement DPI on an FPGA for hardware acceleration, with the design of a parallel architecture. Furthermore, aiming at a balance of power and performance, an energy-efficient adaptive Bloom filter is designed with the capability of adjusting the number of active hash functions according to the current workload. In addition, a method is given for implementation on both two-stage and multi-stage platforms. Nevertheless, false positives still prevent the Bloom filter from more extensive use; a cache-based counting Bloom filter is presented in this work to eliminate false positives for fast and precise matching. Finally, as future work, models will be built for routers and DPI in order to estimate the effect of power savings and to analyze the latency impact of dynamically adapting frequency to the current traffic. In addition, a low-power DPI system will be designed with a single or multiple DPI engines. Results and evaluation of the low-power DPI model and system will be produced in future work.
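
    One plausible software reading of the adaptive Bloom filter idea is sketched below: items are always inserted with all k hash functions, while lookups may probe only a subset of them, trading false-positive rate for fewer probes; the sizing, hashing and the policy mapping workload to active hash functions are assumptions, not the thesis design.

```python
import hashlib

class AdaptiveBloomFilter:
    """Sketch only: inserts always set all k positions; lookups may check only
    the first k_active of them, so no false negatives are introduced, at the
    cost of a higher false-positive rate when fewer bits are probed."""

    def __init__(self, m_bits=1 << 20, k=8):
        self.m, self.k, self.k_active = m_bits, k, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item, count):
        for i in range(count):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item, self.k):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] >> (pos % 8) & 1
                   for pos in self._positions(item, self.k_active))

    def set_active(self, k_active):
        # The policy mapping observed workload to k_active is left open here;
        # a hardware design would power down the corresponding hash units.
        self.k_active = max(1, min(self.k, k_active))
```

    Querying with fewer active hash functions models powering down hash units and their memory ports under light load, which is one way to read the power/performance trade-off described above.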

    Reducing the Cost of Operating a Datacenter Network

    Datacenters are a significant capital expense for many enterprises, yet they are difficult to design, manage, and maintain. The initial design of a datacenter network tends to follow vendor guidelines, but subsequent upgrades and expansions to it are mostly ad hoc, with equipment being upgraded piecemeal after its amortization period runs out and equipment acquisition tied to budget cycles rather than changes in workload. These networks are also brittle and inflexible: they tend to be manually managed and cannot perform dynamic traffic engineering. The high-level goal of this dissertation is to reduce the total cost of owning a datacenter by improving its network. To achieve this, we make the following contributions. First, we develop an automated, theoretically well-founded approach to planning cost-effective datacenter upgrades and expansions. Second, we propose a scalable traffic management framework for datacenter networks. Together, we show that these contributions can significantly reduce the cost of operating a datacenter network. To design cost-effective network topologies, especially as the network expands over time, updated equipment must coexist with legacy equipment, which makes the network heterogeneous. However, heterogeneous high-performance network designs are not well understood. Our first step, therefore, is to develop the theory of heterogeneous Clos topologies. Using our theory, we propose an optimization framework, called LEGUP, which designs a heterogeneous Clos network to implement in a new or legacy datacenter. Although effective, LEGUP imposes a certain amount of structure on the network. To deal with situations where this is infeasible, our second contribution is a framework, called REWIRE, which uses optimization to design unstructured DCN topologies. Our results indicate that these unstructured topologies have up to 100-500% more bisection bandwidth than a fat-tree of the same dollar cost. Our third contribution is two frameworks for datacenter network traffic engineering. Because of the multiplicity of end-to-end paths in DCN fabrics, such as Clos networks and the topologies designed by REWIRE, careful traffic engineering is needed to maximize throughput. This requires timely detection of elephant flows (flows that carry large amounts of data) and management of those flows. Previously proposed approaches incur high monitoring overheads, consume significant switch resources, or have long detection times. We make two proposals for elephant flow detection. First, in the Mahout framework, we suggest that such flows be detected by observing the end hosts' socket buffers, which provide efficient visibility of flow behavior. Second, in the DevoFlow framework, we add efficient stats-collection mechanisms to network switches. Using simulations and experiments, we show that these frameworks reduce traffic engineering overheads by at least an order of magnitude while still providing near-optimal performance.
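
    The end-host detection idea can be sketched as follows: a flow is flagged as an elephant once the bytes queued in its TCP socket send buffer exceed a threshold. The threshold value, the Linux-only SIOCOUTQ ioctl, and the absence of any packet marking are simplifying assumptions for illustration, not the Mahout implementation.

```python
# Sketch: detect elephant flows at the end host by observing socket send
# buffers (Linux-only; illustrative of the idea rather than the framework).

import fcntl, socket, struct, termios

ELEPHANT_THRESHOLD = 128 * 1024   # assumed threshold: 128 KiB of backlogged data

def queued_bytes(sock: socket.socket) -> int:
    """Bytes currently sitting in the kernel send queue for this socket."""
    buf = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("I", 0))
    return struct.unpack("I", buf)[0]

def is_elephant(sock: socket.socket) -> bool:
    # Mahout would now signal the decision in-band (e.g. by marking outgoing
    # packets) so switches can manage the flow; here we only report it.
    return queued_bytes(sock) >= ELEPHANT_THRESHOLD
```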

    Network-Compute Co-Design for Distributed In-Memory Computing

    The booming popularity of online services is rapidly raising the demands on modern datacenters. In order to cope with the data deluge, growing user bases, and tight quality-of-service constraints, service providers deploy massive datacenters with tens to hundreds of thousands of servers, keeping petabytes of latency-critical data memory resident. Such data distribution and the multi-tiered nature of the software used by feature-rich services result in frequent inter-server communication and remote memory access over the network. Hence, networking takes center stage in datacenters. In response to growing internal datacenter network traffic, networking technology is rapidly evolving. Lean user-level protocols, like RDMA, and high-performance fabrics have started making their appearance, dramatically reducing datacenter-wide network latency and offering unprecedented per-server bandwidth. At the same time, the end of Dennard scaling is grinding processor performance improvements to a halt. The net result is a growing mismatch between the per-server network and compute capabilities: it will soon be difficult for a server processor to utilize all of its available network bandwidth. Restoring balance between network and compute capabilities requires tighter co-design of the two. The network interface (NI) is of particular interest, as it lies on the boundary of network and compute. In this thesis, we focus on the design of an NI for a lightweight RDMA-like protocol and its full integration with modern manycore server processors. The NI capabilities scale with both the increasing network bandwidth and the growing number of cores on modern server processors. Leveraging our architecture's integrated NI logic, we introduce new functionality at the network endpoints that yields performance improvements for distributed systems. Such additions include new network operations with stronger semantics tailored to common application requirements, and integrated logic for balancing network load across a modern processor's multiple cores. We make the case that exposing richer, end-to-end semantics to the NI is a unique enabler for optimizations that can reduce software complexity and remove significant load from the processor, contributing towards maintaining balance between the two valuable resources of network and compute. Overall, network-compute co-design is an approach that addresses the challenges associated with the emerging technological mismatch of compute and networking capabilities, yielding significant performance improvements for distributed memory systems.
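
    As a purely hypothetical illustration of one of the ideas above, NI logic that spreads incoming requests across a manycore server's cores might look like the join-shortest-queue dispatcher below; the policy and every identifier are assumptions for illustration, not the thesis's actual mechanism.

```python
# Sketch: steer each incoming network request to the core with the shortest
# backlog, mimicking NI-integrated load balancing across cores in software.

from collections import deque

class NILoadBalancer:
    def __init__(self, num_cores):
        self.queues = [deque() for _ in range(num_cores)]

    def dispatch(self, request):
        # pick the core whose per-core request queue is currently shortest
        target = min(range(len(self.queues)), key=lambda c: len(self.queues[c]))
        self.queues[target].append(request)
        return target

    def complete(self, core_id):
        # a core pops its next request when it finishes the current one
        return self.queues[core_id].popleft() if self.queues[core_id] else None
```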

    Model-driven Security Engineering for FPGAs

    The thesis provides an analysis and adaptation of appropriate security methods from the software domain into the FPGA world and combines them with formal verification methods and machine learning techniques. The deployment of appropriate defense mechanisms requires intelligence about the threat agents, especially their motivation and capabilities. FPGA-based designs are, like any other IT system, exposed to different threat agents throughout the system's lifetime, urging the need for a suitable and adaptable security strategy. The systematic analysis of the design, based on the STRIDE concept, provides valuable insight into the threats and the mandated counter mechanisms. Minimizing the attack surface is one essential step to create a resilient design. Conventional access control paradigms can model access control rules in FPGA designs and thereby restrict the exposure of sensitive elements to untrustworthy ones; the choice of a suitable paradigm depends on the complexity and security requirements of the design. A method to formalize the FPGA security challenge is presented: FPGASECML is a domain-specific language, suitable for dataflow-centric threat modeling as well as the formal definition of an enforceable security policy. The formal description of the FPGA architecture and the security policy promotes a precise definition of the assets and their possible, allowed, and prohibited interactions. Formalization removes ambiguity from the threat model while providing a blueprint for the implementation. Model transformations allow the application of dedicated and proven tools to answer specific questions while minimizing the workload for the user. Model checking can be applied to verify whether, and to a certain degree when, a design complies with the stated security policy. Transferring the architecture into a suitable model and the security policy into verifiable logic properties can be, as demonstrated in the thesis, automated, simplifying the process and mitigating one source of error. Reinforcement learning, a machine learning method, can identify potential weaknesses and the steps an attacker may take to exploit them; the approach presented uses a Markov Decision Process in combination with a Q-learning algorithm. Some of the methods presented here may also be applicable in other domains.
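
    To make the reinforcement-learning step concrete, here is a minimal tabular Q-learning sketch over a toy attack-graph MDP; the states, actions, rewards, and hyperparameters are invented for illustration and are not the thesis's FPGA model.

```python
import random

# Toy MDP (illustrative only): states are attacker footholds, actions are
# attack steps, rewards reflect how valuable the reached foothold is.
MDP = {
    "outside":    {"probe_jtag": ("debug_port", 1.0), "probe_net": ("outside", 0.0)},
    "debug_port": {"read_bitstream": ("secrets", 10.0), "give_up": ("outside", 0.0)},
    "secrets":    {},                          # terminal: asset compromised
}

def q_learning(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, max_steps=50):
    Q = {(s, a): 0.0 for s, acts in MDP.items() for a in acts}
    for _ in range(episodes):
        state = "outside"
        for _ in range(max_steps):
            if not MDP[state]:                 # terminal state reached
                break
            actions = list(MDP[state])
            action = (random.choice(actions) if random.random() < epsilon
                      else max(actions, key=lambda a: Q[(state, a)]))
            nxt, reward = MDP[state][action]
            best_next = max((Q[(nxt, a)] for a in MDP[nxt]), default=0.0)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = nxt
    return Q   # high Q-values along a path expose a likely attack sequence
```

    Inspecting the learned Q-values reveals the most rewarding attack path, which a designer can then check against the stated security policy.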

    A Randomized Scheme for IP Lookup at Wire Speed on NetFPGA

    Because of the rapid growth of both traffic and link capacity, the time budget to perform IP address lookup on a packet continues to decrease while routers' lookup tables keep growing. Therefore, new lookup algorithms and new hardware platforms are required to perform fast IP lookup. This paper presents a new scheme on top of the NetFPGA board which takes advantage of parallel queries made on perfect hash functions. Such functions are built by using a very compact and fast data structure called Blooming Trees, thus allowing the vast majority of memory accesses to involve only small and fast on-chip memories.
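
    A simplified software rendering of the parallel-query idea follows; ordinary dictionaries stand in for the per-length perfect hash functions that the Blooming Tree construction provides, and the sequential loop stands in for probes that the hardware design issues in parallel.

```python
# Sketch: longest prefix match by querying one table per prefix length and
# keeping the longest length that reports a hit (hardware probes all lengths
# in parallel; here the loop simply starts from the longest).

PREFIX_LENGTHS = range(32, 0, -1)

class ParallelLPM:
    def __init__(self):
        # one table per prefix length; key = the /L prefix bits of an IPv4 address
        self.tables = {length: {} for length in PREFIX_LENGTHS}

    @staticmethod
    def _prefix(addr, length):
        return addr >> (32 - length)

    def insert(self, addr, length, next_hop):
        self.tables[length][self._prefix(addr, length)] = next_hop

    def lookup(self, addr):
        # conceptually all lengths are queried at once; the longest hit wins
        for length in PREFIX_LENGTHS:
            hit = self.tables[length].get(self._prefix(addr, length))
            if hit is not None:
                return hit
        return None
```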