22 research outputs found
High performance deep packet inspection on multi-core platform
Deep packet inspection (DPI) provides the ability to perform quality of service (QoS) and Intrusion Detection on network packets. But since the explosive growth of Internet, performance and scalability issues have been raised due to the gap between network and end-system speeds. This article describles how a desirable DPI system with multi-gigabits throughput and good scalability should be like by exploiting parallelism on network interface card, network stack and user applications. Connection-based parallelism, affinity-based scheduling and lock-free data structure are the main technologies introduced to alleviate the performance and scalability issues. A common DPI application L7-Filter is used as an example to illustrate the applicaiton level parallelism
Boosting XML Filtering with a Scalable FPGA-based Architecture
The growing amount of XML encoded data exchanged over the Internet increases
the importance of XML based publish-subscribe (pub-sub) and content based
routing systems. The input in such systems typically consists of a stream of
XML documents and a set of user subscriptions expressed as XML queries. The
pub-sub system then filters the published documents and passes them to the
subscribers. Pub-sub systems are characterized by very high input ratios,
therefore the processing time is critical. In this paper we propose a "pure
hardware" based solution, which utilizes XPath query blocks on FPGA to solve
the filtering problem. By utilizing the high throughput that an FPGA provides
for parallel processing, our approach achieves drastically better throughput
than the existing software or mixed (hardware/software) architectures. The
XPath queries (subscriptions) are translated to regular expressions which are
then mapped to FPGA devices. By introducing stacks within the FPGA we are able
to express and process a wide range of path queries very efficiently, on a
scalable environment. Moreover, the fact that the parser and the filter
processing are performed on the same FPGA chip, eliminates expensive
communication costs (that a multi-core system would need) thus enabling very
fast and efficient pipelining. Our experimental evaluation reveals more than
one order of magnitude improvement compared to traditional pub/sub systems.Comment: CIDR 200
Algorithmes et architectures pour l'implémentation de la détection d'expressions réguliÚres
La prochaine gĂ©nĂ©ration de rĂ©seau mobile, la 5G, devrait supporter des latences 10 fois plus faibles avec des dĂ©bits et un nombre dâappareils connectĂ©s 100 fois plus importants
quâaujourdâhui. Dans le mĂȘme temps, les opĂ©rateurs et les gestionnaires de rĂ©seaux veulent des systĂšmes plus modulaires qui puissent sâadapter rapidement aux nouveaux protocoles, mais qui ne consomment pas plus dâĂ©nergie que les solutions actuelles. Les opĂ©rateurs et administrateurs sont donc de plus en plus intĂ©ressĂ©s par des plateformes reconfigurables telles que des FPGA. Cependant, ces plateformes nĂ©cessitent encore des experts pour ĂȘtre
utilisĂ©es et ont des temps de dĂ©veloppement qui peuvent ĂȘtre longs ce qui les rend difficiles Ă intĂ©grer. De plus, les infrastructures informatiques sont des Ă©lĂ©ments de plus en plus critiques pour le fonctionnement de lâĂ©conomie. La sĂ©curitĂ© des rĂ©seaux est donc devenue un point important
pour protĂ©ger ces infrastructures. Actuellement la protection des rĂ©seaux est effectuĂ©e en utilisant des SystĂšmes de DĂ©tection dâIntrusions â Intrusion Detection System (IDS) qui effectuent lâinspection en profondeur de paquets â Deep Packet Inspection (DPI). Pour permettre la protection, les IDS comparent le contenu des paquets transitant sur le rĂ©seau Ă des rĂšgles prĂ©dĂ©terminĂ©es. Ces rĂšgles sont reprĂ©sentĂ©es soit par des chaĂźnes de caractĂšres ou bien des expressions rĂ©guliĂšres. Dans ce mĂ©moire, nous proposons trois contributions en rapport Ă lâutilisation de FPGA pour effectuer de la recherche de texte et dâexpressions rĂ©guliĂšres dans les rĂ©seaux. Ces trois
réalisations sont implémentées sur des FPGA et respectent les contraintes de latence liées aux réseaux.----------ABSTRACT: The next generation of mobile networks, called 5G, is expected to achieve significantly better performance than present networks : latency 10x smaller, throughput 100x higher with 100x more connected devices over the so-called 4G. Moreover, service providers and network administrators will need more configurable systems able to rapidly support new protocols. Furthermore, the power consumption of the resulting network infrastructure remains a critical
consideration. A possible solution to meet all those requirements involves the use of FPGAs. However, the development complexity causes integration difficulties. In addition, computers and data-centers are more and more critical systems. Consequently, security is an important issue. This motivates introducing Intrusion Detection Systems (IDS), which perform Deep Packet Inspection (DPI). IDSs compare the network flow against a set of rules that are expressed with strings and regular expressions. This thesis proposes three contributions in regard to FPGAs utilization for text and regular expression search. Those contributions respect the latency constraint of networks and are implemented into FPGAs
FPGA-based High Throughput Regular Expression Pattern Matching for Network Intrusion Detection Systems
Network speeds and bandwidths have improved over time. However, the frequency of network attacks and illegal accesses have also increased as the network speeds and bandwidths improved over time. Such attacks are capable of compromising the privacy and confidentiality of network resources belonging to even the most secure networks. Currently, general-purpose processor based software solutions used for detecting network attacks have become inadequate in coping with the current network speeds. Hardware-based platforms are designed to cope with the rising network speeds measured in several gigabits per seconds (Gbps). Such hardware-based platforms are capable of detecting several attacks at once, and a good candidate is the Field-programmable Gate Array (FPGA). The FPGA is a hardware platform that can be used to perform deep packet inspection of network packet contents at high speed. As such, this thesis focused on studying designs that were implemented with Field-programmable Gate Arrays (FPGAs). Furthermore, all the FPGA-based designs studied in this thesis have attempted to sustain a more steady growth in throughput and throughput efficiency. Throughput efficiency is defined as the concurrent throughput of a regular expression matching engine circuit divided by the average number of look up tables (LUTs) utilised by each state of the engine"s automata. The implemented FPGA-based design was built upon the concept of equivalence classification. The concept helped to reduce the overall table size of the inputs needed to drive the various Nondeterministic Finite Automata (NFA) matching engines. Compared with other approaches, the design sustained a throughput of up to 11.48 Gbps, and recorded an overall reduction in the number of pattern matching engines required by up to 75%. Also, the overall memory required by the design was reduced by about 90% when synthesised on the target FPGA platform
Based on Regular Expression Matching of Evaluation of the Task Performance in WSN: A Queue Theory Approach
Due to the limited resources of wireless sensor network, low efficiency of real-time communication scheduling, poor safety defects, and so forth, a queuing performance evaluation approach based on regular expression match is proposed, which is a method that consists of matching preprocessing phase, validation phase, and queuing model of performance evaluation phase. Firstly, the subset of related sequence is generated in preprocessing phase, guiding the validation phase distributed matching. Secondly, in the validation phase, the subset of features clustering, the compressed matching table is more convenient for distributed parallel matching. Finally, based on the queuing model, the sensor networks of task scheduling dynamic performance are evaluated. Experiments show that our approach ensures accurate matching and computational efficiency of more than 70%; it not only effectively detects data packets and access control, but also uses queuing method to determine the parameters of task scheduling in wireless sensor networks. The method for medium scale or large scale distributed wireless node has a good applicability
Improving Programming Support for Hardware Accelerators Through Automata Processing Abstractions
The adoption of hardware accelerators, such as Field-Programmable Gate Arrays,
into general-purpose computation pipelines continues to rise, driven by recent
trends in data collection and analysis as well as pressure from challenging
physical design constraints in hardware. The architectural designs of many of
these accelerators stand in stark contrast to the traditional von Neumann model
of CPUs. Consequently, existing programming languages, maintenance tools, and
techniques are not directly applicable to these devices, meaning that additional
architectural knowledge is required for effective programming and configuration.
Current programming models and techniques are akin to assembly-level programming
on a CPU, thus placing significant burden on developers tasked with using these
architectures. Because programming is currently performed at such low levels of
abstraction, the software development process is tedious and challenging and
hinders the adoption of hardware accelerators.
This dissertation explores the thesis that theoretical finite automata provide a
suitable abstraction for bridging the gap between high-level programming models
and maintenance tools familiar to developers and the low-level hardware
representations that enable high-performance execution on hardware accelerators.
We adopt a principled hardware/software co-design methodology to develop a
programming model providing the key properties that we observe are necessary for success,
namely performance and scalability, ease of use, expressive power, and legacy
support.
First, we develop a framework that allows developers to port existing, legacy
code to run on hardware accelerators by leveraging automata learning algorithms
in a novel composition with software verification, string solvers, and
high-performance automata architectures. Next, we design a domain-specific
programming language to aid programmers writing pattern-searching algorithms and
develop compilation algorithms to produce finite automata, which supports
efficient execution on a wide variety of processing architectures. Then, we
develop an interactive debugger for our new language, which allows developers to
accurately identify the locations of bugs in software while maintaining support
for high-throughput data processing. Finally, we develop two new
automata-derived accelerator architectures to support additional applications,
including the detection of security attacks and the parsing of recursive and
tree-structured data. Using empirical studies, logical reasoning, and
statistical analyses, we demonstrate that our prototype artifacts scale to
real-world applications, maintain manageable overheads, and support developers'
use of hardware accelerators. Collectively, the research efforts detailed in
this dissertation help ease the adoption and use of hardware accelerators for
data analysis applications, while supporting high-performance computation.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/155224/1/angstadt_1.pd