Search CORE

3,498 research outputs found

A Parallel Computational Approach for String Matching- A Novel Structure with Omega Model

Author: K Butchi Raju
Publication venue: Global Journals Inc. (US)
Publication date: 28/02/2013
Field of study

In r e cent day2019;s parallel string matching problem catch the attention of so many researchers because of the importance in different applications like IRS, Genome sequence, data cleaning etc.,. While it is very easily stated and many of the simple algorithms perform very well in practice, numerous works have been published on the subject and research is still very active. In this paper we propose a omega parallel computing model for parallel string matching. The algorithm is designed to work on omega model pa rallel architecture where text is divided for parallel processing and special searching at division point is required for consistent and complete searching. This algorithm reduces the number of comparisons and parallelization improves the time efficiency. Experimental results show that, on a multi - processor system, the omega model implementation of the proposed parallel string matching algorithm can reduce string matching time

Global Journal of Computer Science and Technology (GJCST)

A CONTENT-ADDRESSABLE-MEMORY ASSISTED INTRUSION PREVENTION EXPERT SYSTEM FOR GIGABIT NETWORKS

Author: Yu Ying
Publication venue
Publication date: 31/01/2007
Field of study

Cyber intrusions have become a serious problem with growing frequency and complexity. Current Intrusion Detection/Prevention Systems (IDS/IPS) are deficient in speed and/or accuracy. Expert systems are one functionally effective IDS/IPS method. However, they are in general computationally intensive and too slow for real time requirements. This poor performance prohibits expert system's applications in gigabit networks. This dissertation describes a novel intrusion prevention expert system architecture that utilizes the parallel search capability of Content Addressable Memory (CAM) to perform intrusion detection at gigabit/second wire speed. A CAM is a parallel search memory that compares all of its entries against input data in parallel. This parallel search is much faster than the serial search operation in Random Access Memory (RAM). The major contribution of this thesis is to accelerate the expert system's performance bottleneck "match" processes using the parallel search power of a CAM, thereby enabling the expert systems for wire speed network IDS/IPS applications. To map an expert system's Match process into a CAM, this research introduces a novel "Contextual Rule" (C-Rule) method that fundamentally changes expert systems' computational structures without changing its functionality for the IDS/IPS problem domain. This "Contextual Rule" method combines expert system rules and current network states into a new type of dynamic rule that exists only under specific network state conditions. This method converts the conventional two-database match process into a one-database search process. Therefore it enables the core functionality of the expert system to be mapped into a CAM and take advantage of its search parallelism.This thesis also introduces the CAM-Assisted Intrusion Prevention Expert System (CAIPES) architecture and shows how it can support the vast majority of the rules in the 1999 Lincoln Lab's DARPA Intrusion Detection Evaluation data set, and rules in the open source IDS "Snort". Supported rules are able to detect single-packet attacks, abusive traffic and packet flooding attacks, sequences of packets attacks, and flooding of sequences attacks. Prototyping and simulation have been performed to demonstrate the detection capability of these four types of attacks. Hardware simulation of an existing CAM shows that the CAIPES architecture enables gigabit/s IDS/IPS

D-Scholarship@Pitt

High performance stride-based network payload inspection

Author: Wang Xiaofei
Publication venue: Dublin City University. Research Institute for Networks and Communications Engineering (RINCE)
Publication date: 01/11/2012
Field of study

There are two main drivers for network payload inspection: malicious data, attacks, virus detection in Network Intrusion Detection System (NIDS) and content detection in Data Leakage Prevention System (DLPS) or Copyright Infringement Detection System (CIDS). Network attacks are getting more and more prevalent. Traditional network firewalls can only check the packet header, but fail to detect attacks hidden in the packet payload. Therefore, the NIDS with Deep Packet Inspection (DPI) function has been developed and widely deployed. By checking each byte of a packet against the pattern set, which is called pattern matching, NIDS is able to detect the attack codes hidden in the payload. The pattern set is usually organized as a Deterministic Finite Automata (DFA). The processing time of DFA is proportional to the length of the input string, but the memory cost of a DFA is quite large. Meanwhile, the link bandwidth and the traffic of the Internet are rapidly increasing, the size of the attack signature database is also growing larger and larger due to the diversification of the attacks. Consequently, there is a strong demand for high performance and low storage cost NIDS. Traditional softwarebased and hardware-based pattern matching algorithms are have difficulty satisfying the processing speed requirement, thus high performance network payload inspection methods are needed to enable deep packet inspection at line rate. In this thesis, Stride Finite Automata (StriFA), a novel finite automata family to accelerate both string matching and regular expression matching, is presented. Compared with the conventional finite automata, which scan the entire traffic stream to locate malicious information, the StriFA only needs to scan samples of the traffic stream to find the suspicious information, thus increasing the matching speed and reducing memory requirements. Technologies such as instant messaging software (Skype, MSN) or BitTorrent file sharing methods, allow convenient sharing of information between managers, employees, customers, and partners. This, however, leads to two kinds of major security risks when exchanging data between different people: firstly, leakage of sensitive data from a company and, secondly, distribution of copyright infringing products in Peer to Peer (P2P) networks. Traditional DFA-based DPI solutions cannot be used for inspection of file distribution in P2P networks due to the potential out-of-order manner of the data delivery. To address this problem, a hybrid finite automaton called Skip-Stride-Neighbor Finite Automaton (S2NFA) is proposed to solve this problem. It combines benefits of the following three structures: 1) Skip-FA, which is used to solve the out-of-order data scanning problem; 2) Stride-DFA, which is introduced to reduce the memory usage of Skip-FA; 3) Neighbor-DFA which is based on the characteristics of Stride-DFA to get a low false positive rate at the additional cost of a small increase in memory consumption

Irish Universities

DCU Online Research Access Service

Faster Compression of Deterministic Finite Automata

Author: Bille Philip
Gørtz Inge Li
Pedersen Max Rishøj
Publication venue
Publication date: 22/06/2023
Field of study

Deterministic finite automata (DFA) are a classic tool for high throughput matching of regular expressions, both in theory and practice. Due to their high space consumption, extensive research has been devoted to compressed representations of DFAs that still support efficient pattern matching queries. Kumar~et~al.~[SIGCOMM 2006] introduced the \emph{delayed deterministic finite automaton} (\ddfa{}) which exploits the large redundancy between inter-state transitions in the automaton. They showed it to obtain up to two orders of magnitude compression of real-world DFAs, and their work formed the basis of numerous subsequent results. Their algorithm, as well as later algorithms based on their idea, have an inherent quadratic-time bottleneck, as they consider every pair of states to compute the optimal compression. In this work we present a simple, general framework based on locality-sensitive hashing for speeding up these algorithms to achieve sub-quadratic construction times for \ddfa{}s. We apply the framework to speed up several algorithms to near-linear time, and experimentally evaluate their performance on real-world regular expression sets extracted from modern intrusion detection systems. We find an order of magnitude improvement in compression times, with either little or no loss of compression, or even significantly better compression in some cases

arXiv.org e-Print Archive

CHID : conditional hybrid intrusion detection system for reducing false positives and resource consumption on malicous datasets

Author: Alaidaros Hashem Mohammed
Publication venue
Publication date: 01/01/2017
Field of study

Inspecting packets to detect intrusions faces challenges when coping with a high volume of network traffic. Packet-based detection processes every payload on the wire, which degrades the performance of network intrusion detection system (NIDS). This issue requires an introduction of a flow-based NIDS that reduces the amount of data to be processed by examining aggregated information of related packets. However, flow-based detection still suffers from the generation of the false positive alerts due to incomplete data input. This study proposed a Conditional Hybrid Intrusion Detection (CHID) by combining the flow-based with packet-based detection. In addition, it is also aimed to improve the resource consumption of the packet-based detection approach. CHID applied attribute wrapper features evaluation algorithms that marked malicious flows for further analysis by the packet-based detection. Input Framework approach was employed for triggering packet flows between the packetbased and flow-based detections. A controlled testbed experiment was conducted to evaluate the performance of detection mechanism’s CHID using datasets obtained from on different traffic rates. The result of the evaluation showed that CHID gains a significant performance improvement in terms of resource consumption and packet drop rate, compared to the default packet-based detection implementation. At a 200 Mbps, CHID in IRC-bot scenario, can reduce 50.6% of memory usage and decreases 18.1% of the CPU utilization without packets drop. CHID approach can mitigate the false positive rate of flow-based detection and reduce the resource consumption of packet-based detection while preserving detection accuracy. CHID approach can be considered as generic system to be applied for monitoring of intrusion detection systems

Universiti Utara Malaysia: UUM eTheses

Exact string matching algorithms : survey, issues, and future research directions

Author: Gilkar Gulshan
Hakak Saqib
Imran Muhammad
Kamsin Amirrudin
Khan Wazir
Shivakumara Palaiahnakote
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

String matching has been an extensively studied research domain in the past two decades due to its various applications in the fields of text, image, signal, and speech processing. As a result, choosing an appropriate string matching algorithm for current applications and addressing challenges is difficult. Understanding different string matching approaches (such as exact string matching and approximate string matching algorithms), integrating several algorithms, and modifying algorithms to address related issues are also difficult. This paper presents a survey on single-pattern exact string matching algorithms. The main purpose of this survey is to propose new classification, identify new directions and highlight the possible challenges, current trends, and future works in the area of string matching algorithms with a core focus on exact string matching algorithms. © 2013 IEEE

Federation ResearchOnline

Hardware-Aware Algorithm Designs for Efficient Parallel and Distributed Processing

Author: Stylianopoulos Charalampos
Publication venue
Publication date: 01/01/2020
Field of study

The introduction and widespread adoption of the Internet of Things, together with emerging new industrial applications, bring new requirements in data processing. Specifically, the need for timely processing of data that arrives at high rates creates a challenge for the traditional cloud computing paradigm, where data collected at various sources is sent to the cloud for processing. As an approach to this challenge, processing algorithms and infrastructure are distributed from the cloud to multiple tiers of computing, closer to the sources of data. This creates a wide range of devices for algorithms to be deployed on and software designs to adapt to.In this thesis, we investigate how hardware-aware algorithm designs on a variety of platforms lead to algorithm implementations that efficiently utilize the underlying resources. We design, implement and evaluate new techniques for representative applications that involve the whole spectrum of devices, from resource-constrained sensors in the field, to highly parallel servers. At each tier of processing capability, we identify key architectural features that are relevant for applications and propose designs that make use of these features to achieve high-rate, timely and energy-efficient processing.In the first part of the thesis, we focus on high-end servers and utilize two main approaches to achieve high throughput processing: vectorization and thread parallelism. We employ vectorization for the case of pattern matching algorithms used in security applications. We show that re-thinking the design of algorithms to better utilize the resources available in the platforms they are deployed on, such as vector processing units, can bring significant speedups in processing throughout. We then show how thread-aware data distribution and proper inter-thread synchronization allow scalability, especially for the problem of high-rate network traffic monitoring. We design a parallelization scheme for sketch-based algorithms that summarize traffic information, which allows them to handle incoming data at high rates and be able to answer queries on that data efficiently, without overheads.In the second part of the thesis, we target the intermediate tier of computing devices and focus on the typical examples of hardware that is found there. We show how single-board computers with embedded accelerators can be used to handle the computationally heavy part of applications and showcase it specifically for pattern matching for security-related processing. We further identify key hardware features that affect the performance of pattern matching algorithms on such devices, present a co-evaluation framework to compare algorithms, and design a new algorithm that efficiently utilizes the hardware features.In the last part of the thesis, we shift the focus to the low-power, resource-constrained tier of processing devices. We target wireless sensor networks and study distributed data processing algorithms where the processing happens on the same devices that generate the data. Specifically, we focus on a continuous monitoring algorithm (geometric monitoring) that aims to minimize communication between nodes. By deploying that algorithm in action, under realistic environments, we demonstrate that the interplay between the network protocol and the application plays an important role in this layer of devices. Based on that observation, we co-design a continuous monitoring application with a modern network stack and augment it further with an in-network aggregation technique. In this way, we show that awareness of the underlying network stack is important to realize the full potential of the continuous monitoring algorithm.The techniques and solutions presented in this thesis contribute to better utilization of hardware characteristics, across a wide spectrum of platforms. We employ these techniques on problems that are representative examples of current and upcoming applications and contribute with an outlook of emerging possibilities that can build on the results of the thesis

Chalmers Research