997 research outputs found

    Exact string matching algorithms : survey, issues, and future research directions

    String matching has been an extensively studied research domain in the past two decades due to its various applications in the fields of text, image, signal, and speech processing. As a result, choosing an appropriate string matching algorithm for current applications and addressing challenges is difficult. Understanding different string matching approaches (such as exact string matching and approximate string matching algorithms), integrating several algorithms, and modifying algorithms to address related issues are also difficult. This paper presents a survey on single-pattern exact string matching algorithms. The main purpose of this survey is to propose new classification, identify new directions and highlight the possible challenges, current trends, and future works in the area of string matching algorithms with a core focus on exact string matching algorithms. © 2013 IEEE

    Hardware-Aware Algorithm Designs for Efficient Parallel and Distributed Processing

    The introduction and widespread adoption of the Internet of Things, together with emerging new industrial applications, bring new requirements in data processing. Specifically, the need for timely processing of data that arrives at high rates creates a challenge for the traditional cloud computing paradigm, where data collected at various sources is sent to the cloud for processing. As an approach to this challenge, processing algorithms and infrastructure are distributed from the cloud to multiple tiers of computing, closer to the sources of data. This creates a wide range of devices for algorithms to be deployed on and software designs to adapt to. In this thesis, we investigate how hardware-aware algorithm designs on a variety of platforms lead to algorithm implementations that efficiently utilize the underlying resources. We design, implement and evaluate new techniques for representative applications that involve the whole spectrum of devices, from resource-constrained sensors in the field to highly parallel servers. At each tier of processing capability, we identify key architectural features that are relevant for applications and propose designs that make use of these features to achieve high-rate, timely and energy-efficient processing. In the first part of the thesis, we focus on high-end servers and utilize two main approaches to achieve high-throughput processing: vectorization and thread parallelism. We employ vectorization for the case of pattern matching algorithms used in security applications. We show that re-thinking the design of algorithms to better utilize the resources available in the platforms they are deployed on, such as vector processing units, can bring significant speedups in processing throughput. We then show how thread-aware data distribution and proper inter-thread synchronization allow scalability, especially for the problem of high-rate network traffic monitoring. We design a parallelization scheme for sketch-based algorithms that summarize traffic information, which allows them to handle incoming data at high rates and to answer queries on that data efficiently, without overheads. In the second part of the thesis, we target the intermediate tier of computing devices and focus on the typical examples of hardware found there. We show how single-board computers with embedded accelerators can be used to handle the computationally heavy part of applications and showcase it specifically for pattern matching for security-related processing. We further identify key hardware features that affect the performance of pattern matching algorithms on such devices, present a co-evaluation framework to compare algorithms, and design a new algorithm that efficiently utilizes the hardware features. In the last part of the thesis, we shift the focus to the low-power, resource-constrained tier of processing devices. We target wireless sensor networks and study distributed data processing algorithms where the processing happens on the same devices that generate the data. Specifically, we focus on a continuous monitoring algorithm (geometric monitoring) that aims to minimize communication between nodes. By deploying that algorithm under realistic environments, we demonstrate that the interplay between the network protocol and the application plays an important role in this layer of devices. Based on that observation, we co-design a continuous monitoring application with a modern network stack and augment it further with an in-network aggregation technique. In this way, we show that awareness of the underlying network stack is important to realize the full potential of the continuous monitoring algorithm. The techniques and solutions presented in this thesis contribute to better utilization of hardware characteristics across a wide spectrum of platforms. We employ these techniques on problems that are representative examples of current and upcoming applications and contribute an outlook on emerging possibilities that can build on the results of the thesis.

    Hardware acceleration for power efficient deep packet inspection

    The rapid growth of the Internet leads to a massive spread of malicious attacks such as viruses and malware, making the safety of online activity a major concern. The use of Network Intrusion Detection Systems (NIDS) is an effective method to safeguard the Internet. One key procedure in NIDS is Deep Packet Inspection (DPI). DPI can examine the contents of a packet and take actions on the packets based on predefined rules. In this thesis, DPI is mainly discussed in the context of security applications. However, DPI can also be used for bandwidth management and network surveillance. DPI inspects the whole packet payload, and due to this and the complexity of the inspection rules, DPI algorithms consume significant amounts of resources including time, memory and energy. The aim of this thesis is to design hardware-accelerated methods for memory- and energy-efficient high-speed DPI. The patterns in packet payloads, especially complex patterns, can be efficiently represented by regular expressions, which can be translated into Deterministic Finite Automata (DFA). DFA algorithms are fast but consume very large amounts of memory with certain kinds of regular expressions. In this thesis, memory-efficient algorithms are proposed based on compressing the transitions of the DFAs. In this work, Bloom filters are used to implement DPI on an FPGA for hardware acceleration with the design of a parallel architecture. Furthermore, aiming at a balance between power and performance, an energy-efficient adaptive Bloom filter is designed with the capability of adjusting the number of active hash functions according to the current workload. In addition, a method is given for implementation on both two-stage and multi-stage platforms. Nevertheless, false positives still prevent the Bloom filter from being used extensively; a cache-based counting Bloom filter is presented in this work to eliminate false positives for fast and precise matching. Finally, in future work, in order to estimate the effect of power savings, models will be built for routers and DPI, which will also analyze the latency impact of dynamic frequency adaptation to current traffic. In addition, a low-power DPI system will be designed with a single or multiple DPI engines. Results and evaluation of the low-power DPI model and system will be produced in the future.
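
    The idea of a Bloom filter whose number of active hash functions can be changed at run time can be sketched as follows. This is a minimal illustration and not the thesis's FPGA design: the class name `AdaptiveBloomFilter`, the method `set_active_hashes`, the default sizes, and the use of salted BLAKE2b as the hash family are all assumptions made for the example. The abstract does not say how workload maps to the number of active hash functions, so the sketch leaves that choice to the caller.

```python
import hashlib


class AdaptiveBloomFilter:
    """Bloom filter that can query with fewer hash functions than it inserts with.

    Hypothetical sketch: names and default sizes are illustrative only.
    """

    def __init__(self, num_bits=4096, max_hashes=4):
        self.num_bits = num_bits
        self.max_hashes = max_hashes
        self.active_hashes = max_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _position(self, key: bytes, index: int) -> int:
        # Salting one hash function with the index gives a family of hashes.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([index])).digest()
        return int.from_bytes(digest, "big") % self.num_bits

    def add(self, key: bytes) -> None:
        # Always set the bits for all hash functions, so that a query with any
        # number of active hashes never produces a false negative.
        for index in range(self.max_hashes):
            self.bits[self._position(key, index)] = 1

    def set_active_hashes(self, count: int) -> None:
        # Clamp to [1, max_hashes]. Fewer active hashes means less work per
        # query but a higher false positive rate.
        self.active_hashes = max(1, min(count, self.max_hashes))

    def might_contain(self, key: bytes) -> bool:
        return all(
            self.bits[self._position(key, index)]
            for index in range(self.active_hashes)
        )


if __name__ == "__main__":
    bloom = AdaptiveBloomFilter()
    bloom.add(b"attack-signature")
    bloom.set_active_hashes(2)  # e.g. a cheaper mode chosen by the caller
    print(bloom.might_contain(b"attack-signature"))
```

    Because insertion always sets every bit, lowering the number of active hash functions only trades accuracy (more false positives) for less work per lookup; stored signatures are still always reported as possible matches.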

    Sophisticated denial-of-service attack detections through integrated architectural, OS, and application level events monitoring

    As the first step in defending against DoS attacks, the Network-based Intrusion Detection System (IDS) is well explored and widely used in both commercial tools and research work. Such an IDS framework is built upon features extracted from the network traffic, which are application-level features, and is effective in detecting flooding-based DoS attacks. However, in a sophisticated DoS attack, where an attacker manages to bypass the network-based monitors and launch a DoS attack locally, sniffer-based methods have difficulty differentiating attacks from normal behavior, since the malicious connection itself behaves in the same manner as normal connections. In this work, we study a Host-based IDS framework that integrates features from the architectural and operating system (OS) levels to improve the performance of sophisticated DoS intrusion detection. Network traffic collected from a campus network and real-world exploits are used to provide a realistic evaluation.

    A Multi Agent System for Flow-Based Intrusion Detection Using Reputation and Evolutionary Computation

    The rising sophistication of cyber threats as well as the improvement of physical computer network properties present increasing challenges to contemporary Intrusion Detection (ID) techniques. To respond to these challenges, a multi agent system (MAS) coupled with flow-based ID techniques may effectively complement traditional ID systems. This paper develops: 1) a scalable software architecture for a new, self-organized, multi agent, flow-based ID system; and 2) a network simulation environment suitable for evaluating implementations of this MAS architecture and for other research purposes. Self-organization is achieved via 1) a reputation system that influences agent mobility in the search for effective vantage points in the network; and 2) multi objective evolutionary algorithms that seek effective operational parameter values. This paper illustrates, through quantitative and qualitative evaluation, 1) the conditions under which the reputation system provides a significant benefit; and 2) the essential functionality of a complex network simulation environment supporting a broad range of malicious activity scenarios. These results establish an optimistic outlook for further research in flow-based multi agent systems for ID in computer networks.

    Algorithms and Architectures for Network Search Processors

    The continuous growth in the Internet’s size, the amount of data traffic, and the complexity of processing this traffic gives rise to new challenges in building high-performance network devices. One of the most fundamental tasks performed by these devices is searching the network data for predefined keys. Address lookup, packet classification, and deep packet inspection are some of the operations that involve table lookups and searching. These operations are typically part of the packet forwarding mechanism and can create a performance bottleneck. Therefore, fast and resource-efficient algorithms are required. One of the most commonly used techniques for such searching operations is the Ternary Content Addressable Memory (TCAM). While TCAM can offer very fast search speeds, it is costly and consumes a large amount of power. Hence, designing cost-effective, power-efficient, and high-speed search techniques has received a great deal of attention in the research and industrial community. In this thesis, we propose a generic search technique based on Bloom filters. A Bloom filter is a randomized data structure used to represent a set of bit-strings compactly and support set membership queries. We demonstrate techniques to convert the search process into table lookups. The resulting table data structures are kept in the off-chip memory and their Bloom filter representations are kept in the on-chip memory. An item needs to be looked up in the off-chip table only when it is found in the on-chip Bloom filters. By filtering the off-chip memory accesses in this fashion, the search operations can be significantly accelerated. Our approach involves a unique combination of algorithmic and architectural techniques that outperform some of the current techniques in terms of cost-effectiveness, speed, and power-efficiency.
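
    The filtering step described in this abstract, where the slow table is consulted only when the Bloom filter reports a possible match, can be sketched in software. This is a minimal illustration under stated assumptions and not the thesis's hardware architecture: the names `BloomFilter` and `FilteredTable`, the default sizes, and the salted BLAKE2b hash family are invented for the example, and a Python dictionary stands in for the off-chip table.

```python
import hashlib


class BloomFilter:
    """Compact set representation; sizes here are illustrative defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: bytes):
        # Derive several bit positions by salting one hash with an index.
        for index in range(self.num_hashes):
            digest = hashlib.blake2b(key, digest_size=8, salt=bytes([index])).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for position in self._positions(key):
            self.bits[position] = 1

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[position] for position in self._positions(key))


class FilteredTable:
    """Table lookup guarded by a Bloom filter (hypothetical sketch)."""

    def __init__(self, entries):
        self.off_chip = dict(entries)   # stands in for slow off-chip memory
        self.on_chip = BloomFilter()    # stands in for fast on-chip memory
        self.off_chip_accesses = 0      # counts how often the slow table is read
        for key in self.off_chip:
            self.on_chip.add(key)

    def lookup(self, key: bytes):
        if not self.on_chip.might_contain(key):
            return None                 # definite miss: no off-chip access
        self.off_chip_accesses += 1
        # A false positive in the filter is resolved here: get() returns None.
        return self.off_chip.get(key)


if __name__ == "__main__":
    table = FilteredTable({b"10.0.0.1": "port-1", b"10.0.0.2": "port-2"})
    print(table.lookup(b"10.0.0.1"), table.off_chip_accesses)
```

    Keys that are absent from the table usually stop at the filter, so the counter `off_chip_accesses` grows only for stored keys and for the occasional false positive.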

    MFIRE-2: A Multi Agent System for Flow-based Intrusion Detection Using Stochastic Search

    Detecting attacks targeted against military and commercial computer networks is a crucial element in the domain of cyberwarfare. The traditional method of signature-based intrusion detection is a primary mechanism to alert administrators to malicious activity. However, signature-based methods are not capable of detecting new or novel attacks. This research continues the development of a novel simulated, multiagent, flow-based intrusion detection system called MFIRE. Agents in the network are trained to recognize common attacks, and they share data with other agents to improve the overall effectiveness of the system. A Support Vector Machine (SVM) is the primary classifier with which agents determine whether an attack is occurring. Agents are prompted to move to different locations within the network to find better vantage points, and two methods for achieving this are developed. One uses a centralized reputation-based model, and the other uses a decentralized model optimized with stochastic search. The latter is tested for basic functionality. The reputation model is extensively tested in two configurations, and results show that it is significantly superior to a system with non-moving agents. The resulting system, MFIRE-2, demonstrates exciting new network defense capabilities and should be considered for implementation in future cyberwarfare applications.

    Behavioral modeling for anomaly detection in industrial control systems

    In the 1990s, industry demanded the interconnection of corporate and production networks. Thus, Industrial Control Systems (ICSs) evolved from the proprietary, closed hardware and software of the 1970s to today's Commercial Off-The-Shelf (COTS) devices. Although this transformation carries several advantages, such as simplicity and cost-efficiency, the use of COTS hardware and software implies multiple Information Technology vulnerabilities. Specially tailored worms like Stuxnet, Duqu, Night Dragon or Flame showed their potential to damage ICSs and obtain information about them. Anomaly Detection Systems (ADSs) are considered suitable security mechanisms for ICSs due to the repetitiveness and static architecture of industrial processes. ADSs base their operation on behavioral models that require attack-free training data or an extensive description of the process for their creation. This thesis proposes a new approach to analyze the payloads of binary industrial protocols and automatically generate behavioral models synthesized in rules. In the same way, through this work we develop a method to generate realistic network traffic in laboratory conditions without the need for a real ICS installation. This contribution establishes the basis of a future ADS and could support experimentation through the recreation of realistic traffic in simulated environments. Furthermore, a new approach to correct delay and jitter issues is proposed. This proposal improves the quality of time-based ADSs by reducing the false positive rate. We experimentally validate the proposed approaches with several statistical methods and ADS quality measures, and by comparing the results with traffic taken from a real installation. We show that a payload-based ADS is possible without needing to understand the payload data, that the generation of realistic network traffic in laboratory conditions is feasible, and that delay and jitter correction improves the quality of behavioral models. In conclusion, the presented approaches provide both an ADS able to work with private industrial protocols and a method to create behavioral models for open ICS protocols that does not require training data.