Search CORE

16,320 research outputs found

Memory-Efficient Regular Expression Search Using State Merging

Author: Michela Becchi
Srihari Cadambi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

Abstract — Pattern matching is a crucial task in several critical network services such as intrusion detection and policy man-agement. As the complexity of rule-sets increases, traditional string matching engines are being replaced by more sophisticated regular expression engines. To keep up with line rates, deal with denial of service attacks and provide predictable resource provisioning, the design of such engines must allow examining payload traffic at several gigabits per second and provide worst case speed guarantees. While regular expression matching using deterministic finite automata (DFA) is a well studied problem in theory, its implementation either in software or specialized hardware is complicated by prohibitive memory requirements. This is especially true for DFAs representing complex regular expressions present in practical rule-sets. In this paper, we introduce a novel method to drastically reduce the DFA memory requirement and still provide worst-case speed guarantees. Specifically, we merge several “non-equivalent” states in a DFA by introducing labels on their input and output transitions. We then propose a data structure to represent the merged states and the transition labels. We show that, with very few assumptions about the original DFA, such a transformation results in significant compression in the DFA representation. We have implemented a state merging and transition labeling algorithm for DFAs, and show that for Snort and Bro security rule-sets, state merging results in memory reductions of an order of magnitude. I

CiteSeerX

Crossref

A Grammatical Inference Approach to Language-Based Anomaly Detection in XML

Author: Lampesberger Harald
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

False-positives are a problem in anomaly-based intrusion detection systems. To counter this issue, we discuss anomaly detection for the eXtensible Markup Language (XML) in a language-theoretic view. We argue that many XML-based attacks target the syntactic level, i.e. the tree structure or element content, and syntax validation of XML documents reduces the attack surface. XML offers so-called schemas for validation, but in real world, schemas are often unavailable, ignored or too general. In this work-in-progress paper we describe a grammatical inference approach to learn an automaton from example XML documents for detecting documents with anomalous syntax. We discuss properties and expressiveness of XML to understand limits of learnability. Our contributions are an XML Schema compatible lexical datatype system to abstract content in XML and an algorithm to learn visibly pushdown automata (VPA) directly from a set of examples. The proposed algorithm does not require the tree representation of XML, so it can process large documents or streams. The resulting deterministic VPA then allows stream validation of documents to recognize deviations in the underlying tree structure or datatypes.Comment: Paper accepted at First Int. Workshop on Emerging Cyberthreats and Countermeasures ECTCM 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Logical Algorithms meets CHR: A meta-complexity result for Constraint Handling Rules with rule priorities

Author: De Koninck Leslie
Publication venue
Publication date: 01/01/2009
Field of study

This paper investigates the relationship between the Logical Algorithms language (LA) of Ganzinger and McAllester and Constraint Handling Rules (CHR). We present a translation schema from LA to CHR-rp: CHR with rule priorities, and show that the meta-complexity theorem for LA can be applied to a subset of CHR-rp via inverse translation. Inspired by the high-level implementation proposal for Logical Algorithm by Ganzinger and McAllester and based on a new scheduling algorithm, we propose an alternative implementation for CHR-rp that gives strong complexity guarantees and results in a new and accurate meta-complexity theorem for CHR-rp. It is furthermore shown that the translation from Logical Algorithms to CHR-rp combined with the new CHR-rp implementation, satisfies the required complexity for the Logical Algorithms meta-complexity result to hold.Comment: To appear in Theory and Practice of Logic Programming (TPLP

arXiv.org e-Print Archive

Lirias

Pushdown Control-Flow Analysis of Higher-Order Programs

Author: Earl Christopher
Might Matthew
Van Horn David
Publication venue
Publication date: 24/07/2010
Field of study

Context-free approaches to static analysis gain precision over classical approaches by perfectly matching returns to call sites---a property that eliminates spurious interprocedural paths. Vardoulakis and Shivers's recent formulation of CFA2 showed that it is possible (if expensive) to apply context-free methods to higher-order languages and gain the same boost in precision achieved over first-order programs. To this young body of work on context-free analysis of higher-order programs, we contribute a pushdown control-flow analysis framework, which we derive as an abstract interpretation of a CESK machine with an unbounded stack. One instantiation of this framework marks the first polyvariant pushdown analysis of higher-order programs; another marks the first polynomial-time analysis. In the end, we arrive at a framework for control-flow analysis that can efficiently compute pushdown generalizations of classical control-flow analyses.Comment: The 2010 Workshop on Scheme and Functional Programmin

arXiv.org e-Print Archive

CiteSeerX

Efficient Multistriding of Large Non-deterministic Finite State Automata for Deep Packet Inspection

Author: Avalle Matteo Carlo
Risso Fulvio Giovanni Ottavio
Sisto Riccardo
Publication venue: IEEE
Publication date: 01/01/2012
Field of study

Multistride automata speed up input matching because each multistriding transformation halves the size of the input string, leading to a potential 2x speedup. However, up to now little effort has been spent in optimizing the building process of multistride automata, with the result that current algorithms cannot be applied to real-life, large automata such as the ones used in commercial IDSs, because the time and the memory space needed to create the new automaton quickly becomes unfeasible. In this paper, new algorithms for efficient building of multistride NFAs for packet inspection are presented, explaining how these new techniques can outperform the previous algorithms in terms of required time and memory usag

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity

Author: Gruber Hermann
Holzer Markus
Publication venue: 'Open Publishing Association'
Publication date: 01/05/2014
Field of study

The equivalence of finite automata and regular expressions dates back to the seminal paper of Kleene on events in nerve nets and finite automata from 1956. In the present paper we tour a fragment of the literature and summarize results on upper and lower bounds on the conversion of finite automata to regular expressions and vice versa. We also briefly recall the known bounds for the removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free nondeterministic devices. Moreover, we report on recent results on the average case descriptional complexity bounds for the conversion of regular expressions to finite automata and brand new developments on the state elimination algorithm that converts finite automata to regular expressions.Comment: In Proceedings AFL 2014, arXiv:1405.527

arXiv.org e-Print Archive

Directory of Open Access Journals

Matrix Code

Author: van Emden M. H.
Publication venue
Publication date: 25/02/2013
Field of study

Matrix Code gives imperative programming a mathematical semantics and heuristic power comparable in quality to functional and logic programming. A program in Matrix Code is developed incrementally from a specification in pre/post-condition form. The computations of a code matrix are characterized by powers of the matrix when it is interpreted as a transformation in a space of vectors of logical conditions. Correctness of a code matrix is expressed in terms of a fixpoint of the transformation. The abstract machine for Matrix Code is the dual-state machine, which we present as a variant of the classical finite-state machine.Comment: 39 pages, 19 figures; extensions and minor correction

arXiv.org e-Print Archive

CiteSeerX