
Regular expression matching with input compression and next state prediction.

By Gerald Tripp


Automata-based regular expression matching often requires large amounts of memory for its state transition tables, particularly when multiple complex regular expressions are matched with the same automaton. For systems with limited memory resources it is common to compress the state transition tables. One technique, called row displacement with state marking, does this by identifying a default value for the next state and then packing the remaining information into a one-dimensional array. Although this compression technique works well when matching multiple strings, it is less effective when matching multiple complex regular expressions. This paper describes a technique called next state prediction, which performs lossy compression of the current state and input values and uses the results to select a likely next state from a prediction table. It is used in conjunction with a standard row displacement with state marking algorithm and leads to an overall reduction in the memory required for the various tables. The algorithms have been tested with a number of different design parameters and compared with a 'baseline version' in which this technique is not used. When the system was tested with a set of regular expressions from the Snort intrusion detection system, the memory required was around 46% of that required for the baseline version. The design has been modelled in VHDL for use within an FPGA and tested via simulation; it operates at a search rate of 2.0 Gbps irrespective of the regular expressions being searched for or the input data being scanned.
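
The two ideas named in the abstract can be illustrated with a small software sketch. It is not the paper's hardware design: the toy transition table, the function names, the first-fit packing order, and above all the way the prediction table is combined with row displacement (here, the predicted state simply replaces the per-row default) are assumptions made for illustration. The lossy compression functions `state_hash` and `input_class` are likewise hypothetical placeholders.

```python
from collections import Counter

# Toy DFA (an assumption for illustration): TABLE[state][input] = next state.
TABLE = [
    [1, 0, 0],
    [1, 2, 0],
    [1, 0, 3],
    [1, 0, 0],
]


def pack_rows(table, default_of):
    """Row displacement with state marking.

    Only transitions that differ from default_of(s, c) are stored.
    The entry for (s, c) lives at index base[s] + c of a one-dimensional
    array and is valid only if mark[base[s] + c] == s.
    """
    base, nxt, mark = [0] * len(table), [], []
    for s, row in enumerate(table):
        cols = [c for c, t in enumerate(row) if t != default_of(s, c)]
        b = 0
        # First fit: slide the row until all of its stored columns are free.
        while any(b + c < len(mark) and mark[b + c] is not None for c in cols):
            b += 1
        base[s] = b
        for c in cols:
            while len(mark) <= b + c:
                nxt.append(None)
                mark.append(None)
            nxt[b + c] = row[c]
            mark[b + c] = s  # the state mark identifies the owner of the slot
    return base, nxt, mark


def lookup(s, c, base, nxt, mark, default_of):
    """Return the next state, falling back to the default on a mark mismatch."""
    i = base[s] + c
    if i < len(mark) and mark[i] == s:
        return nxt[i]
    return default_of(s, c)


def row_default(table):
    """Baseline: each state's default is its most common next state."""
    defaults = [Counter(row).most_common(1)[0][0] for row in table]
    return lambda s, c: defaults[s]


def predicted_default(table, state_hash, input_class):
    """Assumed combination: a table indexed by lossily compressed
    (state, input) pairs supplies the likely next state."""
    votes = {}
    for s, row in enumerate(table):
        for c, t in enumerate(row):
            key = (state_hash(s), input_class(c))
            votes.setdefault(key, Counter())[t] += 1
    pred = {k: v.most_common(1)[0][0] for k, v in votes.items()}
    return lambda s, c: pred[(state_hash(s), input_class(c))]


def stored_entries(mark):
    """Number of occupied slots in the packed array."""
    return sum(m is not None for m in mark)


baseline = row_default(TABLE)
# Hypothetical lossy functions: collapse all states, keep the input as is.
predicted = predicted_default(TABLE, lambda s: 0, lambda c: c)

packed_baseline = pack_rows(TABLE, baseline)
packed_predicted = pack_rows(TABLE, predicted)
```

In both variants, `lookup` reproduces every entry of the original table, so the lossy prediction never changes the matching result; it only changes how many exceptions have to be packed. The sketch does not account for the memory taken by the prediction table itself, so it says nothing about the memory figures reported in the abstract.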

Topics: QA76
Publisher: UKC
Year: 2008

