'Institute of Electrical and Electronics Engineers (IEEE)'
Abstract
International audienceRegular expression (RegEx) matching is a core function of deep packet inspection in modern network devices. Previous TCAM-based RegEx matching algorithms a priori assume that a deterministic finite automaton (DFA) can be built for a given set of RegEx patterns. However, practical RegEx patterns contain complex terms like wildcard closure and repeat character, and it may be impossible to build a DFA with a reasonable number of states. This results in prior work to being infeasible in practice. Moreover, TCAM-based RegEx matching is required to scale to a large-scale set of RegEx patterns. In this paper, we propose a compressed finite automaton implementation called (CFA) for scalable TCAM-based RegEx matching. CFA is designed to reduce TCAM space by using three compression techniques: transition, character, and state compressions. Experiments on realistic RegEx pattern sets show CFA highly outperforms previous solutions in terms of TCAM space, matching throughput, and TCAM power consumption