On the Average Size of Glushkov's Automata
Glushkov's algorithm builds an epsilon-free nondeterministic automaton from a given regular expression. In the worst case, its number of states is linear and its number of transitions is quadratic in the size of the expression. We show in this paper that, on average, the number of transitions is linear
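As an illustration of the construction this abstract refers to, here is a minimal sketch of the textbook Glushkov (position automaton) construction. The tuple-based expression encoding and the names `glushkov`, `accepts`, and `EXPR` are assumptions made for this example and do not come from the paper.

```python
from itertools import count

def glushkov(expr):
    """Compute the position sets of a regex given as nested tuples.

    Assumed encoding: ("sym", letter), ("alt", r, s), ("cat", r, s), ("star", r).
    Returns (symbols, first, last, follow, nullable).
    """
    symbols = {}   # position -> letter at that position
    follow = {}    # position -> positions that may come next
    fresh = count(1)

    def walk(node):
        kind = node[0]
        if kind == "sym":
            p = next(fresh)
            symbols[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "alt":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            for p in l1:                      # last of left feeds first of right
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == "star":
            _, f, l = walk(node[1])
            for p in l:                       # loop back to the start
                follow[p] |= f
            return True, f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    nullable, first, last = walk(expr)
    return symbols, first, last, follow, nullable

def accepts(expr, word):
    """Simulate the automaton: state 0 is initial, positions are the other states."""
    symbols, first, last, follow, nullable = glushkov(expr)
    current = {0}
    for ch in word:
        nxt = set()
        for q in current:
            succ = first if q == 0 else follow[q]
            nxt |= {p for p in succ if symbols[p] == ch}
        current = nxt
    return any(q in last or (q == 0 and nullable) for q in current)

# (a|b)*ab
EXPR = ("cat",
        ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "a")),
        ("sym", "b"))

symbols, first, last, follow, nullable = glushkov(EXPR)
n_states = len(symbols) + 1                                   # 5
n_transitions = len(first) + sum(len(s) for s in follow.values())  # 10
```

The automaton has one state per letter occurrence plus an initial state, which is where the linear state bound comes from. Each state may have transitions to many positions, which is the source of the quadratic worst case for transitions.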
Parameterized Regular Expressions and their Languages
We study regular expressions that use variables, or parameters, which are
interpreted as alphabet letters. We consider two classes of languages denoted
by such expressions: under the possibility semantics, a word belongs to the
language if it is denoted by some regular expression obtained by replacing
variables with letters; under the certainly semantics, the word must be denoted
by every such expression. Such languages are regular, and we show that they
naturally arise in several applications such as querying graph databases and
program analysis. As the main contribution of the paper, we provide a complete
characterization of the complexity of the main computational problems related
to such languages: nonemptiness, universality, containment, membership, as well
as the problem of constructing NFAs capturing such languages. We also look at
the extension when domains of variables could be arbitrary regular languages,
and show that under the certainty semantics, languages remain regular and the
complexity of the main computational problems does not change.
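The two semantics can be illustrated with a brute-force membership check that enumerates every assignment of letters to variables. This is only a sketch of the definitions, not one of the paper's algorithms: it takes time exponential in the number of variables. The template syntax (`{x}` placeholders filled with `str.format`) and the function names are assumptions made for this example, and the template must not contain other braces.

```python
import re
from itertools import product

def instantiations(template, variables, alphabet):
    """Yield every ordinary regex obtained by replacing variables with letters."""
    for letters in product(alphabet, repeat=len(variables)):
        binding = dict(zip(variables, (re.escape(c) for c in letters)))
        yield template.format(**binding)

def possibly(word, template, variables, alphabet):
    """Possibility semantics: some instantiation denotes the word."""
    return any(re.fullmatch(p, word) is not None
               for p in instantiations(template, variables, alphabet))

def certainly(word, template, variables, alphabet):
    """Certainty semantics: every instantiation denotes the word."""
    return all(re.fullmatch(p, word) is not None
               for p in instantiations(template, variables, alphabet))

# Hypothetical parameterized expression over the alphabet {a, b} with one variable x.
TEMPLATE = "(a|b)*{x}(a|b)*"
VARIABLES = ["x"]
ALPHABET = "ab"
```

With this template, the word `a` is in the possibility language (take x = a) but not in the certainty language (x = b fails), while `ab` is in both because it contains each letter.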
Path-equivalent developments in acyclic weighted automata
Weighted finite automata (WFA) are used with FPGA accelerating hardware to scan large genomic banks. Hardwiring such automata raises surface area and clock frequency constraints, requiring efficient ε-transition removal techniques. In this paper, we present bounds on the number of new transitions for the development of acyclic WFA, which is a special case of the ε-transition removal problem. We introduce a new problem, a partial removal of ε-transitions while accepting short chains of ε-transitions.
Efficient Testing and Matching of Deterministic Regular Expressions
Zombie: Middleboxes that Don’t Snoop
Zero-knowledge middleboxes (ZKMBs) are a recent paradigm in which clients get privacy while middleboxes enforce policy: clients prove in zero knowledge that the plaintext underlying their encrypted traffic complies with network policies, such as DNS filtering. However, prior work had impractically poor performance and was limited in functionality.
This work presents Zombie, the first system built using the ZKMB paradigm. Zombie introduces techniques that push ZKMBs to the verge of practicality: preprocessing (to move the bulk of proof generation to idle times between requests), asynchrony (to remove proving and verifying costs from the critical path), and batching (to amortize some of the verification work). Zombie’s choices, together with these techniques, provide a factor of 3.5 speedup in total computation done by client and middlebox, lowering the critical path overhead for a DNS filtering application to less than 300ms (on commodity hardware) or (in the asynchronous configuration) to 0.
As an additional contribution that is likely of independent interest, Zombie introduces a portfolio of techniques to efficiently encode regular expressions in probabilistic (and zero knowledge) proofs; these techniques offer significant asymptotic and constant factor improvements in performance over a standard baseline. Zombie builds on this portfolio to support policies based on regular expressions, such as data loss prevention.