5 research outputs found

    Certain Query Answering on Compressed String Patterns: From Streams to Hyperstreams

    We study the problem of certain query answering (CQA) on compressed string patterns. These are incomplete singleton context-free grammars that can model systems of multiple streams with references to others, more recently called hyperstreams. In order to capture regular path queries on strings, we consider nondeterministic finite automata (NFAs) for query definition. It turns out that CQA for Boolean NFA queries is equivalent to regular string pattern inclusion, i.e., whether all strings completing a compressed string pattern belong to a regular language. We prove that CQA on compressed string patterns is PSpace-complete for NFA queries. The PSpace-hardness even applies to Boolean queries defined by deterministic finite automata (DFAs) and without compression. We also show that CQA on compressed linear string patterns can be solved in PTime for DFA queries. The proofs of the results presented here can be found in the long version of this paper (https://hal.inria.fr/hal-01846016).

    Nested Regular Expressions can be Compiled to Small Deterministic Nested Word Automata

    We study the problem of whether regular expressions for nested words can be compiled to small deterministic nested word automata (NWAs). In theory, we obtain a positive answer for small deterministic regular expressions for nested words. In practice of navigational path queries, nondeterministic NWAs are obtained for which NWA determinization explodes. We show that practical good solutions can be obtained by using stepwise hedge automata as intermediates.

    Certain Query Answering on Compressed String Patterns: From Streams to Hyperstreams (long version)

    We study the problem of certain query answering (CQA) on compressed string patterns. These are incomplete singleton context-free grammars that can model systems of multiple streams with references to others, more recently called hyperstreams. In order to capture regular path queries on strings, we consider nondeterministic finite automata (NFAs) for query definition. It turns out that CQA for Boolean NFA queries is equivalent to regular string pattern inclusion, i.e., whether all strings completing a compressed string pattern belong to a regular language. We prove that CQA on compressed string patterns is PSPACE-complete for NFA queries. The PSPACE-hardness even applies to Boolean queries defined by deterministic finite automata (DFAs) and without compression. We also show that CQA on compressed linear string patterns can be solved in PTIME for DFA queries.
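
    The abstract gives no algorithm, so the following is only a minimal sketch of why the linear, DFA case is tractable, restricted to the uncompressed setting. It assumes that a pattern is a sequence of letters and holes, that each hole stands for an arbitrary string over the alphabet and occurs once (linearity), and that the DFA is complete. The names `HOLE`, `reachable_closure`, and `certain_answer` are hypothetical and not taken from the paper.

```python
from typing import Dict, Hashable, Iterable, Optional, Sequence, Set, Tuple

State = Hashable
HOLE = None  # hypothetical marker: an unknown string over the alphabet

def reachable_closure(delta: Dict[Tuple[State, str], State],
                      alphabet: Iterable[str],
                      states: Set[State]) -> Set[State]:
    """States reachable from `states` by reading any string (including the empty one)."""
    letters = list(alphabet)
    seen = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for a in letters:
            r = delta[(q, a)]  # complete DFA assumed: every (state, letter) is defined
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def certain_answer(pattern: Sequence[Optional[str]],
                   delta: Dict[Tuple[State, str], State],
                   alphabet: Iterable[str],
                   start: State,
                   accepting: Set[State]) -> bool:
    """True iff every completion of the linear pattern is accepted by the DFA."""
    letters = list(alphabet)
    current = {start}  # states reachable over all completions of the prefix read so far
    for sym in pattern:
        if sym is HOLE:
            current = reachable_closure(delta, letters, current)
        else:
            current = {delta[(q, sym)] for q in current}
    return current <= accepting

# Example DFA over {a, b} accepting exactly the strings that end in 'b'.
ENDS_IN_B = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}

print(certain_answer([HOLE, "b"], ENDS_IN_B, "ab", 0, {1}))  # every completion ends in 'b'
print(certain_answer(["b", HOLE], ENDS_IN_B, "ab", 0, {1}))  # the hole may end in 'a'
```

    Tracking a set of states is sound here only because holes are independent. Once a hole may occur several times, its occurrences must be filled consistently, which is where the PSPACE-hardness stated above comes from.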

    Determinization and Minimization of Automata for Nested Words Revisited

    We consider the problem of determinizing and minimizing automata for nested words in practice. For this we compile the nested regular expressions (NREs) from the usual XPath benchmark to nested word automata (NWAs). The determinization of these NWAs, however, fails to produce reasonably small automata. In the best case, huge deterministic NWAs are produced after few hours, even for relatively small NREs of the benchmark. We propose a different approach to the determinization of automata for nested words. For this, we introduce stepwise hedge automata (SHAs) that generalize naturally on both (stepwise) tree automata and on finite word automata. We then show how to determinize SHAs, yielding reasonably small deterministic automata for the NREs from the XPath benchmark. The size of deterministic SHAs can be reduced further by a novel minimization algorithm for a subclass of SHAs. In order to understand why the new approach to determinization and minimization works so nicely, we investigate the relationship between NWAs and SHAs further. Clearly, deterministic SHAs can be compiled to deterministic NWAs in linear time, and conversely, NWAs can be compiled to nondeterministic SHAs in polynomial time. Therefore, we can use SHAs as intermediates for determinizing NWAs, while avoiding the huge size increase with the usual determinization algorithm for NWAs. Notably, the NWAs obtained from the SHAs perform bottom-up and left-to-right computations only, but no top-down computations. This NWA behavior can be distinguished syntactically by the (weak) single-entry property, suggesting a close relationship between SHAs and single-entry NWAs. In particular, it turns out that the usual determinization algorithm for NWAs behaves well for single-entry NWAs, while it quickly explodes without the single-entry property. Furthermore, it is known that the class of deterministic multi-module single-entry NWAs enjoys unique minimization. The subclass of deterministic SHAs to which our novel minimization algorithm applies is different though, in that we do not impose multiple modules. As further optimizations for reducing the sizes of the constructed SHAs, we propose schema-based cleaning and symbolic representations based on apply-else rules, that can be maintained by determinization. We implemented the optimizations and report the experimental results for the automata constructed for the XPathMark benchmark.

    Stream Firewalling of XML Constraints

    As XML-based messages have become common in many client-server protocols, there is a need to protect application servers from invalid or dangerous messages. This leads to the XML stream firewalling problem: applying integrity constraints against a large number of simultaneous streams. We conduct the first investigation of a constraint engine optimized for the generation of XML stream firewalls. We isolate a class of DTDs and XPath constraints which support the generation of low-space filters, and provide algorithms for generating firewalls with low per-input-character time and per-stream space. We give experimental results which show that we have achieved these goals in practice.