3,780 research outputs found
A Direct Translation from XPath to Nondeterministic Automata
Abstract. Since navigational aspects of XPath correspond to first-order definability, it has been proposed to use the analogy with the very successful technique of translating LTL into automata, and produce efficient translations of XPath queries into automata on unranked trees. These translations can then be used for a variety of reasoning tasks such as XPath consistency, or optimization, under XML schema constraints. In the verification scenarios, translations into both nondeterministic and alternating automata are used. But while a direct translation from XPath into alternating automata is known, only an indirect translation into nondeterministic automata- going via intermediate logics- exists. A direct translation is desirable as most XML specifications have particularly nice translations into nondeterministic automata and it is natural to use such automata to reason about XPath and schemas. The goal of the paper is to produce such a direct translation of XPath into nondeterministic automata.
Rewrite based Verification of XML Updates
We consider problems of access control for update of XML documents. In the
context of XML programming, types can be viewed as hedge automata, and static
type checking amounts to verify that a program always converts valid source
documents into also valid output documents. Given a set of update operations we
are particularly interested by checking safety properties such as preservation
of document types along any sequence of updates. We are also interested by the
related policy consistency problem, that is detecting whether a sequence of
authorized operations can simulate a forbidden one. We reduce these questions
to type checking problems, solved by computing variants of hedge automata
characterizing the set of ancestors and descendants of the initial document
type for the closure of parameterized rewrite rules
Reasoning about XML with temporal logics and automata
We show that problems arising in static analysis of XML specifications and transformations can be dealt with using techniques similar to those developed for static analysis of programs. Many properties of interest in the XML context are related to navigation, and can be formulated in temporal logics for trees. We choose a logic that admits a simple single-exponential translation into unranked tree automata, in the spirit of the classical LTL-to-BĂŒchi automata translation. Automata arising from this translation have a number of additional properties; in particular, they are convenient for reasoning about unary node-selecting queries, which are important in the XML context. We give two applications of such reasoning: one deals with a classical XML problem of reasoning about navigation in the presence of schemas, and the other relates to verifying security properties of XML views
XQuery Streaming by Forest Transducers
Streaming of XML transformations is a challenging task and only very few
systems support streaming. Research approaches generally define custom
fragments of XQuery and XPath that are amenable to streaming, and then design
custom algorithms for each fragment. These languages have several shortcomings.
Here we take a more principles approach to the problem of streaming
XQuery-based transformations. We start with an elegant transducer model for
which many static analysis problems are well-understood: the Macro Forest
Transducer (MFT). We show that a large fragment of XQuery can be translated
into MFTs --- indeed, a fragment of XQuery, that can express important features
that are missing from other XQuery stream engines, such as GCX: our fragment of
XQuery supports XPath predicates and let-statements. We then rely on a
streaming execution engine for MFTs, one which uses a well-founded set of
optimizations from functional programming, such as strictness analysis and
deforestation. Our prototype achieves time and memory efficiency comparable to
the fastest known engine for XQuery streaming, GCX. This is surprising because
our engine relies on the OCaml built in garbage collector and does not use any
specialized buffer management, while GCX's efficiency is due to clever and
explicit buffer management.Comment: Full version of the paper in the Proceedings of the 30th IEEE
International Conference on Data Engineering (ICDE 2014
XML Compression via DAGs
Unranked trees can be represented using their minimal dag (directed acyclic
graph). For XML this achieves high compression ratios due to their repetitive
mark up. Unranked trees are often represented through first child/next sibling
(fcns) encoded binary trees. We study the difference in size (= number of
edges) of minimal dag versus minimal dag of the fcns encoded binary tree. One
main finding is that the size of the dag of the binary tree can never be
smaller than the square root of the size of the minimal dag, and that there are
examples that match this bound. We introduce a new combined structure, the
hybrid dag, which is guaranteed to be smaller than (or equal in size to) both
dags. Interestingly, we find through experiments that last child/previous
sibling encodings are much better for XML compression via dags, than fcns
encodings. We determine the average sizes of unranked and binary dags over a
given set of labels (under uniform distribution) in terms of their exact
generating functions, and in terms of their asymptotical behavior.Comment: A short version of this paper appeared in the Proceedings of ICDT
201
A decidable policy language for history-based transaction monitoring
Online trading invariably involves dealings between strangers, so it is
important for one party to be able to judge objectively the trustworthiness of
the other. In such a setting, the decision to trust a user may sensibly be
based on that user's past behaviour. We introduce a specification language
based on linear temporal logic for expressing a policy for categorising the
behaviour patterns of a user depending on its transaction history. We also
present an algorithm for checking whether the transaction history obeys the
stated policy. To be useful in a real setting, such a language should allow one
to express realistic policies which may involve parameter quantification and
quantitative or statistical patterns. We introduce several extensions of linear
temporal logic to cater for such needs: a restricted form of universal and
existential quantification; arbitrary computable functions and relations in the
term language; and a "counting" quantifier for counting how many times a
formula holds in the past. We then show that model checking a transaction
history against a policy, which we call the history-based transaction
monitoring problem, is PSPACE-complete in the size of the policy formula and
the length of the history. The problem becomes decidable in polynomial time
when the policies are fixed. We also consider the problem of transaction
monitoring in the case where not all the parameters of actions are observable.
We formulate two such "partial observability" monitoring problems, and show
their decidability under certain restrictions
Rewrite Closure and CF Hedge Automata
We introduce an extension of hedge automata called bidimensional context-free
hedge automata. The class of unranked ordered tree languages they recognize is
shown to be preserved by rewrite closure with inverse-monadic rules. We also
extend the parameterized rewriting rules used for modeling the W3C XQuery
Update Facility in previous works, by the possibility to insert a new parent
node above a given node. We show that the rewrite closure of hedge automata
languages with these extended rewriting systems are context-free hedge
languages
- âŠ