
18 research outputs found

XQuery Streaming by Forest Transducers

Author: Hakuta Shizuya
Iwasaki Hideya
Maneth Sebastian
Nakano Keisuke
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/12/2013

Streaming of XML transformations is a challenging task, and only very few systems support it. Research approaches generally define custom fragments of XQuery and XPath that are amenable to streaming, and then design custom algorithms for each fragment. These languages have several shortcomings. Here we take a more principled approach to the problem of streaming XQuery-based transformations. We start with an elegant transducer model for which many static analysis problems are well understood: the Macro Forest Transducer (MFT). We show that a large fragment of XQuery can be translated into MFTs; indeed, this fragment can express important features that are missing from other XQuery stream engines such as GCX: it supports XPath predicates and let-statements. We then rely on a streaming execution engine for MFTs, one which uses a well-founded set of optimizations from functional programming, such as strictness analysis and deforestation. Our prototype achieves time and memory efficiency comparable to the fastest known engine for XQuery streaming, GCX. This is surprising because our engine relies on the OCaml built-in garbage collector and does not use any specialized buffer management, while GCX's efficiency is due to clever and explicit buffer management. Comment: Full version of the paper in the Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE 2014).

arXiv.org e-Print Archive

CiteSeerX

Transforming XML Streams with References

Author: Maneth Sebastian
Ordóñez Alberto
Seidl Helmut
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2015

Edinburgh Research Explorer

Bounded Delay and Concurrency for Earliest Query Answering

Author: A. Berlea
A. Neumann
A. Weber
C. Allauzen
D. Olteanu
H. Seidl
J. Carme
M. Benedikt
O. Gauwin
R.E. Stearns
W. Martens
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Earliest query answering is needed for streaming XML processing with optimal memory management. We study the feasibility of earliest query answering for node selection queries. Tractable queries are distinguished by a bounded number of concurrently alive answer candidates at every time point, and a bounded delay for node selection. We show that both properties are decidable in polynomial time for queries defined by deterministic automata for unranked trees. Our results are obtained by reduction to the bounded valuedness problem for recognizable relations between unranked trees.

HAL - Lille 3

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Evaluation of XPath Queries against XML Streams

Author: Olteanu Dan
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 11/02/2005

XML is nowadays the de facto standard for electronic data interchange on the Web. Available XML data ranges from small Web pages to ever-growing repositories of, e.g., biological and astronomical data, and even to rapidly changing and possibly unbounded streams, as used in Web data integration and publish-subscribe systems. Given the ubiquity of XML data, the basic task of XML querying is becoming of great theoretical and practical importance. Recent years have witnessed efforts from both practitioners and theoreticians towards defining an appropriate XML query language. At the core of this common effort is a navigational approach to locating information in XML data, embodied in a practical and simple query language called XPath. This work brings together the two aforementioned "worlds", i.e., XPath query evaluation and XML data streams, and shows both the theoretical and the practical relevance of this fusion. This setting cannot be handled by traditional database management systems, because the latter are not designed for rapid and continuous loading of individual data items, and do not directly support the continuous queries that are typical for stream applications. The first central contribution of this work consists in the definition and the theoretical investigation of three term rewriting systems to rewrite queries with reverse predicates, like parent or ancestor, into equivalent forward queries, i.e., queries without reverse predicates. Our rewriting approach is vital to the evaluation of queries with reverse predicates against unbounded XML streams, because neither the storage of past fragments of the stream, nor several stream traversals, as required by the evaluation of reverse predicates, are affordable.
Beyond their declared main purpose of providing equivalences between queries with reverse predicates and forward queries, the applications of our rewriting systems shed light on other query language properties, such as the expressivity of some of the language's fragments, query minimization, or even the complexity of query evaluation. For example, using these systems, one can rewrite any graph query into an equivalent forward forest query. The second main contribution consists in a streamed and progressive evaluation strategy of forward queries against XML streams. The evaluation is specified using compositions of so-called stream processing functions, and is implemented using networks of deterministic pushdown transducers. The complexity of this evaluation strategy is polynomial in both the query and the data sizes for forward forest queries and even for a large fragment of graph queries. The third central contribution consists in two real monitoring applications that directly use the results of this work: the monitoring of processes running on UNIX computers, and a system for graphically presenting real-time traffic and travel information, as broadcast within ubiquitous radio signals.
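
The idea of evaluating a forward query in a single pass over a stream can be illustrated with a small sketch. The code below is not the thesis's network of pushdown transducers; it is a simplified stand-in that keeps a stack of "matched prefix" sets, one per open element, so memory grows with document depth rather than document size. The function name `stream_select`, the event encoding, and the step encoding are all assumptions made for this example.

```python
def stream_select(events, steps):
    """Select elements matching a forward path in one pass over a stream.

    events: iterable of ("open", tag) / ("close", tag) pairs (hypothetical encoding).
    steps:  list of (axis, name) with axis in {"child", "descendant"};
            name "*" matches any tag.
    Returns (document-order index, tag) for every selected element.
    """
    n = len(steps)
    # Each stack entry holds the step indices that may be matched by a child
    # of the corresponding open element; the bottom entry is the virtual root.
    stack = [frozenset({0})]
    matches = []
    index = 0  # position of the current opening tag in document order
    for kind, tag in events:
        if kind == "open":
            current = set()
            for i in stack[-1]:
                if i == n:
                    continue  # path already fully matched at the parent
                axis, name = steps[i]
                if name == tag or name == "*":
                    current.add(i + 1)  # step i is matched by this element
                if axis == "descendant":
                    current.add(i)  # step i may still match deeper down
            if n in current:
                matches.append((index, tag))
            stack.append(frozenset(current))
            index += 1
        elif kind == "close":
            stack.pop()
    return matches


# Stream for <a><b><c/></b><c/></a>
events = [
    ("open", "a"), ("open", "b"), ("open", "c"), ("close", "c"),
    ("close", "b"), ("open", "c"), ("close", "c"), ("close", "a"),
]
child_path = stream_select(events, [("child", "a"), ("child", "c")])       # /a/c
descendants = stream_select(events, [("descendant", "c")])                 # //c
```

Only forward axes appear in the sketch, which is why no past fragment of the stream has to be stored; a query with a parent or ancestor step would first have to be rewritten into a forward one, as the abstract describes.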

Digitale Hochschulschriften der LMU

Programming Using Automata and Transducers

Author: D'Antoni Loris
Publication venue: ScholarlyCommons
Publication date: 01/01/2015

Automata, the simplest model of computation, have proven to be an effective tool in reasoning about programs that operate over strings. Transducers augment automata to produce outputs and have been used to model string and tree transformations such as natural language translations. The success of these models is primarily due to their closure properties and decidable procedures, but good properties come at the price of limited expressiveness. Concretely, most models only support finite alphabets and can only represent small classes of languages and transformations. We focus on addressing these limitations and bridge the gap between the theory of automata and transducers and complex real-world applications: Can we extend automata and transducer models to operate over structured and infinite alphabets? Can we design languages that hide the complexity of these formalisms? Can we define executable models that can process the input efficiently? First, we introduce succinct models of transducers that can operate over large alphabets and design BEX, a language for analysing string coders. We use BEX to prove the correctness of UTF and BASE64 encoders and decoders. Next, we develop a theory of tree transducers over infinite alphabets and design FAST, a language for analysing tree-manipulating programs. We use FAST to detect vulnerabilities in HTML sanitizers, check whether augmented reality taggers conflict, and optimize and analyze functional programs that operate over lists and trees. Finally, we focus on laying the foundations of stream processing of hierarchical data such as XML files and program traces. We introduce two new efficient and executable models that can process the input in a left-to-right linear pass: symbolic visibly pushdown automata and streaming tree transducers. Symbolic visibly pushdown automata are closed under Boolean operations and can specify and efficiently monitor complex properties for hierarchical structures over infinite alphabets. 
Streaming tree transducers can express and efficiently process complex XML transformations while enjoying decidable procedures.

CiteSeerX

ScholarlyCommons@Penn

Equivalence Problems for Tree Transducers: A Brief Survey

Author: Aho
Alur
Andre
Benedikt
Berstel
Bozapalidis
Caralp
Courcelle
Culik II
Engelfriet
Esparza
Filiot
Fischer
Friese
Fülöp
Ginsburg
Griffiths
Gurari
Hakuta
Honkala
Hopcroft
Karhumäki
Knuth
Lemay
Maneth
Milo
Nakano
Parikh
Perst
Plandowski
Raskin
Rounds
Ruohonen
Sebastian Maneth
Seidl
Servais
Staworko
Thatcher
Vogler
Voigtländer
Zachar
Zoltán Fülöp
Zoltán Ésik
Ésik
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2014

The decidability of equivalence for three important classes of tree transducers is discussed. Each class can be obtained as a natural restriction of deterministic macro tree transducers (MTTs): (1) no context parameters, i.e., top-down tree transducers, (2) linear size increase, i.e., MSO definable tree transducers, and (3) monadic input and output ranked alphabets. For the full class of MTTs, decidability of equivalence remains a long-standing open problem. Comment: In Proceedings AFL 2014, arXiv:1405.527
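
To make the notion of a context parameter concrete, here is a minimal sketch of a transducer with one such parameter, written as ordinary recursive functions. It reverses a monadic tree (a unary chain ending in a leaf `e`) using the rules q0(x) → q(x, e), q(f(x), y) → q(x, f(y)), q(e, y) → y. The tuple encoding of trees and the names `reverse_monadic` and `_q` are assumptions of this example and are not taken from the survey.

```python
def reverse_monadic(tree):
    """Initial rule q0(x) -> q(x, e): start with the leaf as accumulated output."""
    return _q(tree, ("e",))


def _q(tree, acc):
    """State q with one context parameter `acc` holding output built so far."""
    label = tree[0]
    if label == "e":
        # Rule q(e, y) -> y: emit the accumulated context.
        return acc
    (child,) = tree[1:]  # monadic: every non-leaf node has exactly one child
    # Rule q(f(x), y) -> q(x, f(y)): push the current label into the context.
    return _q(child, (label, acc))


# a(b(c(e))) is encoded as nested tuples.
example = ("a", ("b", ("c", ("e",))))
reversed_example = reverse_monadic(example)  # encodes c(b(a(e)))
```

Dropping the second argument of `_q` gives the shape of restriction (1): a state then sees only the input subtree and has no accumulated output to pass along.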

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

A Survey on Decidable Equivalence Problems for Tree Transducers

Author: Esparza J.
Fülöp Z.
Ginsburg S.
Sebastian Maneth
Zachar Z.
Ésik Z.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/12/2015

Crossref

Edinburgh Research Explorer