Search CORE

26 research outputs found

Towards Practical Typechecking for Macro Tree Transducers

Author: Frisch Alain
Hosoya Haruo
Publication venue
Publication date: 01/01/2007
Field of study

Macro tree transducers (mtt) are an important model that both covers many useful XML transformations and allows decidable exact typechecking. This paper reports our first step toward an implementation of mtt typechecker that has a practical efficiency. Our approach is to represent an input type obtained from a backward inference as an alternating tree automaton, in a style similar to Tozawa's XSLT0 typechecking. In this approach, typechecking reduces to checking emptiness of an alternating tree automaton. We propose several optimizations (Cartesian factorization, state partitioning) on the backward inference process in order to produce much smaller alternating tree automata than the naive algorithm, and we present our efficient algorithm for checking emptiness of alternating tree automata, where we exploit the explicit representation of alternation for local optimizations. Our preliminary experiments confirm that our algorithm has a practical performance that can typecheck simple transformations with respect to the full XHTML in a reasonable time

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Frontiers of tractability for typechecking simple XML transformations

Author: Martens Wim
Neven Frank
Publication venue: Elsevier Inc.
Publication date: 31/05/2007
Field of study

AbstractTypechecking consists of statically verifying whether the output of an XML transformation is always conform to an output type for documents satisfying a given input type. We focus on complete algorithms which always produce the correct answer. We consider top–down XML transformations incorporating XPath expressions and abstract document types by grammars and tree automata. By restricting schema languages and transformations, we identify several practical settings for which typechecking can be done in polynomial time. Moreover, the resulting framework provides a rather complete picture as we show that most scenarios cannot be enlarged without rendering the typechecking problem intractable. So, the present research sheds light on when to use fast complete algorithms and when to reside to sound but incomplete ones

Elsevier - Publisher Connector

Static Analysis of Graph Database Transformations

Author: Boneva Iovka
Groz Benoit
Hidders Jan
Murlak Filip
Staworko Slawomir
Publication venue
Publication date: 11/04/2023
Field of study

We investigate graph transformations, defined using Datalog-like rules based on acyclic conjunctive two-way regular path queries (acyclic C2RPQs), and we study two fundamental static analysis problems: type checking and equivalence of transformations in the presence of graph schemas. Additionally, we investigate the problem of target schema elicitation, which aims to construct a schema that closely captures all outputs of a transformation over graphs conforming to the input schema. We show all these problems are in EXPTIME by reducing them to C2RPQ containment modulo schema; we also provide matching lower bounds. We use cycle reversing to reduce query containment to the problem of unrestricted (finite or infinite) satisfiability of C2RPQs modulo a theory expressed in a description logic

arXiv.org e-Print Archive

: Méthodes d'Inférence Symbolique pour les Bases de Données

Author: Staworko Slawomir
Publication venue: HAL CCSD
Publication date: 14/12/2015
Field of study

This dissertation is a summary of a line of research, that I wasactively involved in, on learning in databases from examples. Thisresearch focused on traditional as well as novel database models andlanguages for querying, transforming, and describing the schema of adatabase. In case of schemas our contributions involve proposing anoriginal languages for the emerging data models of Unordered XML andRDF. We have studied learning from examples of schemas for UnorderedXML, schemas for RDF, twig queries for XML, join queries forrelational databases, and XML transformations defined with a novelmodel of tree-to-word transducers.Investigating learnability of the proposed languages required us toexamine closely a number of their fundamental properties, often ofindependent interest, including normal forms, minimization,containment and equivalence, consistency of a set of examples, andfinite characterizability. Good understanding of these propertiesallowed us to devise learning algorithms that explore a possibly largesearch space with the help of a diligently designed set ofgeneralization operations in search of an appropriate solution.Learning (or inference) is a problem that has two parameters: theprecise class of languages we wish to infer and the type of input thatthe user can provide. We focused on the setting where the user inputconsists of positive examples i.e., elements that belong to the goallanguage, and negative examples i.e., elements that do not belong tothe goal language. In general using both negative and positiveexamples allows to learn richer classes of goal languages than usingpositive examples alone. However, using negative examples is oftendifficult because together with positive examples they may cause thesearch space to take a very complex shape and its exploration may turnout to be computationally challenging.Ce mémoire est une courte présentation d’une direction de recherche, à laquelle j’ai activementparticipé, sur l’apprentissage pour les bases de données à partir d’exemples. Cette recherches’est concentrée sur les modèles et les langages, aussi bien traditionnels qu’émergents, pourl’interrogation, la transformation et la description du schéma d’une base de données. Concernantles schémas, nos contributions consistent en plusieurs langages de schémas pour les nouveaumodèles de bases de données que sont XML non-ordonné et RDF. Nous avons ainsi étudiél’apprentissage à partir d’exemples des schémas pour XML non-ordonné, des schémas pour RDF,des requêtes twig pour XML, les requêtes de jointure pour bases de données relationnelles et lestransformations XML définies par un nouveau modèle de transducteurs arbre-à-mot.Pour explorer si les langages proposés peuvent être appris, nous avons été obligés d’examinerde près un certain nombre de leurs propriétés fondamentales, souvent souvent intéressantespar elles-mêmes, y compris les formes normales, la minimisation, l’inclusion et l’équivalence, lacohérence d’un ensemble d’exemples et la caractérisation finie. Une bonne compréhension de cespropriétés nous a permis de concevoir des algorithmes d’apprentissage qui explorent un espace derecherche potentiellement très vaste grâce à un ensemble d’opérations de généralisation adapté àla recherche d’une solution appropriée.L’apprentissage (ou l’inférence) est un problème à deux paramètres : la classe précise delangage que nous souhaitons inférer et le type d’informations que l’utilisateur peut fournir. Nousnous sommes placés dans le cas où l’utilisateur fournit des exemples positifs, c’est-à-dire deséléments qui appartiennent au langage cible, ainsi que des exemples négatifs, c’est-à-dire qui n’enfont pas partie. En général l’utilisation à la fois d’exemples positifs et négatifs permet d’apprendredes classes de langages plus riches que l’utilisation uniquement d’exemples positifs. Toutefois,l’utilisation des exemples négatifs est souvent difficile parce que les exemples positifs et négatifspeuvent rendre la forme de l’espace de recherche très complexe, et par conséquent, son explorationinfaisable

Thèses en Ligne

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Programming Using Automata and Transducers

Author: D\u27antoni Loris
Publication venue: ScholarlyCommons
Publication date: 01/01/2015
Field of study

Automata, the simplest model of computation, have proven to be an effective tool in reasoning about programs that operate over strings. Transducers augment automata to produce outputs and have been used to model string and tree transformations such as natural language translations. The success of these models is primarily due to their closure properties and decidable procedures, but good properties come at the price of limited expressiveness. Concretely, most models only support finite alphabets and can only represent small classes of languages and transformations. We focus on addressing these limitations and bridge the gap between the theory of automata and transducers and complex real-world applications: Can we extend automata and transducer models to operate over structured and infinite alphabets? Can we design languages that hide the complexity of these formalisms? Can we define executable models that can process the input efficiently? First, we introduce succinct models of transducers that can operate over large alphabets and design BEX, a language for analysing string coders. We use BEX to prove the correctness of UTF and BASE64 encoders and decoders. Next, we develop a theory of tree transducers over infinite alphabets and design FAST, a language for analysing tree-manipulating programs. We use FAST to detect vulnerabilities in HTML sanitizers, check whether augmented reality taggers conflict, and optimize and analyze functional programs that operate over lists and trees. Finally, we focus on laying the foundations of stream processing of hierarchical data such as XML files and program traces. We introduce two new efficient and executable models that can process the input in a left-to-right linear pass: symbolic visibly pushdown automata and streaming tree transducers. Symbolic visibly pushdown automata are closed under Boolean operations and can specify and efficiently monitor complex properties for hierarchical structures over infinite alphabets. Streaming tree transducers can express and efficiently process complex XML transformations while enjoying decidable procedures

CiteSeerX

ScholarlyCommons@Penn

XQTC: A Static Type-Checker for XQuery Using Backward Type Inference

Author: Genevès Pierre
Layaïda Nabil
Vanoirbeek Christine
Publication venue: HAL CCSD
Publication date: 01/11/2012
Field of study

We present a novel technique and a tool for static type-checking of XQuery programs. The tool looks for errors in the program by jointly analyzing the source code of the program, input and output schemas that respectively describe the sets of documents admissible as input and as output of the program. The crux and the novelty of our results reside in the joint use of backward type inference and a two-way logic to represent inferred tree type portions. This allowed us to design and implement a type-checker for XQuery which is more precise and supports a larger XQuery fragment compared to the approaches previously proposed in the literature; in particular compared to the only few actually implemented static type-checkers such as the one in Galax. The whole system uses compilers and a satisfiability solver for deciding containment for two-way regular tree expressions. Our tool takes an XQuery program and two schemas Sin and Sout as input. If the program is found incorrect, then it automatically generates a counter-example valid w.r.t. Sin and such that the program produces an invalid output w.r.t Sout. This counter-example can be used by the programmer to fix the program.Nous présentons une technique nouvelle et un outil pour le contrôle de type statique des programmes XQuery. L'outil recherche les erreurs dans le programme en analysant à la fois le code source du programme et les schémas d'entrée et de sortie qui décrivent respectivement les ensembles de documents admissibles en entrée et en sortie. L'originalité de nos résultats réside dans l'utilisation conjointe de l'inférence de type arrière et d'une logique avec programmes inverses pour représenter des fragments de types d'arbre. Cela nous a permis de concevoir et de réaliser un contrôleur de type pour XQuery qui est plus précis et supporte un fragment de XQuery plus large que les approches proposées précédemment dans la littérature, en particulier si on se réfère aux quelques contrôleurs de type statiques effectivement réalisés, tel que celui de Galax. L'ensemble du système utilise des compilateurs et un solveur pour décider de l'inclusion des expressions d'arbres régulières bi-directionnelles. Notre outil prend en entrée un programme XQuery et deux schémas Sin et Sout. Si le programme est reconnu incorrect, l'outil engendre automatiquement un contre-exemple valide vis-à-vis de Sin et tel que le programme produise un résultat invalide vis-à-vis de Sout. Ce contre-exemple peut alors être utilisé par le programmeur pour corriger son programme

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1