78 research outputs found

    Tractable XML data exchange via relations

    Get PDF
    We consider data exchange for XML documents: given source and target schemas, a mapping between them, and a document conforming to the source schema, construct a target document and answer target queries in a way that is consistent with source information. The problem has primarily been studied in the relational context, in which data-exchange systems have also been built. Since many XML documents are stored in relations, it is natural to consider using a relational system for XML data exchange. However, there is a complexity mismatch between query answering in relational and XML data exchange, which indicates that restrictions have to be imposed on XML schemas and mappings, and on XML shredding schemes, to make the use of relational systems possible. We isolate a set of five requirements that must be fulfilled in order to have a faithful representation of the XML data-exchange problem by a relational translation. We then demonstrate that these requirements naturally suggest the inlining technique for dataexchange tasks. Our key contribution is to provide shredding algorithms for schemas, documents, mappings and queries, and demonstrate that they enable us to correctly perform XML data-exchange tasks using a relational system

    XML data exchange:Consistency and query answering

    Get PDF
    Data exchange is the problem of finding an instance of a target schema, given an instance of a source schema and a specification of the relationship between the source and the target. Theoretical foundations of data exchange have recently been investigated for relational data. In this article, we start looking into the basic properties of XML data exchange, that is, restructuring of XML documents that conform to a source DTD under a target DTD, and answering queries written over the target schema. We define XML data exchange settings in which source-to-target dependencies refer to the hierarchical structure of the data. Combining DTDs and dependencies makes some XML data exchange settings inconsistent. We investigate the consistency problem and determine its exact complexity. We then move to query answering, and prove a dichotomy theorem that classifies data exchange settings into those over which query answering is tractable, and those over which it is coNP-complete, depending on classes of regular expressions used in DTDs. Furthermore, for all tractable cases we give polynomial-time algorithms that compute target XML documents over which queries can be answered

    XML documents schema design

    Get PDF
    The eXtensible Markup Language (XML) is fast emerging as the dominant standard for storing, describing and interchanging data among various systems and databases on the intemet. It offers schema such as Document Type Definition (DTD) or XML Schema Definition (XSD) for defining the syntax and structure of XML documents. To enable efficient usage of XML documents in any application in large scale electronic environment, it is necessary to avoid data redundancies and update anomalies. Redundancy and anomalies in XML documents can lead not only to higher data storage cost but also to increased costs for data transfer and data manipulation.To overcome this problem, this thesis proposes to establish a formal framework of XML document schema design. To achieve this aim, we propose a method to improve and simplify XML schema design by incorporating a conceptual model of the DTD with a theory of database normalization. A conceptual diagram, Graph-Document Type Definition (G-DTD) is proposed to describe the structure of XML documents at the schema level. For G- DTD itself, we define a structure which incorporates attributes, simple elements, complex elements, and relationship types among them. Furthermore, semantic constraints are also precisely defined in order to capture semantic meanings among the defined XML objects.In addition, to provide a guideline to a well-designed schema for XML documents, we propose a set of normal forms for G-DTD on the basis of rules proposed by Arenas and Libkin and Lv. et al. The corresponding normalization rules to transform from a G- DTD into a normal form schema are also discussed. A case study is given to illustrate the applicability of the concept. As a result, we found that the new normal forms are more concise and practical, in particular as they allow the user to find an 'optimal' structure of XML elements/attributes at the schema level. To prove that our approach is applicable for the database designer, we develop a prototype of XML document schema design using a Z formal specification language. Finally, using the same case study, this formal specification is tested to check for correctness and consistency of the specification. Thus, this gives a confidence that our prototype can be implemented successfully to generate an automatic XML schema design

    XQuery containment in presence of variable binding dependencies

    Full text link

    Heterogeneous verification of model transformations

    Get PDF
    Esta tesis trata sobre la verificación formal en el contexto de la Ingeniería Dirigida por Modelos (MDE por sus siglas en inglés). El paradigma propone un ciclo de vida de la ingeniería de software basado en una abstracción de su complejidad a través de la definición de modelos y en un proceso de construcción (semi)automático guiado por transformaciones de estos modelos. Nuestro propósito es abordar la verificación de transformaciones de modelos la cual incluye, por extensión, la verificación de sus modelos. Comenzamos analizando la literatura relacionada con la verificación de transformaciones de modelos para concluir que la heterogeneidad de las propiedades que interesa verificar y de los enfoques para hacerlo, sugiere la necesidad de utilizar diversos dominios lógicos, lo cual es la base de nuestra propuesta. En algunos casos puede ser necesario realizar una verificación heterogénea, es decir, utilizar diferentes formalismos para la verificación de cada una de las partes del problema completo. Además, es beneficioso permitir a los expertos formales elegir el dominio en el que se encuentran más capacitados para llevar a cabo una prueba formal. El principal problema reside en que el mantenimiento de múltiples representaciones formales de los elementos de MDE en diferentes dominios lógicos, puede ser costoso si no existe soporte automático o una relación formal clara entre estas representaciones. Motivados por esto, definimos un entorno unificado que permite la verificación formal transformaciones de modelos mediante el uso de métodos de verificación heterogéneos, de forma tal que es posible automatizar la traducción formal de los elementos de MDE entre dominios logicos. Nos basamos formalmente en la Teoría de Instituciones, la cual proporciona una base sólida para la representación de los elementos de MDE (a través de instituciones) sin depender de ningúningún dominio lógico específico. También proporciona una forma de especificar traducciones (a través de comorfismos) que preservan la semántica entre estos elementos y otros dominios lógicos. Nos basamos en estándares para la especificación de los elementos de MDE. De hecho, definimos una institución para la buena formación de los modelos especificada con una versión simplificada del MetaObject Facility y otra institución para transformaciones utilizando Query/View/Transformation Relations. No obstante, la idea puede ser generalizada a otros enfoques de transformación y lenguajes.Por último, demostramos la viabilidad del entorno mediante el desarrollo de un prototipo funcional soportado por el Heterogeneous Tool Set (HETS). HETS permite realizar una especificación heterogénea y provee facilidades para el monitoreo de su corrección global. Los elementos de MDE se conectan con otras lógicas ya soportadas en HETS (por ejemplo: lógica de primer orden, lógica modal, entre otras) a través del Common Algebraic Specification Language (CASL). Esta conexión se expresa teóricamente mediante comorfismos desde las instituciones de MDE a la institución subyacente en CASL. Finalmente, discutimos las principales contribuciones de la tesis. Esto deriva en futuras líneas de investigación que contribuyen a la adopción de métodos formales para la verificación en el contexto de MDE.This thesis is about formal verification in the context of the Model-Driven Engineering (MDE) paradigm. The paradigm proposes a software engineering life-cycle based on an abstraction from its complexity by defining models, and on a (semi)automatic construction process driven by model transformations. Our purpose is to address the verification of model transformations which includes, by extension, the verification of their models. We first review the literature on the verification of model transformations to conclude that the heterogeneity we find in the properties of interest to verify, and in the verification approaches, suggests the need of using different logical domains, which is the base of our proposal. In some cases it can be necessary to perform a heterogeneous verification, i.e. using different formalisms for the verification of each part of the whole problem. Moreover, it is useful to allow formal experts to choose the domain in which they are more skilled to address a formal proof. The main problem is that the maintenance of multiple formal representations of the MDE elements in different logical domains, can be expensive if there is no automated assistance or a clear formal relation between these representations. Motivated by this, we define a unified environment that allows formal verification of model transformations using heterogeneous verification approaches, in such a way that the formal translations of the MDE elements between logical domains can be automated. We formally base the environment on the Theory of Institutions, which provides a sound basis for representing MDE elements (as so called institutions) without depending on any specific logical domain. It also provides a way for specifying semantic-preserving translations (as so called comorphisms) from these elements to other logical domains. We use standards for the specification of the MDE elements. In fact, we define an institution for the well-formedness of models specified with a simplified version of the MetaObject Facility, and another institution for Query/View/Transformation Relations transformations. However, the idea can be generalized to other transformation approaches and languages. Finally, we evidence the feasibility of the environment by the development of a functional prototype supported by the Heterogeneous Tool Set (HETS). HETS supports heterogeneous specifications and provides capabilities for monitoring their overall correctness. The MDE elements are connected to the other logics already supported in HETS (e.g. first-order logic, modal logic, among others) through the Common Algebraic Specification Language (CASL). This connection is defined by means of comorphisms from the MDE institutions to the underlying institution of CASL. We carry out a final discussion of the main contributions of this thesis. This results in future research directions which contribute with the adoption of formal tools for the verification in the context of MDE

    XML data exchange under expressive mappings

    Get PDF
    Data Exchange is the problem of transforming data in one format (the source schema) into data in another format (the target schema). Its core component is a schema mapping, which is a high level specification of how such transformation should be done. Relational data exchange has been extensively studied, but exchanging XML data have been paid much less attention. The goal of this thesis is to develop a theory of XML data exchange with expressive schema mappings, extending a previous work using restricted mappings. Our mapping language is based on tree patterns that can use horizontal navigation and data comparison in addition to downward navigation. First we look at static analysis problems concerning a single mapping. More specif- ically, we consider consistency problems with different flavours. One such problem, for instance, asks if any tree has a solution under the given mapping. Then we turn to analyse the complexity of mapping themselves, i.e., recognising pairs of trees such that the one is mapped to the other. For both problems, we provide classifications based on sets of features used in the mappings. Second we investigate the composition of XML schema mappings. Generally it is hard, or rather simply impossible, to achieve closure under composition in XML settings unlike in relational settings. Nevertheless we identify a class of XML schema mappings that is closed under composition. Lastly we consider the problem of query answering. It is important to exchange data so that we can feasibly answer queries while it often leads to intractability. We identify the dividing line between tractable and intractable cases: answering queries with extended features is always intractable while tractability of answering simple queries can be retained in extended mappings

    Representing and Querying Incomplete Information: a Data Interoperability Perspective

    Get PDF
    This habilitation thesis presents some of my most recent work, which has been done in collaboration with several other people. In particular this thesis concentrates on our contributions to the study of incomplete information in the context of data interoperability. In this scenario data is heterogenous and decentralized, needs to be integrated from several sources and exchanged between different applications. Incompleteness, i.e. the presence of “missing” or “unknown” portions of data, is naturally generated in data exchange and integration, due to data heterogeneity. The management of incomplete information poses new challenges in this context.The focus of our study is the development of models of incomplete information suitable to data interoperability tasks, and the study of techniques for efficiently querying several forms of incompleteness

    A Functional, Comprehensive and Extensible Multi-Platform Querying and Transformation Approach

    Get PDF
    This thesis is about a new model querying and transformation approach called FunnyQT which is realized as a set of APIs and embedded domain-specific languages (DSLs) in the JVM-based functional Lisp-dialect Clojure. Founded on a powerful model management API, FunnyQT provides querying services such as comprehensions, quantified expressions, regular path expressions, logic-based, relational model querying, and pattern matching. On the transformation side, it supports the definition of unidirectional model-to-model transformations, of in-place transformations, it supports defining bidirectional transformations, and it supports a new kind of co-evolution transformations that allow for evolving a model together with its metamodel simultaneously. Several properties make FunnyQT unique. Foremost, it is just a Clojure library, thus, FunnyQT queries and transformations are Clojure programs. However, most higher-level services are provided as task-oriented embedded DSLs which use Clojure's powerful macro-system to support the user with tailor-made language constructs important for the task at hand. Since queries and transformations are just Clojure programs, they may use any Clojure or Java library for their own purpose, e.g., they may use some templating library for defining model-to-text transformations. Conversely, like every Clojure program, FunnyQT queries and transformations compile to normal JVM byte-code and can easily be called from other JVM languages. Furthermore, FunnyQT is platform-independent and designed with extensibility in mind. By default, it supports the Eclipse Modeling Framework and JGraLab, and support for other modeling frameworks can be added with minimal effort and without having to modify the respective framework's classes or FunnyQT itself. Lastly, because FunnyQT is embedded in a functional language, it has a functional emphasis itself. Every query and every transformation compiles to a function which can be passed around, given to higher-order functions, or be parametrized with other functions

    XPath satisfiability in the presence of DTDs

    Get PDF
    We study the satisfiability problem associated with XPath in the presence of DTDs. This is the problem of determining, given a query p in an XPath fragment and a DTD D, whether or not there exists an XML document T such that T conforms to D and the answer of p on T is nonempty. We consider a variety of XPath fragments widely used in practice, and investigate the impact of different XPath operators on the satisfiability analysis. We first study the problem for negation-free XPath fragments with and without upward axes, recursion and data-value joins, identifying which factors lead to tractability and which to NP-completeness. We then turn to fragments with negation but without data values, establishing lower and upper bounds in the absence and in the presence of upward modalities and recursion. We show that with negation the complexity ranges from PSPACE to EXPTIME. Moreover, when both data values and negation are in place, we find that the complexity ranges from NEXPTIME to undecidable. Furthermore, we give a finer analysis of the problem for particular classes of DTDs, exploring the impact of various DTD constructs, identifying tractable cases, as well as providing the complexity in the query size alone. Finally, we investigate the problem for XPath fragments with sibling axes, exploring the impact of horizontal modalities on the satisfiability analysis. © 2008 ACM
    corecore