380 research outputs found

    Functional dependencies over XML documents with DTDs

    Get PDF
    In this article an axiomatisation for functional dependencies over XML documents is presented. The approach is based on a representation of XML document type definitions (or XML schemata) by nested attributes using constructors for records, disjoint unions and lists, and a particular null value, which covers optionality. Infinite structures that may result from referencing attributes in XML are captured by rational trees. Using a partial order on nested attributes we obtain non-distributive Brouwer algebras. The operations of the Brouwer algebra are exploited in the soundness and completeness proofs for derivation rules for functional dependencies

    On Incomplete XML Documents with Integrity Constraints

    Get PDF
    Abstract. We consider incomplete specifications of XML documents in the presence of schema information and integrity constraints. We show that integrity constraints such as keys and foreign keys affect consistency of such specifications. We prove that the consistency problem for incomplete specifications with keys and foreign keys can always be solved in NP. We then show a dichotomy result, classifying the complexity of the problem as NP-complete or PTIME, depending on the precise set of features used in incomplete descriptions.

    Algorithms and implementation of functional dependency discovery in XML : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences in Information Systems at Massey University

    Get PDF
    1.1 Background Following the advent of the web, there has been a great demand for data interchange between applications using internet infrastructure. XML (extensible Markup Language) provides a structured representation of data empowered by broad adoption and easy deployment. As a subset of SGML (Standard Generalized Markup Language), XML has been standardized by the World Wide Web Consortium (W3C) [Bray et al., 2004], XML is becoming the prevalent data exchange format on the World Wide Web and increasingly significant in storing semi-structured data. After its initial release in 1996, it has evolved and been applied extensively in all fields where the exchange of structured documents in electronic form is required. As with the growing popularity of XML, the issue of functional dependency in XML has recently received well deserved attention. The driving force for the study of dependencies in XML is it is as crucial to XML schema design, as to relational database(RDB) design [Abiteboul et al., 1995]

    Characterization of XML Functional Dependencies and their Interaction with DTDs

    Full text link
    With the rise of XML as a standard model of data exchange, XML functional dependencies (XFDs) have become important to areas such as key analysis, document normalization, and data integrity. XFDs are more complicated than relational functional dependencies because the set of XFDs satisfied by an XML document depends not only on the document values, but also the tree structure and corresponding DTD. In particular, constraints imposed by DTDs may alter the implications from a base set of XFDs, and may even be inconsistent with a set of XFDs. In this paper we examine the interaction between XFDs and DTDs. We present a sound and complete axiomatization for XFDs, both alone and in the presence of certain classes of DTDs. We show that these DTD classes form an axiomatic hierarchy, with the axioms at each level a proper superset of the previous. Furthermore, we show that consistency checking with respect to a set of XFDs is feasible for these same classes

    XML documents schema design

    Get PDF
    The eXtensible Markup Language (XML) is fast emerging as the dominant standard for storing, describing and interchanging data among various systems and databases on the intemet. It offers schema such as Document Type Definition (DTD) or XML Schema Definition (XSD) for defining the syntax and structure of XML documents. To enable efficient usage of XML documents in any application in large scale electronic environment, it is necessary to avoid data redundancies and update anomalies. Redundancy and anomalies in XML documents can lead not only to higher data storage cost but also to increased costs for data transfer and data manipulation.To overcome this problem, this thesis proposes to establish a formal framework of XML document schema design. To achieve this aim, we propose a method to improve and simplify XML schema design by incorporating a conceptual model of the DTD with a theory of database normalization. A conceptual diagram, Graph-Document Type Definition (G-DTD) is proposed to describe the structure of XML documents at the schema level. For G- DTD itself, we define a structure which incorporates attributes, simple elements, complex elements, and relationship types among them. Furthermore, semantic constraints are also precisely defined in order to capture semantic meanings among the defined XML objects.In addition, to provide a guideline to a well-designed schema for XML documents, we propose a set of normal forms for G-DTD on the basis of rules proposed by Arenas and Libkin and Lv. et al. The corresponding normalization rules to transform from a G- DTD into a normal form schema are also discussed. A case study is given to illustrate the applicability of the concept. As a result, we found that the new normal forms are more concise and practical, in particular as they allow the user to find an 'optimal' structure of XML elements/attributes at the schema level. To prove that our approach is applicable for the database designer, we develop a prototype of XML document schema design using a Z formal specification language. Finally, using the same case study, this formal specification is tested to check for correctness and consistency of the specification. Thus, this gives a confidence that our prototype can be implemented successfully to generate an automatic XML schema design

    A hybrid logic for XML reference constraints

    Get PDF
    XML emerged as the (meta) mark-up language for representing, exchanging, and storing semistructured data. The structure of an XML document may be specified either through DTD (Document Type Definition) language or through the specific language XML Schema. While the expressiveness of XML Schema allows one to specify both the structure and constraints for XML documents, DTD does not allow the specification of integrity constraints for XML documents. On the other side, DTD has a very compact notation opposed to the complex notation and syntax of XML Schema. Thus, it becomes important to consider the issue of how to express further constraints on DTD-based XML documents, still retaining the simplicity and succinctness of DTDs. According to this scenario, in this paper we focus on a (as much as possible) simple logic, named XHyb, expressive enough to allow the specification of the most common integrity and reference constraints in XML documents. In particular, we focus on constraints on ID and IDREF(S) attributes, which are the common way of logically connecting parts of XML documents, besides the usual parent-child relationship of XML elements. Differently from other previously proposed hybrid logics, in XHyb IDREF(S) attributes are explicitly expressible by means of suitable syntactical constructors. Moreover, we propose a refinement of the usual graph representation of XML documents in order to represent XML documents in a formal and intuitive way without flatten accessibility through IDREF(S) to the usual parent-child relationship. Model checking algorithms are then proposed, to verify that a given XML document satisfies the considered constraints

    Generating Nested XML Documents with Dtd from Relational Views

    Get PDF
    Converting relational database into XML is increasing daily for publishing and exchanging data on the web. Most of the current approaches and tools for generating XML documents from relational database generate flat XML documents that contain data redundancy which leads to produce a massive data on the web. Other approaches assume that the relational database for generating nested XML documents is normalized. In addition, these approaches have problem that lies in the difficult of how to specify the parent elements from the children elements in the nested XML document. Moreover, most of the current approaches and tools do not generate nested XML documents automatically. They require the user to specify the constraints and the schema of the target document. This research proposes an approach to automatically generate nested XML documents from flat relational database views that are unnormalized. The research aims to reduce data redundancy and storage sizes for the generated XML documents. The proposed approach consists of three steps. The first step is converting flat relational view into nested relational view. The second is generating DTD from the nested relational view. The third is generating nested XML document from the nested relational view. The proposed approach is evaluated and compared to other approaches such as NeT, CoT, and Cost-Based and tools such as Allora, Altova, and DbToXml with respect to two measurements: data redundancy and storage size of the document. The first measurement includes several parameters that are number of data values, elements, attributes, and tags. Based on the results of comparing the proposed approach to several other approaches and tools, the proposed approach is more efficient for reducing data redundancy and storage size of XML documents. It can reduce data redundancy and storage size by approximately 50% and 55%, respectively

    XML document design via GN-DTD

    Get PDF
    Designing a well-structured XML document is important for the sake of readability and maintainability. More importantly, this will avoid data redundancies and update anomalies when maintaining a large quantity of XML based documents. In this paper, we propose a method to improve XML structural design by adopting graphical notations for Document Type Definitions (GN-DTD), which is used to describe the structure of an XML document at the schema level. Multiples levels of normal forms for GN-DTD are proposed on the basis of conceptual model approaches and theories of normalization. The normalization rules are applied to transform a poorly designed XML document into a well-designed based on normalized GN-DTD, which is illustrated through examples

    Transducer-Based Rewriting Games for Active XML

    Get PDF
    Context-free games are two-player rewriting games that are played on nested strings representing XML documents with embedded function symbols. These games were introduced to model rewriting processes for intensional documents in the Active XML framework, where input documents are to be rewritten into a given target schema by calls to external services. This paper studies the setting where dependencies between inputs and outputs of service calls are modelled by transducers, which has not been examined previously. It defines transducer models operating on nested words and studies their properties, as well as the computational complexity of the winning problem for transducer-based context-free games in several scenarios. While the complexity of this problem is quite high in most settings (ranging from NP-complete to undecidable), some tractable restrictions are also identified.Comment: Extended version of MFCS 2016 conference pape
    corecore