4 research outputs found

    Characterization of XML Functional Dependencies and their Interaction with DTDs

    With the rise of XML as a standard model of data exchange, XML functional dependencies (XFDs) have become important to areas such as key analysis, document normalization, and data integrity. XFDs are more complicated than relational functional dependencies because the set of XFDs satisfied by an XML document depends not only on the document values, but also on the tree structure and the corresponding DTD. In particular, constraints imposed by DTDs may alter the implications from a base set of XFDs, and may even be inconsistent with a set of XFDs. In this paper we examine the interaction between XFDs and DTDs. We present a sound and complete axiomatization for XFDs, both alone and in the presence of certain classes of DTDs. We show that these DTD classes form an axiomatic hierarchy, with the axioms at each level a proper superset of those at the previous level. Furthermore, we show that consistency checking with respect to a set of XFDs is feasible for these same classes.

    On XML integrity constraints in the presence of DTDs

    The paper investigates XML document specifications with DTDs and integrity constraints, such as keys and foreign keys. We study the consistency problem of checking whether a given specification is meaningful: that is, whether there exists an XML document that both conforms to the DTD and satisfies the constraints. We show that DTDs interact with constraints in a highly intricate way and as a result, the consistency problem in general is undecidable. When it comes to unary keys and foreign keys, the consistency problem is shown to be NP-complete. This is done by coding DTDs and integrity constraints with linear constraints on the integers. We consider the variations of the problem (by both restricting and enlarging the class of constraints), and identify a number of tractable cases, as well as a number of additional NP-complete ones. By incorporating negations of constraints, we establish complexity bounds on the implication problem, which is shown to be coNP-complete for unary keys and foreign keys.
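
The idea of coding a DTD and its keys and foreign keys as linear constraints on integers can be illustrated with a minimal sketch. The specification below (teachers, subjects, and the attribute names `name` and `taught_by`) is a hypothetical example chosen for illustration, not one taken from this listing, and the bounded search is only a demonstration, not the decision procedure the abstract refers to.

```python
def find_witness(subjects_per_teacher, max_teachers=50):
    """Return the smallest teacher count satisfying all constraints, or None.

    Hypothetical specification (an assumption for illustration):
      DTD:  the root has one or more teacher elements, and each teacher
            has exactly `subjects_per_teacher` subject elements.
      Key:  teacher.name is unique among teachers.
      Key:  subject.taught_by is unique among subjects.
      Foreign key: every subject.taught_by value is some teacher.name.
    """
    for teachers in range(1, max_teachers + 1):
        subjects = subjects_per_teacher * teachers  # forced by the DTD
        distinct_names = teachers        # key: one distinct name per teacher
        distinct_taught_by = subjects    # key: one distinct value per subject
        # The foreign key embeds taught_by values into name values.
        if distinct_taught_by <= distinct_names:
            return teachers
    return None


print(find_witness(2))  # two subjects per teacher
print(find_witness(1))  # one subject per teacher
```

With two subjects per teacher the constraints reduce to 2n ≤ n with n ≥ 1, which has no solution, so no conforming document exists even though the DTD and the constraints are each satisfiable on their own. With one subject per teacher a single teacher already suffices.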

    Reducing End-User Burden in Everyday Data Organization.

    As digital data permeates every aspect of our daily life, more and more end-users are organizing their everyday data electronically. In fact, end-users are already used to managing personal data such as contact books and calendars on electronic devices. Meanwhile, the desire to organize more information on the computer is spreading to a broader group of users. For example, a scientist may need to regularly manage a substantial amount of scientific data on his desktop. However, organizing such everyday data is challenging for these end-users because they have limited knowledge of data schemas, which are key to data management tasks such as database design, data transformation, and data integration. While the user struggles with these schema tasks, various cognitive and operational burdens emerge. First, when designing her data collection, the user must abstract her mental model of her real-life data into a reasonable schema design. Moreover, when incorporating external data sources, she must understand the source semantics and transform the data from those sources into her own data collection. Meanwhile, if the user wants to filter the data, she must understand and specify the selection condition. Finally, when existing sources are updated, she must understand and fuse these updates. This dissertation introduces several approaches to help the end-user reduce these burdens. To ease the design pain, the dissertation proposes a system with a next-generation spreadsheet that lets the end-user easily design and evolve her schema. To facilitate the incorporation of external data sources, a sample-driven schema mapping approach is introduced so that the user can freely provide sample instances in her own collection and the system automatically deduces the desired schema mapping from the sources to the collection. In a similar vein, the dissertation proposes an approach that lets the user specify selection conditions via example data points she wants to select. Finally, to help the user incorporate source data updates into her data collection, the dissertation proposes a technique to incrementally update the integrated data using previous integration results.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/99778/1/eql_1.pd