10,327 research outputs found

    A Method for Mapping XML DTD to Relational Schemas In The Presence Of Functional Dependencies

    The eXtensible Markup Language (XML) has recently emerged as a standard for data representation and interchange on the web. With so much XML data now on the web, there is pressure to manage it efficiently. Because relational databases are the most widely used technology for storing and managing XML, XML frequently has to be mapped to relations. Many mapping approaches exist in the literature, especially given the flexible nesting structures that XML allows. This gives rise to an important question: are some mappings 'better' than others? To approach this problem, the study draws on classical relational database design through normalization, which is based on the well-known concept of functional dependency. This concept is used to specify the constraints that may exist in the relations and to guide the design while removing semantic data redundancies, leading to a good normalized relational schema without data redundancy. To achieve a good normalized relational schema for XML, the concept of functional dependency in relations needs to be extended to XML and used as guidance for the design. Although functional dependency definitions for XML exist, they are not yet standard and still have several limitations. Because of these limitations, constraints in the presence of shared and local elements in an XML document cannot be specified. In this study, a new definition of functional dependency constraints for XML is proposed that is general enough to specify constraints and to discover semantic redundancies in XML documents. The focus of this study is on how to produce an optimal mapping approach in the presence of XML functional dependencies (XFD), keys and Document Type Definition (DTD) constraints, as guidance for generating a good relational schema.
    To approach the mapping problem, three components are explored: the mapping algorithm, functional dependency for XML, and the implication process. The study of XML implication is important for determining which other dependencies are guaranteed to hold in a relational representation of XML, given that a set of functional dependencies holds in the XML document. This requires deriving a set of inference rules for the implication process. In the presence of a DTD and user-defined XFDs, further XFDs that are guaranteed to hold in the XML can be generated using these inference rules. The mapping algorithm has been developed within a tool called XtoR. The quality of the mapping approach has been analyzed, and the results show that the approach (XtoR) significantly improves the generated relational schema for XML with respect to reducing data and relation redundancy, removing dangling relations and removing association problems. The findings suggest that if one wants to use an RDBMS to manage XML data, the mapping from XML documents to relations must be based on functional dependency constraints.
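
The implication step mentioned in this abstract has a well-known relational analogue: computing the closure of an attribute set under a set of functional dependencies. The sketch below shows only that classical procedure in Python. It is not the XFD inference rules or the XtoR algorithm from the thesis, and the attribute names (`course`, `teacher`, `room`) and helper names (`fd`, `closure`, `implies`) are hypothetical, chosen for illustration.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A functional dependency X -> Y, stored as (X, Y) with both sides as sets.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs: str, rhs: str) -> FD:
    """Build an FD from two space-separated attribute lists (hypothetical helper)."""
    return frozenset(lhs.split()), frozenset(rhs.split())


def closure(attrs: Iterable[str], fds: Iterable[FD]) -> Set[str]:
    """Return every attribute determined by `attrs` under `fds`."""
    result = set(attrs)
    fd_list = list(fds)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fd_list:
            # If the left side is already determined, the right side is too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds: Iterable[FD], candidate: FD) -> bool:
    """Check whether `candidate` follows from `fds` (relational case only)."""
    lhs, rhs = candidate
    return rhs <= closure(lhs, fds)


# Illustrative data, not taken from the thesis.
FDS = [fd("course", "teacher"), fd("teacher", "room")]
```

With these assumed dependencies, `implies(FDS, fd("course", "room"))` holds by transitivity, while `implies(FDS, fd("room", "course"))` does not. An XML version would additionally have to account for paths and DTD structure, which this sketch leaves out.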

    XML document design via GN-DTD

    Designing a well-structured XML document is important for readability and maintainability. More importantly, it avoids data redundancies and update anomalies when maintaining a large quantity of XML-based documents. In this paper, we propose a method to improve XML structural design by adopting graphical notations for Document Type Definitions (GN-DTD), which are used to describe the structure of an XML document at the schema level. Multiple levels of normal forms for GN-DTD are proposed on the basis of conceptual model approaches and theories of normalization. The normalization rules are applied to transform a poorly designed XML document into a well-designed one based on the normalized GN-DTD, which is illustrated through examples.

    Algorithms and implementation of functional dependency discovery in XML : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences in Information Systems at Massey University

    1.1 Background. Following the advent of the web, there has been a great demand for data interchange between applications using internet infrastructure. XML (eXtensible Markup Language) provides a structured representation of data, empowered by broad adoption and easy deployment. As a subset of SGML (Standard Generalized Markup Language), XML has been standardized by the World Wide Web Consortium (W3C) [Bray et al., 2004]. XML is becoming the prevalent data exchange format on the World Wide Web and is increasingly significant in storing semi-structured data. After its initial release in 1996, it has evolved and been applied extensively in all fields where the exchange of structured documents in electronic form is required. With the growing popularity of XML, the issue of functional dependency in XML has recently received well-deserved attention. The driving force for the study of dependencies in XML is that they are as crucial to XML schema design as to relational database (RDB) design [Abiteboul et al., 1995].

    Spud 1.0: generalising and automating the user interfaces of scientific computer models

    The interfaces by which users specify the scenarios to be simulated by scientific computer models are frequently primitive, under-documented and ad-hoc text files, which make using the model in question difficult and error-prone and significantly increase the development cost of the model. In this paper, we present a model-independent system, Spud, which formalises the specification of model input formats in terms of formal grammars. This is combined with an automated graphical user interface which guides users to create valid model inputs based on the grammar provided, and a generic options reading module, libspud, which minimises the development cost of adding model options. Together, this provides a user-friendly, well-documented, self-validating user interface which is applicable to a wide range of scientific models and which minimises the developer input required to maintain and extend the model interface.

    A Call to Arms: Revisiting Database Design

    Good database design is crucial to obtain a sound, consistent database, and, in turn, good database design methodologies are the best way to achieve the right design. These methodologies are taught to most Computer Science undergraduates, as part of any Introduction to Database class. They can be considered part of the "canon", and indeed, the overall approach to database design has been unchanged for years. Moreover, none of the major database research assessments identify database design as a strategic research direction. Should we conclude that database design is a solved problem? Our thesis is that database design remains a critical unsolved problem. Hence, it should be the subject of more research. Our starting point is the observation that traditional database design is not used in practice, and if it were used it would result in designs that are not well adapted to current environments. In short, database design has failed to keep up with the times. In this paper, we put forth arguments to support our viewpoint, analyze the root causes of this situation and suggest some avenues of research. Comment: Removed spurious column break. Nothing else was changed.

    Storing and Querying Probabilistic XML Using a Probabilistic Relational DBMS

    This work explores the feasibility of storing and querying probabilistic XML in a probabilistic relational database. Our approach is to adapt known techniques for mapping XML to relational data such that the possible worlds are preserved. We show that this approach can work for any XML-to-relational technique by adapting a representative schema-based technique (inlining) as well as a representative schemaless technique (XPath Accelerator). We investigate the maturity of probabilistic relational databases for this task through experiments with one of the state-of-the-art systems, called Trio.

    XML for Domain Viewpoints

    Within research institutions like CERN (European Organization for Nuclear Research) there are often disparate databases (different in format, type and structure) that users need to access in a domain-specific manner. Users may want to access a simple unit of information without having to understand the details of the underlying schema, or they may want to access the same information from several different sources. It is neither desirable nor feasible to require users to have knowledge of these schemas. Instead, it would be advantageous if a user could query these sources using his or her own domain models and abstractions of the data. This paper describes the basis of an XML (eXtensible Markup Language) framework that provides this functionality and is currently being developed at CERN. The goal of the first prototype was to explore the possibilities of XML for data integration and model management. It shows how XML can be used to integrate data sources. The framework is applicable not only to CERN data sources but to other environments too. Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference on Systemics & Informatics, Florida.