34 research outputs found

    XML documents schema design

    Get PDF
    The eXtensible Markup Language (XML) is fast emerging as the dominant standard for storing, describing, and interchanging data among various systems and databases on the internet. It offers schema languages such as the Document Type Definition (DTD) and XML Schema Definition (XSD) for defining the syntax and structure of XML documents. To enable efficient use of XML documents in any application in a large-scale electronic environment, it is necessary to avoid data redundancies and update anomalies. Redundancy and anomalies in XML documents lead not only to higher data storage costs but also to increased costs for data transfer and data manipulation. To overcome this problem, this thesis proposes to establish a formal framework for XML document schema design. To achieve this aim, we propose a method to improve and simplify XML schema design by incorporating a conceptual model of the DTD with a theory of database normalization. A conceptual diagram, the Graph-Document Type Definition (G-DTD), is proposed to describe the structure of XML documents at the schema level. For the G-DTD itself, we define a structure which incorporates attributes, simple elements, complex elements, and the relationship types among them. Furthermore, semantic constraints are precisely defined in order to capture the semantic meaning of the defined XML objects. In addition, to provide a guideline towards a well-designed schema for XML documents, we propose a set of normal forms for G-DTD on the basis of rules proposed by Arenas and Libkin and by Lv et al. The corresponding normalization rules to transform a G-DTD into a normal-form schema are also discussed. A case study is given to illustrate the applicability of the concept. As a result, we found that the new normal forms are more concise and practical, in particular as they allow the user to find an 'optimal' structure of XML elements/attributes at the schema level.
To prove that our approach is applicable for the database designer, we develop a prototype of the XML document schema design method using the Z formal specification language. Finally, using the same case study, this formal specification is tested to check the correctness and consistency of the specification. This gives confidence that our prototype can be implemented successfully to generate an automatic XML schema design.
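The redundancy problem that motivates the normal forms above can be seen in a small, hypothetical example (the course/student/lecturer elements below are illustrative, not taken from the thesis): when a value functionally determined by its parent is repeated inside every child, a normalized schema lifts it up so it is stored once.

```python
import xml.etree.ElementTree as ET

# Redundant structure: the lecturer's name is repeated inside every
# <student> of the same course, even though course -> lecturer holds.
redundant = ET.fromstring("""
<course code="CS101">
  <student id="s1"><name>Ann</name><lecturer>Dr. Lee</lecturer></student>
  <student id="s2"><name>Bob</name><lecturer>Dr. Lee</lecturer></student>
</course>
""")

# Normalized structure: the dependency is honoured by storing the
# lecturer once at the course level.
normalized = ET.fromstring("""
<course code="CS101">
  <lecturer>Dr. Lee</lecturer>
  <student id="s1"><name>Ann</name></student>
  <student id="s2"><name>Bob</name></student>
</course>
""")

def lecturer_values(root):
    """Collect every stored <lecturer> value in document order."""
    return [e.text for e in root.iter("lecturer")]

print(lecturer_values(redundant))   # ['Dr. Lee', 'Dr. Lee']
print(lecturer_values(normalized))  # ['Dr. Lee']
```

Both documents carry the same information, but the first stores the lecturer value once per student, which is exactly the kind of repetition that raises storage and update costs.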

    Physical Design for Non-relational Data Systems

    Get PDF
    Decades of research have gone into the optimization of physical designs, query execution, and related tools for relational databases. These techniques and tools make it possible for non-expert users to make effective use of relational database management systems. However, the drive for flexible data models and increased scalability has spawned a new generation of data management systems which largely eschew the relational model. These include systems such as NoSQL databases and distributed analytics frameworks such as Apache Spark which make use of a diverse set of data models. Optimization techniques and tools developed for relational data do not directly apply in this setting. This leaves developers making use of these systems with the need to become intimately familiar with system details to obtain good performance. We present techniques and tools for physical design for non-relational data systems. We explore two settings: NoSQL database systems and distributed analytics frameworks. While NoSQL databases often avoid explicit schema definitions, many choices on how to structure data remain. These choices can have a significant impact on application performance. The data structuring process normally requires expert knowledge of the underlying database. We present the NoSQL Schema Evaluator (NoSE). Given a target workload, NoSE provides an optimized physical design for NoSQL database applications which compares favourably to schemas designed by expert users. To enable existing applications to benefit from conceptual modeling, we also present an algorithm to recover a logical model from a denormalized database instance. Our second setting is distributed analytics frameworks such as Apache Spark. As is the case for NoSQL databases, expert knowledge of Spark is often required to construct efficient data pipelines. In NoSQL systems, a key challenge is how to structure stored data, while in Spark, a key challenge is how to cache intermediate results. 
We examine a particularly common scenario in Spark which involves performing iterative analysis on an input dataset. We show that jobs written in an intuitive manner using existing Spark APIs can have poor performance. We propose ReSpark, which automates caching decisions for iterative Spark analyses. Like NoSE, ReSpark makes it possible for non-expert users to obtain good performance from a non-relational data system.
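The caching problem described above can be sketched in plain Python (this is an illustration of the idea, not Spark or ReSpark itself; `expensive_transform` is a hypothetical stand-in for a costly pipeline stage):

```python
recomputations = 0

def expensive_transform(data):
    """Stand-in for a costly stage, e.g. a parsed and filtered dataset."""
    global recomputations
    recomputations += 1
    return [x * x for x in data]

data = [1, 2, 3, 4]

# Uncached: each iteration re-runs the transform, mirroring how Spark
# re-evaluates an unpersisted lineage every time an action touches it.
for _ in range(3):
    total = sum(expensive_transform(data))
uncached_runs = recomputations

# Cached: materialize the intermediate once and reuse it, analogous to
# persisting it in Spark before the iterative loop.
recomputations = 0
intermediate = expensive_transform(data)
for _ in range(3):
    total = sum(intermediate)
cached_runs = recomputations

print(uncached_runs, cached_runs)  # 3 1
```

Deciding which intermediates to materialize, and when, is the choice ReSpark is described as automating.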

    Computer data base assessment of masonry bridges.

    Get PDF
    SIGLE record. Available from the British Library Document Supply Centre (BLDSC), United Kingdom - DSC:D81974.

    Design and implementation of inventory databases.

    Get PDF
    http://archive.org/details/designimplementa00sariNAN

    An Introduction to Database Systems

    Get PDF
    This textbook introduces the basic concepts of database systems. These concepts are presented through numerous examples in modeling and design. The material in this book is geared to an introductory course in database systems offered at the junior or senior level of Computer Science. It could also be used in a first-year graduate course in database systems, focusing on a selection of the advanced topics in the latter chapters.

    The development of the mathematical department of the Educational Times from 1847 to 1862.

    Get PDF
    Mathematics held an important place in the first twelve years of the Educational Times (1847-1923), and in November 1848 a department of mathematical questions and solutions was launched. In 1864 this department was reprinted in a daughter journal: Mathematical Questions with Their Solutions from The Educational Times (MQ). This thesis concentrates on the development of the department from its inception until 1862, when William John Clarke Miller became its editor; the development is considered in terms of the editors, contributors, and mathematics. To facilitate this research, a source-oriented database using KLEIO (kleio) software was constructed. It contains data taken from the questions and solutions and also miscellaneous items from the journal. Database analysis was used in conjunction with traditional archival sources; for example, the respective, previously unknown correspondence of two of the main contributors, Thomas Turner Wilkinson and Miller. The development of the department fell into two main periods: the early 1850s, when it was edited by Richard Wilson and then James Wharton and had an educational bias; and the late 1850s, when it was dominated by Miller and Stephen Watson, who contributed moderately complex problems of a reasonably high standard on conic sections, probability, and number theory. In 1850 Miller started contributing with a group of pupils and masters, including Robert Harley, from the Dissenters' College, Taunton. Another group of contributors which emerged was one of northern geometers, with whom Wilkinson was connected. He collaborated with Thomas Stephens Davies on geometry, and this influenced his contributions to the department. Miller edited the department from 1862 to 1897 and MQ from 1863 to 1897, and made MQ an international journal of renown for its original research. It contained contributions from some of the most eminent national and international mathematicians, including Cayley, Sylvester, Hirst and Clifford. 
The start of this new phase is briefly introduced and reviewed.

    To Heck With Ethics: Thinking About Public Issues With a Framework for CS Students

    Get PDF
    This paper proposes that the ethics class in the CS curriculum incorporate the Lawrence Lessig model of regulation as an analytical tool for social issues. Lessig’s use of the notion of architecture, the rules and boundaries of the sometimes artificial world within which social issues play out, is particularly resonant with computing professionals. The CS curriculum guidelines include only ethical frameworks as the tool for our students to engage with societal issues. The regulation framework shows how the market, law, social norms, and architecture can all be applied toward understanding social issues.

    The design considerations and development of a simulator for the backtesting of investment strategies

    Get PDF
    The skill of accurately predicting the optimal time to buy or sell shares on the stock market is one that has been actively sought by both experienced and novice investors since the advent of the stock exchange. Since then, the finance industry has employed a plethora of techniques to improve the predictive power of the investor. This thesis investigates one of those techniques and its advancement through the use of computational power. The technique of portfolio strategy backtesting as a vehicle to achieve improved predictive power is one that has existed within financial services for decades. Portfolio backtesting, as alluded to by its name, is the empirical testing of an investment strategy to determine how the strategy would have performed historically, with the view that past performance may be indicative of future performance.
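The core backtesting loop described above can be sketched in a few lines (the price series and the trailing-average rule are hypothetical examples, not a strategy from the thesis): replay historical prices, generate each day's signal from past data only, and compound the returns of the days on which the strategy holds the asset.

```python
# Hypothetical daily closing prices for one asset.
prices = [100, 102, 101, 105, 107, 106, 110]

def backtest(prices, window=3):
    """Hold on day t whenever yesterday's price exceeded its trailing
    `window`-day average; return the final equity from an initial 1.0."""
    equity = 1.0
    for t in range(window, len(prices)):
        trailing_avg = sum(prices[t - window:t]) / window
        if prices[t - 1] > trailing_avg:          # signal uses past data only
            equity *= prices[t] / prices[t - 1]   # hold through day t
    return equity

print(round(backtest(prices), 4))  # 1.0095
```

Keeping the signal computation strictly on data available before day t is what makes the replay an honest estimate of historical performance; letting future prices leak into the signal is the classic backtesting pitfall.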

    The semantic database model as a basis for an automated database design tool

    Get PDF
    Bibliography: p. 257-280. The automatic database design system is a design aid for network database creation. It obtains a requirements specification from a user and generates a prototype database. This database is compatible with the Data Definition Language of DMS 1100, the database system on the Univac 1108 at the University of Cape Town. The user interface has been constructed in such a way that a computer-naive user can submit a description of his organisation to the system. Thus it constitutes a powerful database design tool, which should greatly alleviate the designer's tasks of communicating with users and of creating an initial database definition. The requirements are formulated using the semantic database model, and semantic information in this model is incorporated into the database as integrity constraints. A relation scheme is also generated from the specification. As a result of this research, insight has been gained into the advantages and shortcomings of the semantic database model, and some principles for 'good' data models and database design methodologies have emerged.
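The relation-scheme generation step such a tool performs can be sketched as follows (the `Employee` entity and its attributes are a made-up example, and this is only one simple mapping convention, not the tool's actual algorithm): an entity type from the semantic model becomes a relation scheme, with its identifying attribute as the key and each many-to-one relationship contributing a foreign-key attribute.

```python
# Hypothetical entity description in a semantic-model style.
entity = {
    "name": "Employee",
    "key": "emp_no",
    "attributes": ["emp_no", "surname", "salary"],
    # (relationship name, target entity, foreign-key attribute)
    "relationships": [("works_in", "Department", "dept_no")],
}

def relation_scheme(entity):
    """Map one entity type to a textual relation scheme."""
    cols = list(entity["attributes"])
    # Each many-to-one relationship adds a foreign-key column.
    cols += [fk for (_, _, fk) in entity["relationships"]]
    return "{}({})  key: {}".format(entity["name"], ", ".join(cols), entity["key"])

print(relation_scheme(entity))
# Employee(emp_no, surname, salary, dept_no)  key: emp_no
```

A real tool would additionally emit the semantic constraints (e.g. the referential link from `dept_no` to `Department`) as integrity constraints in the target DDL.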

    A Survey on Mapping Semi-Structured Data and Graph Data to Relational Data

    Get PDF
    The data produced by various services should be stored and managed in an appropriate format for gaining valuable knowledge conveniently. This has led to the emergence of various data models, including relational, semi-structured, and graph models. Considering the fact that the mature relational databases established on relational data models are still predominant in today's market, there is strong interest in storing and processing semi-structured data and graph data in relational databases, so that mature and powerful relational databases' capabilities can be applied to these various kinds of data. In this survey, we review existing methods for mapping semi-structured data and graph data into relational tables, analyze their major features, and give a detailed classification of those methods. We also summarize the merits and demerits of each method, introduce open research challenges, and present future research directions. With this comprehensive investigation of existing methods and open problems, we hope this survey can motivate new mapping approaches through drawing lessons from each model's mapping strategies, as well as a new research topic: mapping multi-model data into relational tables. Peer reviewed.
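One classic family of mappings the survey covers can be illustrated with a small sketch (the document and the node/edge table layout are illustrative assumptions, not a specific method from the survey): "shredding" a semi-structured tree into a node table and an edge table, so a relational engine can store and traverse it.

```python
# Hypothetical JSON-like document to be shredded.
doc = {"title": "Survey", "authors": [{"name": "A"}, {"name": "B"}]}

def shred(value, nodes, edges, parent=None, label="root"):
    """Flatten a nested value into two relational-style tables:
    nodes(id, label, atomic_value) and edges(parent_id, child_id)."""
    nid = len(nodes)
    if isinstance(value, dict):
        nodes.append((nid, label, None))
        for key, child in value.items():
            shred(child, nodes, edges, nid, key)
    elif isinstance(value, list):
        nodes.append((nid, label, None))
        for item in value:
            shred(item, nodes, edges, nid, label + "_item")
    else:
        nodes.append((nid, label, value))   # leaf row stores the atomic value
    if parent is not None:
        edges.append((parent, nid))
    return nodes, edges

nodes, edges = shred(doc, [], [])
print(len(nodes), len(edges))  # 7 6
```

Once shredded, both tables can live in an ordinary RDBMS, and path or reachability queries over the original tree become joins (or recursive queries) over the edge table; the trade-offs among such schemes are exactly what the surveyed methods differ on.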