2,767 research outputs found

    A Call to Arms: Revisiting Database Design

    Get PDF
    Good database design is crucial to obtain a sound, consistent database, and - in turn - good database design methodologies are the best way to achieve the right design. These methodologies are taught to most Computer Science undergraduates, as part of any Introduction to Database class. They can be considered part of the "canon", and indeed, the overall approach to database design has been unchanged for years. Moreover, none of the major database research assessments identify database design as a strategic research direction. Should we conclude that database design is a solved problem? Our thesis is that database design remains a critical unsolved problem. Hence, it should be the subject of more research. Our starting point is the observation that traditional database design is not used in practice - and if it were used it would result in designs that are not well adapted to current environments. In short, database design has failed to keep up with the times. In this paper, we put forth arguments to support our viewpoint, analyze the root causes of this situation and suggest some avenues of research.Comment: Removed spurious column break. Nothing else was change

    Encoding databases satisfying a given set of dependencies

    Get PDF
    Consider a relation schema with a set of dependency constraints. A fundamental question is what is the minimum space where the possible instances of the schema can be "stored". We study the following model. Encode the instances by giving a function which maps the set of possible instances into the set of words of a given length over the binary alphabet in a decodable way. The problem is to find the minimum length needed. This minimum is called the information content of the database. We investigate several cases where the set of dependency constraints consist of relatively simple sets of functional or multivalued dependencies. We also consider the following natural extension. Is it possible to encode the instances such a way that small changes in the instance cause a small change in the code. © 2012 Springer-Verlag

    Twelve Theses on Reactive Rules for the Web

    Get PDF
    Reactivity, the ability to detect and react to events, is an essential functionality in many information systems. In particular, Web systems such as online marketplaces, adaptive (e.g., recommender) systems, and Web services, react to events such as Web page updates or data posted to a server. This article investigates issues of relevance in designing high-level programming languages dedicated to reactivity on the Web. It presents twelve theses on features desirable for a language of reactive rules tuned to programming Web and Semantic Web applications

    An Expressive Language and Efficient Execution System for Software Agents

    Full text link
    Software agents can be used to automate many of the tedious, time-consuming information processing tasks that humans currently have to complete manually. However, to do so, agent plans must be capable of representing the myriad of actions and control flows required to perform those tasks. In addition, since these tasks can require integrating multiple sources of remote information ? typically, a slow, I/O-bound process ? it is desirable to make execution as efficient as possible. To address both of these needs, we present a flexible software agent plan language and a highly parallel execution system that enable the efficient execution of expressive agent plans. The plan language allows complex tasks to be more easily expressed by providing a variety of operators for flexibly processing the data as well as supporting subplans (for modularity) and recursion (for indeterminate looping). The executor is based on a streaming dataflow model of execution to maximize the amount of operator and data parallelism possible at runtime. We have implemented both the language and executor in a system called THESEUS. Our results from testing THESEUS show that streaming dataflow execution can yield significant speedups over both traditional serial (von Neumann) as well as non-streaming dataflow-style execution that existing software and robot agent execution systems currently support. In addition, we show how plans written in the language we present can represent certain types of subtasks that cannot be accomplished using the languages supported by network query engines. Finally, we demonstrate that the increased expressivity of our plan language does not hamper performance; specifically, we show how data can be integrated from multiple remote sources just as efficiently using our architecture as is possible with a state-of-the-art streaming-dataflow network query engine

    Normalization Theory for XML

    Get PDF
    Abstract. Specifications of XML documents typically consist of typing information (e.g., a DTD), and integrity constraints. Just like relational schema specifications, not all are good – some are prone to redundancies and update anomalies. In the relational world we have a well-developed theory of data design (also known as normalization). A few definitions of XML normal forms have been proposed, but the main question is why a particular design is good. In the XML world, we still lack universally accepted query languages such as relational algebra, or update languages that let us reason about storage redundancies, lossless decompositions, and update anomalies. A better approach, therefore, is to come up with notions of good design based on the intrinsic properties of the model itself. We present such an approach, based on Shannon’s information theory, and show how it applies to relational normal forms as well as to XML design, for both native and relational storage.

    A 3d geoscience information system framework

    Get PDF
    Two-dimensional geographical information systems are extensively used in the geosciences to create and analyse maps. However, these systems are unable to represent the Earth's subsurface in three spatial dimensions. The objective of this thesis is to overcome this deficiency, to provide a general framework for a 3d geoscience information system (GIS), and to contribute to the public discussion about the development of an infrastructure for geological observation data, geomodels, and geoservices. Following the objective, the requirements for a 3d GIS are analysed. According to the requirements, new geologically sensible query functionality for geometrical, topological and geological properties has been developed and the integration of 3d geological modeling and data management system components in a generic framework has been accomplished. The 3d geoscience information system framework presented here is characterized by the following features: - Storage of geological observation data and geomodels in a XML-database server. According to a new data model, geological observation data can be referenced by a set of geomodels. - Functionality for querying observation data and 3d geomodels based on their 3d geometrical, topological, material, and geological properties were developed and implemented as plug-in for a 3d geomodeling user application. - For database queries, the standard XML query language has been extended with 3d spatial operators. The spatial database query operations are computed using a XML application server which has been developed for this specific purpose. This technology allows sophisticated 3d spatial and geological database queries. Using the developed methods, queries can be answered like: "Select all sandstone horizons which are intersected by the set of faults F". This request contains a topological and a geological material parameter. The combination of queries with other GIS methods, like visual and statistical analysis, allows geoscience investigations in a novel 3d GIS environment. More generally, a 3d GIS enables geologists to read and understand a 3d digital geomodel analogously as they read a conventional 2d geological map

    Static and dynamic semantics of NoSQL languages

    Get PDF
    We present a calculus for processing semistructured data that spans differences of application area among several novel query languages, broadly categorized as "NoSQL". This calculus lets users define their own operators, capturing a wider range of data processing capabilities, whilst providing a typing precision so far typical only of primitive hard-coded operators. The type inference algorithm is based on semantic type checking, resulting in type information that is both precise, and flexible enough to handle structured and semistructured data. We illustrate the use of this calculus by encoding a large fragment of Jaql, including operations and iterators over JSON, embedded SQL expressions, and co-grouping, and show how the encoding directly yields a typing discipline for Jaql as it is, namely without the addition of any type definition or type annotation in the code
    • 

    corecore