36,332 research outputs found

    Database queries and constraints via lifting problems

    Full text link
    Previous work has demonstrated that categories are useful and expressive models for databases. In the present paper we build on that model, showing that certain queries and constraints correspond to lifting problems, as found in modern approaches to algebraic topology. In our formulation, each so-called SPARQL graph pattern query corresponds to a category-theoretic lifting problem, whereby the set of solutions to the query is precisely the set of lifts. We interpret constraints within the same formalism and then investigate some basic properties of queries and constraints. In particular, to any database π\pi we can associate a certain derived database \Qry(\pi) of queries on π\pi. As an application, we explain how giving users access to certain parts of \Qry(\pi), rather than direct access to π\pi, improves ones ability to manage the impact of schema evolution

    Database Learning: Toward a Database that Becomes Smarter Every Time

    Full text link
    In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea---learning from past query answers---Database Learning. We exploit the principle of maximum entropy to produce answers, which are in expectation guaranteed to be more accurate than existing sample-based approximations. Empowered by this idea, we build a query engine on top of Spark SQL, called Verdict. We conduct extensive experiments on real-world query traces from a large customer of a major database vendor. Our results demonstrate that Verdict supports 73.7% of these queries, speeding them up by up to 23.0x for the same accuracy level compared to existing AQP systems.Comment: This manuscript is an extended report of the work published in ACM SIGMOD conference 201

    Functorial Data Migration

    Get PDF
    In this paper we present a simple database definition language: that of categories and functors. A database schema is a small category and an instance is a set-valued functor on it. We show that morphisms of schemas induce three "data migration functors", which translate instances from one schema to the other in canonical ways. These functors parameterize projections, unions, and joins over all tables simultaneously and can be used in place of conjunctive and disjunctive queries. We also show how to connect a database and a functional programming language by introducing a functorial connection between the schema and the category of types for that language. We begin the paper with a multitude of examples to motivate the definitions, and near the end we provide a dictionary whereby one can translate database concepts into category-theoretic concepts and vice-versa.Comment: 30 page

    Design and implementation of an integrated surface texture information system for design, manufacture and measurement

    Get PDF
    The optimised design and reliable measurement of surface texture are essential to guarantee the functional performance of a geometric product. Current support tools are however often limited in functionality, integrity and efficiency. In this paper, an integrated surface texture information system for design, manufacture and measurement, called “CatSurf”, has been designed and developed, which aims to facilitate rapid and flexible manufacturing requirements. A category theory based knowledge acquisition and knowledge representation mechanism has been devised to retrieve and organize knowledge from various Geometrical Product Specifications (GPS) documents in surface texture. Two modules (for profile and areal surface texture) each with five components are developed in the CatSurf. It also focuses on integrating the surface texture information into a Computer-aided Technology (CAx) framework. Two test cases demonstrate design process of specifications for the profile and areal surface texture in AutoCAD and SolidWorks environments respectively
    • …
    corecore