Database queries and constraints via lifting problems
Previous work has demonstrated that categories are useful and expressive
models for databases. In the present paper we build on that model, showing that
certain queries and constraints correspond to lifting problems, as found in
modern approaches to algebraic topology. In our formulation, each so-called
SPARQL graph pattern query corresponds to a category-theoretic lifting problem,
whereby the set of solutions to the query is precisely the set of lifts. We
interpret constraints within the same formalism and then investigate some basic
properties of queries and constraints. In particular, to any database \pi we
can associate a certain derived database \Qry(\pi) of queries on \pi. As an
application, we explain how giving users access to certain parts of
\Qry(\pi), rather than direct access to \pi, improves one's ability to
manage the impact of schema evolution.
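To make the "solutions are lifts" idea concrete, here is a minimal set-level sketch, not the paper's categorical construction. It treats a graph pattern as a small graph with variables and enumerates every assignment of variables that sends each pattern triple to a triple in the data. The function name `solve_pattern`, the `?`-prefix convention for variables, and the toy data are all assumptions made for illustration.

```python
from itertools import product


def solve_pattern(triples, pattern):
    """Return every variable binding that maps each pattern triple into `triples`.

    Hypothetical helper: terms starting with "?" are variables, all other
    terms are constants that must match the data exactly.
    """
    data = set(triples)
    variables = sorted(
        {term for triple in pattern for term in triple if term.startswith("?")}
    )
    # Candidate values: every term that occurs anywhere in the data.
    terms = sorted({term for triple in data for term in triple})

    solutions = []
    # Brute-force search over all assignments; fine for a toy example only.
    for values in product(terms, repeat=len(variables)):
        binding = dict(zip(variables, values))
        image = [tuple(binding.get(term, term) for term in triple) for triple in pattern]
        if all(triple in data for triple in image):
            solutions.append(binding)
    return solutions


# Toy data and pattern (illustrative only).
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
]
pattern = [("?x", "knows", "?y"), ("?y", "knows", "?z")]
solutions = solve_pattern(triples, pattern)
print(solutions)
```

Each returned binding plays the role of one lift: it extends the fixed part of the pattern (the constants) to a structure-preserving map into the data.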
Database Learning: Toward a Database that Becomes Smarter Every Time
In today's databases, previous query answers rarely benefit answering future
queries. For the first time, to the best of our knowledge, we change this
paradigm in an approximate query processing (AQP) context. We make the
following observation: the answer to each query reveals some degree of
knowledge about the answer to another query because their answers stem from the
same underlying distribution that has produced the entire dataset. Exploiting
and refining this knowledge should allow us to answer queries more
analytically, rather than by reading enormous amounts of raw data. Also,
processing more queries should continuously enhance our knowledge of the
underlying distribution, and hence lead to increasingly faster response times
for future queries.
We call this novel idea---learning from past query answers---Database
Learning. We exploit the principle of maximum entropy to produce answers, which
are in expectation guaranteed to be more accurate than existing sample-based
approximations. Empowered by this idea, we build a query engine on top of Spark
SQL, called Verdict. We conduct extensive experiments on real-world query
traces from a large customer of a major database vendor. Our results
demonstrate that Verdict supports 73.7% of these queries, speeding them up by
up to 23.0x for the same accuracy level compared to existing AQP systems.
Comment: This manuscript is an extended report of the work published in ACM
SIGMOD conference 201
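The intuition that one query's answer carries knowledge about another can be sketched with a two-variable Gaussian model. This is an illustrative assumption, not Verdict's actual inference procedure: the abstract states only that the maximum entropy principle is used. The Gaussian is the maximum-entropy distribution for a given mean and covariance, so conditioning a bivariate Gaussian is one simple way to see how a past answer can shift and tighten the estimate for a new query. The function name and all numbers below are hypothetical.

```python
def condition_on_past(mean_new, var_new, mean_past, var_past, cov, observed_past):
    """Condition a bivariate Gaussian on an observed past query answer.

    Hypothetical helper: (mean_new, var_new) describe the prior belief about
    the new query's answer, (mean_past, var_past) the prior belief about the
    past query's answer, and `cov` is their assumed covariance.
    """
    gain = cov / var_past
    # The posterior mean moves in proportion to how surprising the past answer was.
    posterior_mean = mean_new + gain * (observed_past - mean_past)
    # The posterior variance never exceeds the prior variance.
    posterior_var = var_new - gain * cov
    return posterior_mean, posterior_var


# Illustrative numbers only.
mean, var = condition_on_past(
    mean_new=100.0, var_new=25.0,
    mean_past=90.0, var_past=16.0,
    cov=12.0, observed_past=94.0,
)
print(mean, var)
```

In this toy setting the variance of the new answer drops from 25 to 16 without reading any additional raw data, which is the kind of gain the abstract describes.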
Functorial Data Migration
In this paper we present a simple database definition language: that of
categories and functors. A database schema is a small category and an instance
is a set-valued functor on it. We show that morphisms of schemas induce three
"data migration functors", which translate instances from one schema to the
other in canonical ways. These functors parameterize projections, unions, and
joins over all tables simultaneously and can be used in place of conjunctive
and disjunctive queries. We also show how to connect a database and a
functional programming language by introducing a functorial connection between
the schema and the category of types for that language. We begin the paper with
a multitude of examples to motivate the definitions, and near the end we
provide a dictionary whereby one can translate database concepts into
category-theoretic concepts and vice versa.
Comment: 30 page
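As a rough illustration of "an instance is a set-valued functor", the sketch below stores an instance as one set per schema object and one lookup table per schema arrow, then migrates it along a schema mapping by precomposition (pullback). The abstract does not name the three migration functors, so treating pullback as one of them is an assumption here, as are the schema names, the helper `pullback`, and the simplification that each source arrow maps to a single target arrow rather than a path.

```python
def pullback(functor_objects, functor_arrows, instance_tables, instance_columns):
    """Migrate an instance on the target schema back to the source schema.

    Hypothetical helper: each source object c gets the table of F(c), and each
    source arrow f gets the column of F(f). Anything in the target schema
    that F does not hit is simply dropped.
    """
    tables = {c: instance_tables[d] for c, d in functor_objects.items()}
    columns = {f: instance_columns[g] for f, g in functor_arrows.items()}
    return tables, columns


# Target schema: Employee --worksIn--> Department, Employee --manager--> Employee.
instance_tables = {
    "Employee": {"e1", "e2"},
    "Department": {"d1"},
}
instance_columns = {
    "worksIn": {"e1": "d1", "e2": "d1"},
    "manager": {"e1": "e2", "e2": "e2"},
}

# Source schema: Person --memberOf--> Unit, mapped into the target schema.
functor_objects = {"Person": "Employee", "Unit": "Department"}
functor_arrows = {"memberOf": "worksIn"}

tables, columns = pullback(
    functor_objects, functor_arrows, instance_tables, instance_columns
)
print(tables, columns)
```

Because the source schema has no arrow mapped to `manager`, the migrated instance loses that column, which is how a projection shows up in this representation.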
Design and implementation of an integrated surface texture information system for design, manufacture and measurement
The optimised design and reliable measurement of surface texture are essential to guarantee the functional performance of a geometric product. Current support tools are, however, often limited in functionality, integrity and efficiency. In this paper, an integrated surface texture information system for design, manufacture and measurement, called “CatSurf”, has been designed and developed, which aims to meet rapid and flexible manufacturing requirements. A category theory based knowledge acquisition and knowledge representation mechanism has been devised to retrieve and organize knowledge from various Geometrical Product Specifications (GPS) documents in surface texture. Two modules (for profile and areal surface texture), each with five components, are developed in CatSurf. It also focuses on integrating the surface texture information into a Computer-aided Technology (CAx) framework. Two test cases demonstrate the design process of specifications for the profile and areal surface texture in AutoCAD and SolidWorks environments respectively.