117 research outputs found
Functorial Data Migration
In this paper we present a simple database definition language: that of
categories and functors. A database schema is a small category and an instance
is a set-valued functor on it. We show that morphisms of schemas induce three
"data migration functors", which translate instances from one schema to the
other in canonical ways. These functors parameterize projections, unions, and
joins over all tables simultaneously and can be used in place of conjunctive
and disjunctive queries. We also show how to connect a database and a
functional programming language by introducing a functorial connection between
the schema and the category of types for that language. We begin the paper with
a multitude of examples to motivate the definitions, and near the end we
provide a dictionary whereby one can translate database concepts into
category-theoretic concepts and vice versa. Comment: 30 pages
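As a rough illustration of the abstract above, the following Python sketch encodes a schema as a small graph of tables and foreign keys, an instance as a set per table plus a function per foreign key, and the simplest of the three migration functors, pullback along a schema morphism (precomposition). The dictionary layout, the example schemas, and the names `is_instance` and `pull_back` are assumptions made for this sketch and do not come from the paper; path equations are omitted.

```python
# Toy encoding (hypothetical, for illustration only).
# A schema lists its objects (tables) and arrows (foreign keys) as
# name -> (source object, target object).
T = {
    "objects": ["Employee", "Department", "City"],
    "arrows": {
        "worksIn": ("Employee", "Department"),
        "locatedIn": ("Department", "City"),
    },
}
S = {
    "objects": ["Person", "Place"],
    "arrows": {"basedIn": ("Person", "Place")},
}

# A schema morphism F: S -> T sends each object of S to an object of T
# and each arrow of S to a path (list of composable arrows) in T.
F = {
    "objects": {"Person": "Employee", "Place": "City"},
    "arrows": {"basedIn": ["worksIn", "locatedIn"]},
}

# An instance on T: one set per object, one total function per arrow.
I = {
    "sets": {
        "Employee": {"alice", "bob"},
        "Department": {"sales", "labs"},
        "City": {"paris", "oslo"},
    },
    "functions": {
        "worksIn": {"alice": "sales", "bob": "labs"},
        "locatedIn": {"sales": "paris", "labs": "oslo"},
    },
}


def is_instance(schema, inst):
    """Check that every arrow is a total function between the right sets."""
    for name, (src, tgt) in schema["arrows"].items():
        f = inst["functions"][name]
        if set(f) != inst["sets"][src]:
            return False
        if not set(f.values()) <= inst["sets"][tgt]:
            return False
    return True


def pull_back(F, S, inst):
    """Turn a T-instance into an S-instance by precomposing with F."""
    sets = {s: inst["sets"][F["objects"][s]] for s in S["objects"]}
    functions = {}
    for arrow, (src, _tgt) in S["arrows"].items():
        table = {}
        for x in sets[src]:
            y = x
            # Follow the path in T that F assigns to this arrow of S.
            for step in F["arrows"][arrow]:
                y = inst["functions"][step][y]
            table[x] = y
        functions[arrow] = table
    return {"sets": sets, "functions": functions}


J = pull_back(F, S, I)
```

Here `J` is an instance on the smaller schema whose single foreign key is the composite of the two foreign keys of `T`, which is one way to read the claim that the migration functors act on all tables at once. The other two functors (the left and right adjoints of pullback) require colimit and limit computations and are not sketched.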
Relational Foundations For Functorial Data Migration
We study the data transformation capabilities associated with schemas that
are presented by directed multi-graphs and path equations. Unlike most
approaches which treat graph-based schemas as abbreviations for relational
schemas, we treat graph-based schemas as categories. A schema $S$ is a
finitely-presented category, and the collection of all $S$-instances forms a
category, $S$-inst. A functor $F$ between schemas $S$ and $T$, which can be
generated from a visual mapping between graphs, induces three adjoint data
migration functors, $\Sigma_F : S$-inst $\to T$-inst, $\Pi_F : S$-inst $\to T$-inst, and $\Delta_F : T$-inst $\to S$-inst. We present an algebraic query
language FQL based on these functors, prove that FQL is closed under
composition, prove that FQL can be implemented with the
select-project-product-union relational algebra (SPCU) extended with a
key-generation operation, and prove that SPCU can be implemented with FQL.
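For orientation, the adjointness mentioned in the abstract above can be written out as follows. This is a sketch in standard notation, not a quotation from the paper: the schema functor is written $F : S \to T$, and the symbols $\Delta_F$, $\Sigma_F$, $\Pi_F$ for pullback and its left and right adjoints are assumed here.

```latex
\[
  \Delta_F(I) \;=\; I \circ F
  \qquad \text{for a } T\text{-instance } I : T \to \mathbf{Set},
\]
\[
  \Sigma_F \;\dashv\; \Delta_F \;\dashv\; \Pi_F ,
  \qquad
  \Sigma_F,\ \Pi_F : S\text{-inst} \to T\text{-inst},
  \quad
  \Delta_F : T\text{-inst} \to S\text{-inst}.
\]
```

Read this way, $\Delta_F$ is plain precomposition, while $\Sigma_F$ and $\Pi_F$ are the left and right Kan extensions along $F$.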
Categorical Data Integration for Computational Science
Categorical Query Language is an open-source query and data integration
scripting language that can be applied to common challenges in the field of
computational science. We discuss how the structure-preserving nature of CQL
data migrations protects those who publicly share data from the
misinterpretation of their data. Likewise, this feature of CQL migrations
allows those who draw from public data sources to be sure only data which meets
their specification will actually be transferred. We argue some open problems
in the field of data sharing in computational science are addressable by
working within this paradigm of functorial data migration. We demonstrate these
tools by integrating data from the Open Quantum Materials Database with some
alternative materials databases. Comment: 10 pages, 5 figures
Database queries and constraints via lifting problems
Previous work has demonstrated that categories are useful and expressive
models for databases. In the present paper we build on that model, showing that
certain queries and constraints correspond to lifting problems, as found in
modern approaches to algebraic topology. In our formulation, each so-called
SPARQL graph pattern query corresponds to a category-theoretic lifting problem,
whereby the set of solutions to the query is precisely the set of lifts. We
interpret constraints within the same formalism and then investigate some basic
properties of queries and constraints. In particular, to any database $\pi$ we
can associate a certain derived database $\Qry(\pi)$ of queries on $\pi$. As an
application, we explain how giving users access to certain parts of
$\Qry(\pi)$, rather than direct access to $\pi$, improves one's ability to
manage the impact of schema evolution.
Belief propagation in monoidal categories
We discuss a categorical version of the celebrated belief propagation
algorithm. This provides a way to prove that some algorithms which are known or
suspected to be analogous are actually identical when formulated generically.
It also highlights the computational point of view in monoidal categories. Comment: In Proceedings QPL 2014, arXiv:1412.810
A Formal Category Theoretical Framework for Multi-model Data Transformations
Data integration and migration processes in polystores and multi-model database management systems benefit greatly from data and schema transformations. Rigorous modeling of transformations is a complex problem. The data and schema transformation field is scattered across many different transformation frameworks, tools, and mappings. These are usually domain-specific and lack solid theoretical foundations. Our first goal is to define category-theoretical foundations for relational, graph, and hierarchical data models and instances. Each data instance is represented as a category-theoretical mapping called a functor. We formalize data and schema transformations as Kan lifts utilizing the functorial representation for the instances. A Kan lift is a category-theoretical construction consisting of two mappings satisfying a certain universal property. In this work, the two mappings correspond to schema transformation and data transformation. Peer reviewed