Characterization of order-like dependencies with formal concept analysis
Functional Dependencies (FDs) play a key role in many fields
of the relational database model, one of the most widely used database
systems. FDs have also been applied in data analysis, data quality,
knowledge discovery and the like, but in a very limited scope, because of
their fixed semantics. To overcome this limitation, many generalizations
have been defined to relax the crisp definition of FDs. FDs and a few of
their generalizations have been characterized with Formal Concept
Analysis, which reveals itself to be an interesting unified framework for
characterizing dependencies, that is, understanding and computing them in
a formal way. In this paper, we extend this work by taking into account
order-like dependencies. Such dependencies, well defined in the database
field, consider an ordering on the domain of each attribute, and not
simply an equality relation as with standard FDs.
Peer Reviewed. Postprint (published version)
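As a rough illustration of the contrast drawn in the abstract above, the following sketch checks a standard FD (equality on the left-hand side forces equality on the right-hand side) and one possible pointwise reading of an order-like dependency (ordering on the left forces ordering on the right). The function names, the sample table, and the pointwise definition are assumptions made for illustration; they are not taken from the paper, which may define order-like dependencies differently.

```python
from itertools import product

def holds_fd(rows, lhs, rhs):
    """X -> Y: any two rows that agree on every lhs attribute agree on every rhs attribute."""
    return all(
        any(t[a] != u[a] for a in lhs) or all(t[b] == u[b] for b in rhs)
        for t, u in product(rows, repeat=2)
    )

def holds_order_dep(rows, lhs, rhs):
    """Assumed pointwise reading: t <= u on every lhs attribute implies t <= u on every rhs attribute."""
    return all(
        any(t[a] > u[a] for a in lhs) or all(t[b] <= u[b] for b in rhs)
        for t, u in product(rows, repeat=2)
    )

# Hypothetical sample relation.
ROWS = [
    {"month": 1, "sales": 10, "region": "N"},
    {"month": 2, "sales": 15, "region": "S"},
    {"month": 3, "sales": 15, "region": "N"},
]

print(holds_fd(ROWS, ["month"], ["region"]))         # month determines region
print(holds_order_dep(ROWS, ["month"], ["sales"]))   # sales never decrease as month grows
print(holds_order_dep(ROWS, ["month"], ["region"]))  # region is not ordered along month
```

In this sample, `month -> region` holds as an FD because every month is unique, yet it fails under the ordered reading, which shows that the two notions can differ on the same data.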
A Rule-Based Approach to Analyzing Database Schema Objects with Datalog
Database schema elements such as tables, views, triggers and functions are
typically defined with many interrelationships. In order to support database
users in understanding a given schema, a rule-based approach for analyzing the
respective dependencies is proposed using Datalog expressions. We show that
many interesting properties of schema elements can be systematically determined
this way. The expressiveness of the proposed analysis is illustrated with
the problem of computing induced functional dependencies for derived relations.
The propagation of functional dependencies plays an important role in data
integration and query optimization but represents an undecidable problem in
general. And yet, our rule-based analysis covers all relational operators as
well as linear recursive expressions in a systematic way showing the depth of
analysis possible by our proposal. The analysis of functional dependencies is
well-integrated in a uniform approach to analyzing dependencies between schema
elements in general.
Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854)
Functional dependencies for XML : axiomatisation and normal form in the presence of frequencies and identifiers : a thesis presented in partial fulfilment of the requirements for the degree of Master of Sciences in Information Sciences at Massey University, Palmerston North, New Zealand
XML has gained popularity as a markup language for publishing and exchanging data on the web. Nowadays, there is also ongoing interest in using XML for representing and actually storing data. In particular, much effort has been directed towards turning XML into a real data model by improving the semantics that can be expressed about XML documents. Various works have addressed how to define different classes of integrity constraints and the development of a normalisation theory for XML. One area which received little to no attention from the research community up to five years ago is the study of functional dependencies in the context of XML [37]. Since then, there has been increasingly more research investigating functional dependencies in XML. Nevertheless, a comprehensive dependency theory and normalisation theory for XML have yet to emerge. Functional dependencies are an integral part of database theory in the relational data model (RDM). In particular, functional dependencies have been vital in the investigation of how to design "good" relational database schemas which avoid or minimise problems relating to data redundancy and data inconsistency. Since the same problems can be shown to exist in poorly designed XML schemas, there is a need to investigate how these problems can be eliminated in the context of XML. We believe that the study of an analogy to relational functional dependencies in the context of XML is equally significant towards designing "good" XML schemas.
[FROM INTRODUCTION]
Expressive Completeness of Existential Rule Languages for Ontology-based Query Answering
Existential rules, also known as data dependencies in Databases, have been
recently rediscovered as a promising family of languages for Ontology-based
Query Answering. In this paper, we prove that disjunctive embedded dependencies
exactly capture the class of recursively enumerable ontologies in
Ontology-based Conjunctive Query Answering (OCQA). Our expressive completeness
result does not rely on any built-in linear order on the database. To establish
the expressive completeness, we introduce a novel semantic definition for OCQA
ontologies. We also show that neither the class of disjunctive tuple-generating
dependencies nor the class of embedded dependencies is expressively complete
for recursively enumerable OCQA ontologies.
Comment: 10 pages; the full version of a paper to appear in IJCAI 2016.
Changes (relative to v1): a new reference has been added, and some typos
have been corrected.
A SAT-based System for Consistent Query Answering
An inconsistent database is a database that violates one or more integrity
constraints, such as functional dependencies. Consistent Query Answering is a
rigorous and principled approach to the semantics of queries posed against
inconsistent databases. The consistent answers to a query on an inconsistent
database are the intersection of the answers to the query on every repair, i.e.,
on every consistent database that differs from the given inconsistent one in a
minimal way. Computing the consistent answers of a fixed conjunctive query on a
given inconsistent database can be a coNP-hard problem, even though every fixed
conjunctive query is efficiently computable on a given consistent database.
We designed, implemented, and evaluated CAvSAT, a SAT-based system for
consistent query answering. CAvSAT leverages a set of natural reductions from
the complement of consistent query answering to SAT and to Weighted MaxSAT. The
system is capable of handling unions of conjunctive queries and arbitrary
denial constraints, which include functional dependencies as a special case. We
report results from experiments evaluating CAvSAT on both synthetic and
real-world databases. These results provide evidence that a SAT-based approach
can give rise to a comprehensive and scalable system for consistent query
answering.
Comment: 25 pages including appendix, to appear in the 22nd International
Conference on Theory and Applications of Satisfiability Testing
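The definition of consistent answers given in the abstract above (the intersection of the query answers over every repair) can be sketched by brute force for the simple case of a single key constraint. This is only an illustrative enumeration and not CAvSAT's SAT-based reduction; the function names, the sample relation, and the restriction to one key attribute are assumptions. Under that restriction, each repair keeps exactly one row per key value.

```python
from itertools import product

def repairs(rows, key):
    """Yield every repair w.r.t. the key constraint: one row kept per key value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    for choice in product(*groups.values()):
        yield list(choice)

def consistent_answers(rows, key, query):
    """Intersect the query answers over all repairs (exponential in the worst case)."""
    answer_sets = [set(query(r)) for r in repairs(rows, key)]
    return set.intersection(*answer_sets)

# Hypothetical relation violating the key "name": ann has two departments.
EMP = [
    {"name": "ann", "dept": "db"},
    {"name": "ann", "dept": "ai"},
    {"name": "bob", "dept": "db"},
]

def names_in_db(rows):
    """Example query: names of employees in the 'db' department."""
    return {r["name"] for r in rows if r["dept"] == "db"}

print(consistent_answers(EMP, "name", names_in_db))
```

Here `ann` is an answer in only one of the two repairs, so only `bob` survives the intersection. The number of repairs grows as the product of the conflicting group sizes, which is consistent with the hardness noted in the abstract and motivates a solver-based approach.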
Securing Databases from Probabilistic Inference
Databases can leak confidential information when users combine query results
with probabilistic data dependencies and prior knowledge. Current research
offers mechanisms that either handle a limited class of dependencies or lack
tractable enforcement algorithms. We propose a foundation for Database
Inference Control based on ProbLog, a probabilistic logic programming language.
We leverage this foundation to develop Angerona, a provably secure enforcement
mechanism that prevents information leakage in the presence of probabilistic
dependencies. We then provide a tractable inference algorithm for a practically
relevant fragment of ProbLog. We empirically evaluate Angerona's performance
showing that it scales to relevant security-critical problems.
Comment: A short version of this paper has been accepted at the 30th IEEE
Computer Security Foundations Symposium (CSF 2017)
Probabilistic Relational Model Benchmark Generation
The validation of any database mining methodology goes through an evaluation
process where benchmark availability is essential. In this paper, we aim to
randomly generate relational database benchmarks that allow checking
probabilistic dependencies among the attributes. We are particularly interested
in Probabilistic Relational Models (PRMs), which extend Bayesian Networks (BNs)
to a relational data mining context and enable effective and robust reasoning
over relational data. Even though a panoply of works have focused, separately,
on the generation of random Bayesian networks and relational databases, no work
has been identified for PRMs on that track. This paper provides an algorithmic
approach for generating random PRMs from scratch to fill this gap. The proposed
method allows generating PRMs as well as synthetic relational data from a
randomly generated relational schema and a random set of probabilistic
dependencies. This can be of interest not only for machine learning researchers
to evaluate their proposals in a common framework, but also for database
designers to evaluate the effectiveness of the components of a database
management system.