1,965 research outputs found
Schema Independent Relational Learning
Learning novel concepts and relations from relational databases is an
important problem with many applications in database systems and machine
learning. Relational learning algorithms learn the definition of a new relation
in terms of existing relations in the database. Nevertheless, the same data set
may be represented under different schemas for various reasons, such as
efficiency, data quality, and usability. Unfortunately, the output of current
relational learning algorithms tends to vary quite substantially over the
choice of schema, both in terms of learning accuracy and efficiency. This
variation complicates their off-the-shelf application. In this paper, we
introduce and formalize the property of schema independence of relational
learning algorithms, and study both the theoretical and empirical dependence of
existing algorithms on the common class of (de) composition schema
transformations. We study both sample-based learning algorithms, which learn
from sets of labeled examples, and query-based algorithms, which learn by
asking queries to an oracle. We prove that current relational learning
algorithms are generally not schema independent. For query-based learning
algorithms we show that the (de) composition transformations influence their
query complexity. We propose Castor, a sample-based relational learning
algorithm that achieves schema independence by leveraging data dependencies. We
support the theoretical results with an empirical study that demonstrates the
schema dependence/independence of several algorithms on existing benchmark and
real-world datasets under (de) compositions
Recommended from our members
Structure identification in relational data
This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, namely, structures permitting effective organization of the data to meet requirements of future queries. We propose a general framework whereby the notion of identifiability is given a precise formal definition similar to that of learnability. Using this framework, we then explore if a tractable procedure exists for deciding whether a given relation is decomposable into a constraint network or a CNF theory with desirable topology and, if the answer is positive, identifying the desired decomposition. Finally, we address the problem of expressing a given relation as a Horn theory and, if this is impossible, finding the best k-Horn approximation to the given relation. We show that both problems can be solved in time polynomial in the length of the data
Implementing Groundness Analysis with Definite Boolean Functions
The domain of definite Boolean functions, Def, can be used to express the groundness of, and trace grounding dependencies between, program variables in (constraint) logic programs. In this paper, previously unexploited computational properties of Def are utilised to develop an efficient and succinct groundness analyser that can be coded in Prolog. In particular, entailment checking is used to prevent unnecessary least upper bound calculations. It is also demonstrated that join can be defined in terms of other operations, thereby eliminating code and removing the need for preprocessing formulae to a normal form. This saves space and time. Furthermore, the join can be adapted to straightforwardly implement the downward closure operator that arises in set sharing analyses. Experimental results indicate that the new Def implementation gives favourable results in comparison with BDD-based groundness analyses
Functional Dependencies in OWL ABox
Functional Dependency (FD) has been extensively studied in database theory. Most recently there have been some works investigating the implications of extending Description Logics with functional dependencies. In particular the OWL ontology language offers the functional property property allowing simple functional dependency to be specified. As it turns out, more complex FD specified as concept constructors has been proved to lead to undecidability in the general case, which restricts its usage as part of TBOX. This paper departs from previous ones by restricting FDs applicability to instances in the ABOX. We specify FD as a new constructor, an OWL concept. FD instances are mapped to Horn clauses and evaluated against the ABOX according to user’s desired behavior. The latter allows users to determine whether FDs should be interpreted as constraints, assertions or views. Our approach gives ontology users data guarantees usually found in databases, integrated with the ontology conceptual model
Inductive Logic Programming in Databases: from Datalog to DL+log
In this paper we address an issue that has been brought to the attention of
the database community with the advent of the Semantic Web, i.e. the issue of
how ontologies (and semantics conveyed by them) can help solving typical
database problems, through a better understanding of KR aspects related to
databases. In particular, we investigate this issue from the ILP perspective by
considering two database problems, (i) the definition of views and (ii) the
definition of constraints, for a database whose schema is represented also by
means of an ontology. Both can be reformulated as ILP problems and can benefit
from the expressive and deductive power of the KR framework DL+log. We
illustrate the application scenarios by means of examples. Keywords: Inductive
Logic Programming, Relational Databases, Ontologies, Description Logics, Hybrid
Knowledge Representation and Reasoning Systems. Note: To appear in Theory and
Practice of Logic Programming (TPLP).Comment: 30 pages, 3 figures, 2 tables
Polynomial conjunctive query rewriting under unary inclusion dependencies
Ontology-based data access (OBDA) is widely accepted as an important ingredient of the new generation of information systems. In the OBDA paradigm, potentially incomplete relational data is enriched by means of ontologies, representing intensional knowledge of the application domain. We consider the problem of conjunctive query answering in OBDA. Certain ontology languages have been identified as FO-rewritable (e.g., DL-Lite and sticky-join sets of TGDs), which means that the ontology can be incorporated into the user's query, thus reducing OBDA to standard relational query evaluation. However, all known query rewriting techniques produce queries that are exponentially large in the size of the user's query, which can be a serious issue for standard relational database engines. In this paper, we present a polynomial query rewriting for conjunctive queries under unary inclusion dependencies. On
the other hand, we show that binary inclusion dependencies do not admit
polynomial query rewriting algorithms
Proving Finite Satisfiability of Deductive Databases
It is shown how certain refutation methods can be extended into semi-decision
procedures that are complete for both unsatisfiability and finite satisfiability. The proposed extension
is justified by a new characterization of finite satisfiability. This research was motivated
by a database design problem: Deduction rules and integrity constraints in definite databases
have to be finitely satisfiabl
- …