Schema Independent Relational Learning
Learning novel concepts and relations from relational databases is an
important problem with many applications in database systems and machine
learning. Relational learning algorithms learn the definition of a new relation
in terms of existing relations in the database. Nevertheless, the same data set
may be represented under different schemas for various reasons, such as
efficiency, data quality, and usability. Unfortunately, the output of current
relational learning algorithms tends to vary substantially with the
choice of schema, in terms of both learning accuracy and efficiency. This
variation complicates their off-the-shelf application. In this paper, we
introduce and formalize the property of schema independence of relational
learning algorithms, and study both the theoretical and empirical dependence of
existing algorithms on the common class of (de)composition schema
transformations. We study both sample-based learning algorithms, which learn
from sets of labeled examples, and query-based algorithms, which learn by
asking queries to an oracle. We prove that current relational learning
algorithms are generally not schema independent. For query-based learning
algorithms we show that the (de)composition transformations influence their
query complexity. We propose Castor, a sample-based relational learning
algorithm that achieves schema independence by leveraging data dependencies. We
support the theoretical results with an empirical study that demonstrates the
schema dependence/independence of several algorithms on existing benchmark and
real-world datasets under (de)compositions.
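To make the notion of a (de)composition transformation concrete, the following is a minimal sketch, not taken from the paper: it splits one relation into two projections and rebuilds it with a natural join. The `student` relation, its attribute names, and the helper functions are hypothetical, and the example assumes `id` is a key, which is what makes the decomposition lossless.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, str]


def project(rows: List[Row], attrs: List[str]) -> List[Row]:
    """Project rows onto attrs, dropping duplicate tuples."""
    seen: Set[Tuple[str, ...]] = set()
    out: List[Row] = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: row[a] for a in attrs})
    return out


def natural_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Join two relations on all attributes they share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in shared)
    ]


def as_set(rows: List[Row]) -> Set[Tuple[Tuple[str, str], ...]]:
    """Order-insensitive view of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical composed schema: student(id, name, advisor), with id as key.
student = [
    {"id": "s1", "name": "Ann", "advisor": "p1"},
    {"id": "s2", "name": "Bob", "advisor": "p1"},
    {"id": "s3", "name": "Cy", "advisor": "p2"},
]

# Decomposed schema: student_name(id, name) and student_advisor(id, advisor).
student_name = project(student, ["id", "name"])
student_advisor = project(student, ["id", "advisor"])

# Composition: the natural join on id recovers the original relation.
recomposed = natural_join(student_name, student_advisor)
lossless = as_set(recomposed) == as_set(student)
```

Both schemas carry the same information, so a schema-independent learner in the sense of the abstract would be expected to produce equivalent definitions over either one.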
Tree-width for first order formulae
We introduce tree-width for first order formulae \phi, fotw(\phi). We show
that computing fotw is fixed-parameter tractable with parameter fotw. Moreover,
we show that on classes of formulae of bounded fotw, model checking is
fixed-parameter tractable, with parameter the length of the formula. This is done by
translating a formula \phi with fotw(\phi)<k into a formula of the k-variable
fragment L^k of first order logic. For fixed k, the question whether a given
first order formula is equivalent to an L^k formula is undecidable. In
contrast, the classes of first order formulae with bounded fotw are fragments
of first order logic for which the equivalence is decidable.
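As an illustration of what membership in a k-variable fragment means, here is the standard variable-reuse example, which is not taken from the paper. A path of length three over a binary relation E (an assumed symbol) is usually written with four variables, but it can be rewritten with only two by requantifying variables once their old values are no longer needed. No claim is made here about the exact fotw value of the left-hand formula under the paper's definition.

```latex
\[
\exists x\,\exists y\,\exists z\,\exists w\,
  \bigl(E(x,y)\wedge E(y,z)\wedge E(z,w)\bigr)
\;\equiv\;
\exists x\,\exists y\,
  \Bigl(E(x,y)\wedge \exists x\,\bigl(E(y,x)\wedge \exists y\,E(x,y)\bigr)\Bigr)
\]
```

The right-hand side uses only the variables x and y, so it belongs to L^2.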
Our notion of tree-width generalises tree-width of conjunctive queries to
arbitrary formulae of first order logic by taking into account the quantifier
interaction in a formula. Moreover, it is more powerful than the notion of
elimination-width of quantified constraint formulae, defined by Chen and Dalmau
(CSL 2005): for quantified constraint formulae, both bounded elimination-width
and bounded fotw allow for model checking in polynomial time. We prove that
fotw of a quantified constraint formula \phi is bounded by the
elimination-width of \phi, and we exhibit a class of quantified constraint
formulae with bounded fotw that has unbounded elimination-width. A similar
comparison holds for strict tree-width of non-recursive stratified datalog as
defined by Flum, Frick, and Grohe (JACM 49, 2002).
Finally, we show that fotw has a characterization in terms of a cops and
robbers game without monotonicity cost.