Knowledge Refinement via Rule Selection
In several different applications, including data transformation and entity
resolution, rules are used to capture aspects of knowledge about the
application at hand. Often, a large set of such rules is generated
automatically or semi-automatically, and the challenge is to refine the
encapsulated knowledge by selecting a subset of rules based on the expected
operational behavior of the rules on available data. In this paper, we carry
out a systematic complexity-theoretic investigation of the following rule
selection problem: given a set of rules specified by Horn formulas, and a pair
of an input database and an output database, find a subset of the rules that
minimizes the total error, that is, the number of false positive and false
negative errors arising from the selected rules. We first establish
computational hardness results for the decision problems underlying this
minimization problem, as well as upper and lower bounds for its
approximability. We then investigate a bi-objective optimization version of the
rule selection problem in which both the total error and the size of the
selected rules are taken into account. We show that testing for membership in
the Pareto front of this bi-objective optimization problem is DP-complete.
Finally, we show that a similar DP-completeness result holds for a bi-level
optimization version of the rule selection problem, where one minimizes first
the total error and then the size
A collective, probabilistic approach to schema mapping using diverse noisy evidence
We propose a probabilistic approach to the problem of schema mapping. Our approach is declarative, scalable, and extensible. It builds upon recent results in both schema mapping and probabilistic reasoning and contributes novel techniques in both fields. We introduce the problem of schema mapping selection, that is, choosing the best mapping from a space of potential mappings, given both metadata constraints and a data example. As selection has to reason holistically about the inputs and the dependencies between the chosen mappings, we define a new schema mapping optimization problem which captures interactions between mappings as well as inconsistencies and incompleteness in the input. We then introduce Collective Mapping Discovery (CMD), our solution to this problem using state-of-the-art probabilistic reasoning techniques. Our evaluation on a wide range of integration scenarios, including several real-world domains, demonstrates that CMD effectively combines data and metadata information to infer highly accurate mappings even with significant levels of noise
Rewritability in Monadic Disjunctive Datalog, MMSNP, and Expressive Description Logics
We study rewritability of monadic disjunctive Datalog programs, (the
complements of) MMSNP sentences, and ontology-mediated queries (OMQs) based on
expressive description logics of the ALC family and on conjunctive queries. We
show that rewritability into FO and into monadic Datalog (MDLog) are decidable,
and that rewritability into Datalog is decidable when the original query
satisfies a certain condition related to equality. We establish
2NExpTime-completeness for all studied problems except rewritability into MDLog
for which there remains a gap between 2NExpTime and 3ExpTime. We also analyze
the shape of rewritings, which in the MMSNP case correspond to obstructions,
and give a new construction of canonical Datalog programs that is more
elementary than existing ones and also applies to formulas with free variables
Unique characterisability and learnability of Temporal Instance Queries
We aim to determine which temporal instance queries can be uniquely characterised by a (polynomial-size) set of positive and negative temporal data examples. We start by considering queries formulated in fragments of propositional linear temporal logic LTL that correspond to conjunctive queries (CQs) or extensions thereof induced by the until operator. Not all of these queries admit polynomial characterisations but by restricting them further to path-shaped queries we identify natural classes that do.
We then investigate how far the obtained characterisations can be lifted to temporal knowledge graphs queried by 2D languages combining LTL with concepts in description logics EL or ELI (i.e., tree-shaped CQs). While temporal operators in the scope of description logic constructors can destroy polynomial characterisability, we obtain general transfer results for the case when description logic constructors are within the scope of temporal operators. Finally, we apply our characterisations to establish (polynomial) learnability of temporal instance queries using membership queries in the active learning framework
Pushing Optimal ABox Repair from EL Towards More Expressive Horn-DLs
Ontologies based on Description Logic (DL) represent general background knowledge in a terminology (TBox) and the actual data in an ABox. DL systems can then be used to compute consequences (such as answers to certain queries) from an ontology consisting of a TBox and an ABox. Since both human-made and machine-learned data sets may contain errors, which manifest themselves as unintuitive or obviously incorrect consequences, repairing DL-based ontologies in the sense of removing such unwanted consequences is an important topic in DL research. Most of the repair approaches described in the literature produce repairs that are not optimal, in the sense that they do not guarantee that only a minimal set of consequences is removed. In a series of papers, we have developed an approach for computing optimal repairs, starting with the restricted setting of an EL instance store, extending this to the more general setting of a quantified ABox (where some individuals may be anonymous), and then adding a static EL TBox.
Here, we extend the expressivity of the underlying DL considerably, by adding nominals, inverse roles, regular role inclusions and the bottom concept to EL, which yields a fragment of the well-known DL Horn-SROIQ. The ideas underlying our repair approach still apply to this DL, though several non-trivial extensions are needed to deal with the new constructors and axioms. The developed repair approach can also be used to treat unwanted consequences expressed by certain conjunctive queries or regular path queries, and to handle Horn-ALCOI TBoxes with regular role inclusions. This is an extended version of an article accepted at KR 2022.
Unique Characterisability and Learnability of Temporal Instance Queries
We aim to determine which temporal instance queries can be uniquely characterised by a (polynomial-size) set of positive and negative temporal data examples. We start by considering queries formulated in fragments of propositional linear temporal logic LTL that correspond to conjunctive queries (CQs) or extensions thereof induced by the until operator. Not all of these queries admit polynomial characterisations, but by imposing a further restriction to path-shaped queries we identify natural classes that do. We then investigate how far the obtained characterisations can be lifted to temporal knowledge graphs queried by 2D languages combining LTL with concepts in description logics EL or ELI (i.e., tree-shaped CQs). While temporal operators in the scope of description logic constructors can destroy polynomial characterisability, we obtain general transfer results for the case when description logic constructors are within the scope of temporal operators. Finally, we apply our characterisations to establish (polynomial) learnability of temporal instance queries using membership queries in the active learning framework.