On the first-order rewritability of conjunctive queries over binary guarded existential rules
We study conjunctive query answering and first-order rewritability of conjunctive queries for binary guarded existential rules. In particular, we prove that it is decidable whether, for a given set of binary guarded existential rules, every conjunctive query admits a first-order rewriting, and we present a technique for solving this problem. These results have an important practical impact, since they make it possible to identify those sets of binary guarded existential rules for which every conjunctive query can be answered through query rewriting and standard evaluation of a first-order query (actually, a union of conjunctive queries) over a relational database system.
Bounded Implication for Existential Rules
The property of boundedness in Datalog formalizes whether a set of rules can be equivalently expressed by a non-recursive set of rules. Existential rules extend Datalog by allowing existential variables in rule heads. In this paper, we introduce and study notions of boundedness for existential rules. We provide a notion of weak boundedness and a notion of strong boundedness for a rule set, and show that they correspond, respectively, to first-order rewritability of atomic queries and first-order rewritability of conjunctive queries over the set. While weak and strong boundedness are in general not equivalent, we show that the two notions coincide for some notable subclasses of existential rules, namely Datalog, single-head binary rules, and frontier-guarded rules.
Semantic Analysis of R2RML Mappings for Ontology-Based Data Access
Ontology-based data access (OBDA) deals with the problem of accessing autonomous data sources through a shared, virtual ontology and declarative mappings connecting the data sources to the ontology. The W3C standard R2RML allows for mapping relational data sources to RDFS/OWL ontologies. In this paper, we present algorithms for the semantic analysis of R2RML mappings in the OBDA setting, when the ontology is expressed in OWL 2 QL. These algorithms focus on identifying the main semantic anomalies (inconsistency and redundancy) of a mapping specification with respect to the ontology and/or the data sources. The algorithms have been implemented in the mapping analysis tool developed within the Optique European project. We also report on the experiments conducted within the Optique project use cases.
Data context informed data wrangling
The process of preparing potentially large and complex data sets for further analysis or manual examination is often called data wrangling. In classical warehousing environments, the steps in such a process have been carried out using Extract-Transform-Load platforms, with significant manual involvement in specifying, configuring or tuning many of them. Cost-effective data wrangling processes need to ensure that data wrangling steps benefit from automation wherever possible. In this paper, we define a methodology to fully automate an end-to-end data wrangling process incorporating data context, which associates portions of a target schema with potentially spurious extensional data of types that are commonly available. Instance-based evidence together with data profiling paves the way to inform automation in several steps within the wrangling process, specifically, matching, mapping validation, value format transformation, and data repair. The approach is evaluated with real estate data showing substantial improvements in the results of automated wrangling.
The VADA Architecture for Cost-Effective Data Wrangling
Data wrangling, the multi-faceted process by which the data required by an application is identified, extracted, cleaned and integrated, is often cumbersome and labor-intensive. In this paper, we present an architecture that supports a complete data wrangling lifecycle, orchestrates components dynamically, builds on automation wherever possible, is informed by whatever data is available, refines automatically produced results in the light of feedback, takes into account the user’s priorities, and supports data scientists with diverse skill sets. The architecture is demonstrated in practice for wrangling property sales and open government data.