Using Ontologies for Semantic Data Integration
While big data analytics is considered one of the most important paths to competitive advantage for today's enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly crucial, and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques that are at the basis of the paradigm, and the main challenges that remain to be addressed.
Ontology-Based Data Access and Integration
An ontology-based data integration (OBDI) system is an information management system consisting of three components: an ontology, a set of data sources, and the mapping between the two. The ontology is a conceptual, formal description of the domain of interest to a given organization (or a community of users), expressed in terms of relevant concepts, attributes of concepts, relationships between concepts, and logical assertions characterizing the domain knowledge. The data sources are the repositories accessible by the organization where data concerning the domain are stored. In the general case, such repositories are numerous, heterogeneous, each one managed and maintained independently from the others. The mapping is a precise specification of the correspondence between the data contained in the data sources and the elements of the ontology. The main purpose of an OBDI system is to allow information consumers to query the data using the elements in the ontology as predicates.
In the special case where the organization manages a single data source, the term ontology-based data access (OBDA) system is used.
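The three components described in this abstract can be illustrated with a minimal sketch. It is a toy example, not the method of any paper listed here: the concept names, the `staff` table, and the helpers `subconcepts`, `rewrite`, `answer`, and `demo` are all hypothetical, and the ontology is limited to simple subclass axioms. A query over an ontology concept is rewritten, via the mapping, into a SQL union over the single data source.

```python
import sqlite3

# Hypothetical toy ontology: "key is a subclass of value".
SUBCLASS_OF = {"Professor": "Teacher", "Lecturer": "Teacher"}

# Hypothetical mapping: ontology concept -> SQL query returning its instances.
MAPPING = {
    "Professor": "SELECT name FROM staff WHERE rank = 'prof'",
    "Lecturer": "SELECT name FROM staff WHERE rank = 'lect'",
}


def subconcepts(concept):
    """Return the concept together with every concept it subsumes."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF.items():
            if sup in found and sub not in found:
                found.add(sub)
                changed = True
    return found


def rewrite(concept):
    """Rewrite an atomic ontology query into one SQL union over the source."""
    parts = [MAPPING[c] for c in sorted(subconcepts(concept)) if c in MAPPING]
    return " UNION ".join(parts)


def answer(conn, concept):
    """Answer the ontology-level query by running its SQL rewriting."""
    sql = rewrite(concept)
    if not sql:
        return []
    return sorted(row[0] for row in conn.execute(sql))


def demo():
    """Build an in-memory data source and ask for all instances of Teacher."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (name TEXT, rank TEXT)")
    conn.executemany(
        "INSERT INTO staff VALUES (?, ?)",
        [("Ada", "prof"), ("Bob", "lect"), ("Cy", "admin")],
    )
    return answer(conn, "Teacher")
```

Calling `demo()` asks for all instances of `Teacher`, a concept that has no mapping of its own; the answers come from the mapped subclasses. Real OBDA systems handle far richer ontology languages and mappings than this sketch assumes.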
Tree-like Queries in OWL 2 QL: Succinctness and Complexity Results
This paper investigates the impact of query topology on the difficulty of
answering conjunctive queries in the presence of OWL 2 QL ontologies. Our first
contribution is to clarify the worst-case size of positive existential (PE),
non-recursive Datalog (NDL), and first-order (FO) rewritings for various
classes of tree-like conjunctive queries, ranging from linear queries to
bounded treewidth queries. Perhaps our most surprising result is a
superpolynomial lower bound on the size of PE-rewritings that holds already for
linear queries and ontologies of depth 2. More positively, we show that
polynomial-size NDL-rewritings always exist for tree-shaped queries with a
bounded number of leaves (and arbitrary ontologies), and for bounded treewidth
queries paired with bounded depth ontologies. For FO-rewritings, we equate the
existence of polysize rewritings with well-known problems in Boolean circuit
complexity. As our second contribution, we analyze the computational complexity
of query answering and establish tractability results (either NL- or
LOGCFL-completeness) for a range of query-ontology pairs. Combining our new
results with those from the literature yields a complete picture of the
succinctness and complexity landscapes for the considered classes of queries
and ontologies. Comment: This is an extended version of a paper accepted at LICS'15. It
contains both succinctness and complexity results and adopts FOL notation.
The appendix contains proofs that had to be omitted from the conference
version for lack of space. The previous arxiv version (a long version of our
DL'14 workshop paper) only contained the succinctness results and used
description logic notatio
On the SPARQL Direct Semantics Entailment Regime for OWL 2 QL
OWL 2 QL is the profile of OWL 2 targeted to Ontology-Based Data Access (OBDA) scenarios, where large amounts of data are to be accessed and thus query answering is required to be especially efficient in the size of such data, namely AC0 in data complexity. On the other hand, the syntax and the semantics of the SPARQL query language for OWL 2 are defined by means of the Direct Semantics Entailment Regime (DSER), which considers queries including any assertion expressible in the language of the queried ontology, i.e., ABox atoms, TBox atoms, and inequalities expressed by means of DifferentIndividuals atoms. Thus, in this paper, we investigate query answering over OWL 2 QL under DSER. In particular, we show that, by virtue of the restricted meaning assigned to existential variables and union, query answering can be reduced to the evaluation of a Datalog program. Finally, we investigate query answering under a new SPARQL entailment regime, called Direct Semantics Answering Regime (DSAR), obtained by modifying DSER in such a way that existentially quantified variables are assigned the classical logical meaning, and provide an algorithm for answering queries over OWL 2 QL ontologies under DSAR that is AC0 in data complexity, for a class of queries comprising TBox atoms, ABox atoms, and inequalities.
Managing data through the lens of an ontology
Ontology-based data management aims at managing data through the lens of an ontology, that is, a conceptual representation of the domain of interest in the underlying information system. This new paradigm provides several interesting features, many of which have already been proved effective in managing complex information systems. This article introduces the notion of ontology-based data management, illustrating the main ideas underlying the paradigm, and pointing out the importance of knowledge representation and automated reasoning for addressing the technical challenges it introduces
On the succinctness of query rewriting over shallow ontologies
We investigate the succinctness problem for conjunctive query rewritings over OWL2QL ontologies of depth 1 and 2 by means of hypergraph programs computing Boolean functions. Both positive and negative results are obtained. We show that, over ontologies of depth 1, conjunctive queries have polynomial-size nonrecursive datalog rewritings; tree-shaped queries have polynomial positive existential rewritings; however, in the worst case, positive existential rewritings can be superpolynomial. Over ontologies of depth 2, positive existential and nonrecursive datalog rewritings of conjunctive queries can suffer an exponential blowup, while first-order rewritings can be superpolynomial unless NP is included in P/poly. We also analyse rewritings of tree-shaped queries over arbitrary ontologies and note that query entailment for such queries is fixed-parameter tractable.
Ontology-based data access: ontop of databases
We present the architecture and technologies underpinning the OBDA system Ontop, which takes full advantage of storing data in relational databases. We discuss the theoretical foundations of Ontop: the tree-witness query rewriting, T-mappings, and optimisations based on database integrity constraints and SQL features. We analyse the performance of Ontop in a series of experiments and demonstrate that, for standard ontologies, queries and data stored in relational databases, Ontop is fast, efficient and produces SQL rewritings of high quality.