Ontology-based data access to Slegge
We report on our experience with ontology-based data access to the Slegge database at Statoil and share the resources employed in this use case: end-user information needs (in natural language), their translations into SPARQL, the Subsurface Exploration Ontology, the schema of the Slegge database with integrity constraints, and the mappings connecting the ontology and the schema.
Bridging the Semantic Web and NoSQL Worlds: Generic SPARQL Query Translation and Application to MongoDB
RDF-based data integration is often hampered by the lack of methods to translate data locked in heterogeneous silos into RDF representations. In this paper, we tackle the challenge of bridging the gap between the Semantic Web and NoSQL worlds, by fostering the development of SPARQL interfaces to heterogeneous databases. To avoid defining yet another SPARQL translation method for each and every database, we propose a two-phase method. Firstly, a SPARQL query is translated into a pivot abstract query. This phase achieves as much of the translation process as possible regardless of the database. We show how optimizations at this abstract level can save subsequent work at the level of a target database query language. Secondly, the abstract query is translated into the query language of a target database, taking into account the specific database capabilities and constraints. We demonstrate the effectiveness of our method with the MongoDB NoSQL document store, such that arbitrary MongoDB documents can be aligned on existing domain ontologies and accessed with SPARQL. Finally, we draw on a real-world use case to report experimental results with respect to the effectiveness and performance of our approach.
Efficient handling of SPARQL OPTIONAL for OBDA
OPTIONAL is a key feature in SPARQL for dealing with missing information. While this operator is used extensively, it is also known for its complexity, which can make efficient evaluation of queries with OPTIONAL challenging. We tackle this problem in the Ontology-Based Data Access (OBDA) setting, where the data is stored in a SQL relational database and exposed as a virtual RDF graph by means of an R2RML mapping. We start with a succinct translation of a SPARQL fragment into SQL. It fully respects bag semantics and three-valued logic and relies on the extensive use of the LEFT JOIN operator and COALESCE function. We then propose optimisation techniques for reducing the size and improving the structure of generated SQL queries. Our optimisations capture interactions between JOIN, LEFT JOIN, COALESCE and integrity constraints such as attribute nullability, uniqueness and foreign key constraints. Finally, we empirically verify the effectiveness of our techniques on the BSBM OBDA benchmark.
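As a rough illustration of the LEFT JOIN and COALESCE idea mentioned in this abstract, the sketch below hand-translates one SPARQL pattern with two OPTIONAL blocks that share a variable. It is not the paper's translation algorithm: the tables `person`, `work_email` and `home_email`, the sample rows, and the function `optional_rows` are all hypothetical, and SQLite stands in for the relational source.

```python
import sqlite3

# Hypothetical SPARQL pattern being mimicked:
#   SELECT ?p ?n ?e WHERE {
#     ?p :name ?n
#     OPTIONAL { ?p :workEmail ?e }
#     OPTIONAL { ?p :homeEmail ?e }
#   }

def optional_rows():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person(id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE work_email(pid INTEGER, email TEXT NOT NULL);
        CREATE TABLE home_email(pid INTEGER, email TEXT NOT NULL);
        INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO work_email VALUES (1, 'ann@work');
        INSERT INTO home_email VALUES (1, 'ann@home'), (2, 'bob@home');
    """)
    rows = conn.execute("""
        SELECT p.id, p.name,
               -- ?e comes from whichever OPTIONAL bound it first
               COALESCE(w.email, h.email) AS e
        FROM person p
        LEFT JOIN work_email w ON w.pid = p.id
        -- the second OPTIONAL may only extend a solution when ?e is
        -- still unbound or already equal (SPARQL compatibility)
        LEFT JOIN home_email h
               ON h.pid = p.id
              AND (w.email IS NULL OR w.email = h.email)
        ORDER BY p.id
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in optional_rows():
        print(row)
```

In this toy schema the `NOT NULL` declarations on the `email` columns are what make `w.email IS NULL` a reliable "no match" test, which hints at why nullability constraints matter when simplifying such queries.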
Ontology-based end-user visual query formulation: Why, what, who, how, and which?
Value creation in an organisation is a time-sensitive and data-intensive process, yet it is often delayed and bounded by the reliance on IT experts extracting data for domain experts. Hence, there is a need for providing people who are not professional developers with the flexibility to pose relatively complex and ad hoc queries in an easy and intuitive way. In this respect, visual methods for query formulation undertake the challenge of making querying independent of users' technical skills and the knowledge of the underlying textual query language and the structure of data. An ontology is more promising than the logical schema of the underlying data for guiding users in formulating queries, since it provides a richer vocabulary closer to the users' understanding. However, on the one hand, most of the world's enterprise data today resides in relational databases rather than triple stores, and on the other, visual query formulation has become more compelling due to ever-increasing data size and complexity, known as Big Data. This article presents and argues for ontology-based visual query formulation for end-users; discusses its feasibility in terms of ontology-based data access, which virtualises legacy relational databases as RDF, and the dimensions of Big Data; presents key conceptual aspects and dimensions, challenges, and requirements; and reviews, categorises, and discusses notable approaches and systems.
Efficient Ontology-Based Data Integration with Canonical IRIs
In this paper, we study how to efficiently integrate multiple relational databases using an ontology-based approach. In ontology-based data integration (OBDI) an ontology provides a coherent view of multiple databases, and SPARQL queries over the ontology are rewritten into (federated) SQL queries over the underlying databases. Specifically, we address the scenario where records with different identifiers in different databases can represent the same entity. The standard approach in this case is to use sameAs to model the equivalence between entities. However, the standard semantics of sameAs may cause an exponential blow-up of query results, since all possible combinations of equivalent identifiers have to be included in the answers. The large number of answers is not only detrimental to the performance of query evaluation, but also makes the answers difficult to understand due to the redundancy they introduce. This motivates us to propose an alternative approach, which is based on assigning canonical IRIs to entities in order to avoid redundancy. Formally, we present our approach as a new SPARQL entailment regime and compare it with the sameAs approach. We provide a prototype implementation and evaluate it in two experiments: in a real-world data integration scenario in Statoil and in an experiment extending the Wisconsin benchmark. The experimental results show that the canonical IRI approach is significantly more scalable.
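The redundancy argument in this abstract can be seen on a toy example. The sketch below is only an illustration under stated assumptions, not the paper's entailment regime: the IRIs, the equivalence classes, the helper names, and the choice of the lexicographically smallest IRI as the canonical one are all hypothetical.

```python
from itertools import product

# Hypothetical data: one "well partOf field" fact, where the well is known
# under three identifiers and the field under two.
FACTS = [("db1:well1", "db1:field1")]
CLASSES = [
    ["db1:well1", "db2:well1", "db3:well1"],
    ["db1:field1", "db2:field1"],
]

def same_as_answers(facts, classes):
    """Answers to (?x partOf ?y) when every equivalent IRI must be returned."""
    members = {iri: cls for cls in classes for iri in cls}
    answers = set()
    for s, o in facts:
        # every combination of equivalent subject and object identifiers
        answers.update(product(members.get(s, [s]), members.get(o, [o])))
    return answers

def canonical_answers(facts, classes):
    """Answers when each entity is represented by one canonical IRI."""
    # assumption: the smallest IRI of each class serves as its canonical IRI
    canon = {iri: min(cls) for cls in classes for iri in cls}
    return {(canon.get(s, s), canon.get(o, o)) for s, o in facts}

if __name__ == "__main__":
    print(len(same_as_answers(FACTS, CLASSES)))    # 3 * 2 combinations
    print(canonical_answers(FACTS, CLASSES))
```

With k equivalent identifiers per entity and n answer variables, the first function returns up to k^n tuples for a single underlying fact, while the second returns one.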