Search CORE

36,905 research outputs found

Using Ontologies for Semantic Data Integration

Author: DE GIACOMO Giuseppe
Lembo Domenico
Lenzerini Maurizio
Poggi Antonella
Rosati Riccardo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

While big data analytics is considered as one of the most important paths to competitive advantage of today’s enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly crucial, and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques that are at the basis of the paradigm, and the main challenges that remain to be addressed

Archivio della ricerca- Università di Roma La Sapienza

Secure Querying of Recursive XML Views: A Standard XPath-based Technique

Author: Imine Abdessamad
Mahfoud Houari
Publication venue
Publication date: 12/12/2011
Field of study

Most state-of-the art approaches for securing XML documents allow users to access data only through authorized views defined by annotating an XML grammar (e.g. DTD) with a collection of XPath expressions. To prevent improper disclosure of confidential information, user queries posed on these views need to be rewritten into equivalent queries on the underlying documents. This rewriting enables us to avoid the overhead of view materialization and maintenance. A major concern here is that query rewriting for recursive XML views is still an open problem. To overcome this problem, some works have been proposed to translate XPath queries into non-standard ones, called Regular XPath queries. However, query rewriting under Regular XPath can be of exponential size as it relies on automaton model. Most importantly, Regular XPath remains a theoretical achievement. Indeed, it is not commonly used in practice as translation and evaluation tools are not available. In this paper, we show that query rewriting is always possible for recursive XML views using only the expressive power of the standard XPath. We investigate the extension of the downward class of XPath, composed only by child and descendant axes, with some axes and operators and we propose a general approach to rewrite queries under recursive XML views. Unlike Regular XPath-based works, we provide a rewriting algorithm which processes the query only over the annotated DTD grammar and which can run in linear time in the size of the query. An experimental evaluation demonstrates that our algorithm is efficient and scales well.Comment: (2011

arXiv.org e-Print Archive

CiteSeerX

HAL - Université de Franche-Comté

INRIA a CCSD electronic archive server

Ontology-based data access with databases: a short course

Author: A. Artale
A. Calì
A. Calì
A. Chortaras
A. Poggi
A. Razborov
B. Glimm
C. Lutz
C.E. Shannon
D. Calvanese
D. Calvanese
J. Dolby
J.W. Lloyd
M. Garey
M. Grigni
M. König
M. Ortiz
N. Alon
R. Raz
R. Rosati
S. Arora
S. Kikot
T. Eiter
U. Hustadt
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

Ontology-based data access (OBDA) is regarded as a key ingredient of the new generation of information systems. In the OBDA paradigm, an ontology defines a high-level global schema of (already existing) data sources and provides a vocabulary for user queries. An OBDA system rewrites such queries and ontologies into the vocabulary of the data sources and then delegates the actual query evaluation to a suitable query answering system such as a relational database management system or a datalog engine. In this chapter, we mainly focus on OBDA with the ontology language OWL 2QL, one of the three profiles of the W3C standard Web Ontology Language OWL 2, and relational databases, although other possible languages will also be discussed. We consider different types of conjunctive query rewriting and their succinctness, different architectures of OBDA systems, and give an overview of the OBDA system Ontop

CiteSeerX

Crossref

Birkbeck Institutional Research Online

On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark

Author: Amann Bernd
Baazizi Mohamed-Amine
Curé Olivier
Naacke Hubert
Publication venue
Publication date: 08/07/2015
Field of study

Querying very large RDF data sets in an efficient manner requires a sophisticated distribution strategy. Several innovative solutions have recently been proposed for optimizing data distribution with predefined query workloads. This paper presents an in-depth analysis and experimental comparison of five representative and complementary distribution approaches. For achieving fair experimental results, we are using Apache Spark as a common parallel computing framework by rewriting the concerned algorithms using the Spark API. Spark provides guarantees in terms of fault tolerance, high availability and scalability which are essential in such systems. Our different implementations aim to highlight the fundamental implementation-independent characteristics of each approach in terms of data preparation, load balancing, data replication and to some extent to query answering cost and performance. The presented measures are obtained by testing each system on one synthetic and one real-world data set over query workloads with differing characteristics and different partitioning constraints.Comment: 16 pages, 3 figure

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

The Bag Semantics of Ontology-Based Data Access

Author: Grau Bernardo Cuenca
Horrocks Ian
Kaminski Mark
Konstantinidis George
Kostylev Egor V.
Nikolaou Charalampos
Publication venue
Publication date: 01/01/2017
Field of study

Ontology-based data access (OBDA) is a popular approach for integrating and querying multiple data sources by means of a shared ontology. The ontology is linked to the sources using mappings, which assign views over the data to ontology predicates. Motivated by the need for OBDA systems supporting database-style aggregate queries, we propose a bag semantics for OBDA, where duplicate tuples in the views defined by the mappings are retained, as is the case in standard databases. We show that bag semantics makes conjunctive query answering in OBDA coNP-hard in data complexity. To regain tractability, we consider a rather general class of queries and show its rewritability to a generalisation of the relational calculus to bags

arXiv.org e-Print Archive

Crossref

Southampton (e-Prints Soton)

Oxford University Research Archive

XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme

Author: A. Balmin
A. Chebotko
D. Florescu
D. Kossmann
H. Gupta
H. Gupta
H.V. Jagadish
M. Atay
M.R. Garey
R. Chirkova
S. Abiteboul
S. Chaudhuri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

Query evaluation in an XML database requires reconstructing XML subtrees rooted at nodes found by an XML query. Since XML subtree reconstruction can be expensive, one approach to improve query response time is to use reconstruction views - materialized XML subtrees of an XML document, whose nodes are frequently accessed by XML queries. For this approach to be efficient, the principal requirement is a framework for view selection. In this work, we are the first to formalize and study the problem of XML reconstruction view selection. The input is a tree

T

, in which every node

i

has a size

c_i

and profit

p_i

, and the size limitation

C

. The target is to find a subset of subtrees rooted at nodes

i_1,\cdots, i_k

respectively such that

c_{i_1}+\cdots +c_{i_k}\le C

, and

p_{i_1}+\cdots +p_{i_k}

is maximal. Furthermore, there is no overlap between any two subtrees selected in the solution. We prove that this problem is NP-hard and present a fully polynomial-time approximation scheme (FPTAS) as a solution

arXiv.org e-Print Archive

Crossref