4,164 research outputs found

The Bag Semantics of Ontology-Based Data Access

Authors: Grau Bernardo Cuenca; Horrocks Ian; Kaminski Mark; Konstantinidis George; Kostylev Egor V.; Nikolaou Charalampos
Publication date: 01/01/2017

Ontology-based data access (OBDA) is a popular approach for integrating and querying multiple data sources by means of a shared ontology. The ontology is linked to the sources using mappings, which assign views over the data to ontology predicates. Motivated by the need for OBDA systems supporting database-style aggregate queries, we propose a bag semantics for OBDA, where duplicate tuples in the views defined by the mappings are retained, as is the case in standard databases. We show that bag semantics makes conjunctive query answering in OBDA coNP-hard in data complexity. To regain tractability, we consider a rather general class of queries and show its rewritability to a generalisation of the relational calculus to bags.

arXiv.org e-Print Archive

Crossref

Southampton (e-Prints Soton)

Oxford University Research Archive

Efficient querying of inconsistent databases with binary integer programming

Authors: Arenas M.; Barceló P.; Beeri C.; Bertossi L.; Binnig C.; Caniupán M.; Chomicki J.; Eiter T.; Elmagarmid A. K.; Flesca S.; Fuxman A.; Garey M. R.; Greco G.; Kolaitis P. G.; Leone N.; Nieuwenborgh D. V.; Wijsen J.
Publication venue: VLDB Endowment

Crossref

Certain Answers of Extensions of Conjunctive Queries by Datalog and First-Order Rewriting

Authors: Gheerbrant Amélie; Libkin Leonid; Rogova Alexandra; Sirangelo Cristina
Publication date: 05/09/2022

International audience

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Hal-Diderot

Get the Most out of Your Sample: Optimal Unbiased Estimators using Partial Information

Authors: Cohen Edith; Kaplan Haim
Publication date: 01/01/2011

Random sampling is an essential tool in the processing and transmission of data. It is used to summarize data too large to store or manipulate and to meet resource constraints on bandwidth or battery power. Estimators that are applied to the sample facilitate fast approximate processing of queries posed over the original data, and the value of the sample hinges on the quality of these estimators. Our work targets data sets such as request and traffic logs and sensor measurements, where data is repeatedly collected over multiple instances: time periods, locations, or snapshots. We are interested in queries that span multiple instances, such as distinct counts and distance measures over selected records. These queries are used for applications ranging from planning to anomaly and change detection. Unbiased low-variance estimators are particularly effective as the relative error decreases with the number of selected record keys. The Horvitz-Thompson estimator, known to minimize variance for sampling with "all or nothing" outcomes (which reveal either the exact value or no information on the estimated quantity), is not optimal for multi-instance operations for which an outcome may provide partial information. We present a general principled methodology for the derivation of (Pareto) optimal unbiased estimators over sampled instances and aim to understand its potential. We demonstrate significant improvement in estimate accuracy of fundamental queries for common sampling schemes. Comment: This is a full version of a PODS 2011 paper.

arXiv.org e-Print Archive

CiteSeerX