Search CORE

2,808 research outputs found

Matching Dependencies with Arbitrary Attribute Values: Semantics, Query Answering and Integrity Constraints

Author: Bertossi Leopoldo
Gardezi Jaffer
Kiringa Iluju
Publication venue
Publication date: 26/08/2010
Field of study

Matching dependencies (MDs) were introduced to specify the identification or matching of certain attribute values in pairs of database tuples when some similarity conditions are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the "pure case" of MDs, any value from the underlying data domain can be used for the value in common that does the matching. We investigate the semantics and properties of data cleaning through the enforcement of matching dependencies for the pure case. We characterize the intended clean instances and also the "clean answers" to queries as those that are invariant under the cleaning process. The complexity of computing clean instances and clean answers to queries is investigated. Tractable and intractable cases depending on the MDs and queries are identified. Finally, we establish connections with database "repairs" under integrity constraints.Comment: 13 pages, double column, 2 figure

arXiv.org e-Print Archive

CiteSeerX

Carleton University's Institutional Repository

Measuring discord among multidimensional data sources

Author: Abelló Gamazo Alberto
Cheney James
Publication venue: CEUR-WS.org
Publication date: 01/01/2022
Field of study

Data integration is a classical problem in databases, typically decomposed into schema matching, entity matching and record merging. To solve the latter, it is mostly assumed that ground truth can be determined, either as master data or from user feedback. However, in many cases, this is not the case because firstly the merging processes cannot be accurate enough, and also the data gathering processes in the different sources are simply imperfect and cannot provide high quality data. Instead of enforcing consistency, we propose to evaluate how concordant or discordant sources are as a measure of trustworthiness (the more discordant are the sources, the less we can trust their data). Thus, we define the discord measurement problem in which given a set of uncertain raw observations or aggregate results (such as case/hospitalization/death data relevant to COVID-19) and information on the alignment of different data (for example, cases and deaths), we wish to assess whether the different sources are concordant, or if not, measure how discordant they are.The work of Alberto Abelló has been done under project PID2020- 117191RB-I00 funded by MCIN/ AEI /10.13039/501100011033. The work of James Cheney was supported by ERC Consolidator Grant Skye (grant number 682315).Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

Inconsistency-tolerant Query Answering in Ontology-based Data Access

Author: LEMBO Domenico
LENZERINI Maurizio
ROSATI Riccardo
RUZZI MARCO
SAVO Domenico Fabio
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

Ontology-based data access (OBDA) is receiving great attention as a new paradigm for managing information systems through semantic technologies. According to this paradigm, a Description Logic ontology provides an abstract and formal representation of the domain of interest to the information system, and is used as a sophisticated schema for accessing the data and formulating queries over them. In this paper, we address the problem of dealing with inconsistencies in OBDA. Our general goal is both to study DL semantical frameworks that are inconsistency-tolerant, and to devise techniques for answering unions of conjunctive queries under such inconsistency-tolerant semantics. Our work is inspired by the approaches to consistent query answering in databases, which are based on the idea of living with inconsistencies in the database, but trying to obtain only consistent information during query answering, by relying on the notion of database repair. We first adapt the notion of database repair to our context, and show that, according to such a notion, inconsistency-tolerant query answering is intractable, even for very simple DLs. Therefore, we propose a different repair-based semantics, with the goal of reaching a good compromise between the expressive power of the semantics and the computational complexity of inconsistency-tolerant query answering. Indeed, we show that query answering under the new semantics is first-order rewritable in OBDA, even if the ontology is expressed in one of the most expressive members of the DL-Lite family

Archivio della ricerca- Università di Roma La Sapienza

Repairing Inconsistent XML Write-Access Control Policies

Author: Bravo Loreto
Cheney James
Fundulaki Irini
Publication venue
Publication date: 01/01/2007
Field of study

XML access control policies involving updates may contain security flaws, here called inconsistencies, in which a forbidden operation may be simulated by performing a sequence of allowed operations. This paper investigates the problem of deciding whether a policy is consistent, and if not, how its inconsistencies can be repaired. We consider policies expressed in terms of annotated DTDs defining which operations are allowed or denied for the XML trees that are instances of the DTD. We show that consistency is decidable in PTIME for such policies and that consistent partial policies can be extended to unique "least-privilege" consistent total policies. We also consider repair problems based on deleting privileges to restore consistency, show that finding minimal repairs is NP-complete, and give heuristics for finding repairs.Comment: 25 pages. To appear in Proceedings of DBPL 200

arXiv.org e-Print Archive

CiteSeerX