18 research outputs found

Data Integration: a Challenging ASP Application

Author: Bartosz Nowicki
Domenico Lembo
Edyta Kalka
Georg Gottlob
Gianluigi Greco
Giorgio Terracina
Giovambattista Ianni
Luigi Granata
Marco Ruzzi
Maurizio Lenzerini
Michael Fink
Nicola Leone
Riccardo Rosati
Thomas Eiter
Vincenzino Lio
Witold Staniszkis
Wolfgang Faber
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Crossref

Oxford University Research Archive

Verification of Query Completeness over Processes [Extended Version]

Author: Montali Marco
Nutt Werner
Razniewski Simon
Publication date: 07/06/2013

Data completeness is an essential aspect of data quality, and in turn has a huge impact on the effective management of companies. For example, statistics are computed and audits are conducted in companies under the strong implicit assumption that the analysed data are complete. In this work, we study the problem of completeness of data produced by business processes, with the aim of automatically assessing whether a given database query can be answered with complete information in a certain state of the process. We formalize so-called quality-aware processes that create data in the real world and store it in the company's information system, possibly at a later point. Comment: Extended version of a paper that was submitted to BPM 201

arXiv.org e-Print Archive

CiteSeerX

A unified view of data-intensive flows in business intelligence systems : a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied for addressing these challenges. Peer Reviewed. Postprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Coherent Integration of Databases by Abductive Logic Programming

Author: Arieli O.
Bruynooghe M.
Denecker M.
Van Nuffelen B.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2004

We introduce an abductive method for a coherent integration of independent data-sources. The idea is to compute a list of data-facts that should be inserted into the amalgamated database or retracted from it in order to restore its consistency. This method is implemented by an abductive solver, called Asystem, that applies SLDNFA-resolution on a meta-theory that relates different, possibly contradicting, input databases. We also give a pure model-theoretic analysis of the possible ways to 'recover' consistent data from an inconsistent database in terms of those models of the database that exhibit as little inconsistent information as reasonably possible. This allows us to characterize the 'recovered databases' in terms of the 'preferred' (i.e., most consistent) models of the theory. The outcome is an abductive-based application that is sound and complete with respect to a corresponding model-based, preferential semantics, and, to the best of our knowledge, is more expressive (thus more general) than any other implementation of coherent integration of databases.

arXiv.org e-Print Archive

Lirias

Crossref

Inconsistency tolerance in P2P data integration: An epistemic logic approach

Author: Adjiman
Alchourrón
Bravo
Chomicki
De Giacomo
Diego Calvanese
Domenico Lembo
Donini
Fagin
Fagin
Fuxman
Giuseppe De Giacomo
Greco
Gärdenfors
Halpern
Hintikka
Hughes
Levesque
Lifschitz
Lin
Maurizio Lenzerini
Reiter
Riccardo Rosati
Rosati
Stalnaker
Wijsen
Publication venue: 'Elsevier BV'

Crossref

Magic Sets for Disjunctive Datalog Programs

Author: Abiteboul
Arenas
Arenas
Baumgartner
Beeri
Behrend
Bertossi
Bertossi
Cadoli
Chomicki
Chomicki
Chomicki
Cumbo
Drescher
Eiter
Faber
Fuxman
Garey
Gebser
Gelfond
Gianluigi Greco
Goldman
Greco
Greco
Greco
Hustadt
Janhunen
Kemp
Lee
Leone
Leone
Leone
Liang
Lierler
Lin
Lobo
Manna
Manna
Mario Alviano
Motik
Nicola Leone
Ramakrishnan
Ricca
Ross
Seshadri
Simons
Stuckey
Ullman
Wolfgang Faber
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

In this paper, a new technique for the optimization of (partially) bound queries over disjunctive Datalog programs with stratified negation is presented. The technique exploits the propagation of query bindings and extends the Magic Set (MS) optimization technique. An important feature of disjunctive Datalog is nonmonotonicity, which calls for nondeterministic implementations, such as backtracking search. A distinguishing characteristic of the new method is that the optimization can also be exploited during the nondeterministic phase. In particular, after some assumptions have been made during the computation, parts of the program may become irrelevant to a query under these assumptions. This allows for dynamic pruning of the search space. In contrast, the effect of the previously defined MS methods for disjunctive Datalog is limited to the deterministic portion of the process. In this way, the potential performance gain by using the proposed method can be exponential, as could be observed empirically. The correctness of MS is established thanks to a strong relationship between MS and unfounded sets that has not been studied in the literature before. This knowledge allows for extending the method also to programs with stratified negation in a natural way. The proposed method has been implemented in DLV and various experiments have been conducted. Experimental results on synthetic data confirm the utility of MS for disjunctive Datalog, and they highlight the computational gain that may be obtained by the new method w.r.t. the previously proposed MS methods for disjunctive Datalog programs. Further experiments on real-world data show the benefits of MS within an application scenario that has received considerable attention in recent years, the problem of answering user queries over possibly inconsistent databases originating from integration of autonomous sources of information. Comment: 67 pages, 19 figures, preprint submitted to Artificial Intelligenc

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

University of Huddersfield Repository