Some issues in data model mapping

Author: Alsabbagh Jamal R.
Dominick Wayne D.

Numerous data models have been reported in the literature since the early 1970s. They have been used as database interfaces and as conceptual design tools. The mapping between schemas expressed according to the same data model or according to different models is of both theoretical and practical interest. This paper addresses some of the issues involved in such a mapping. Of special interest are the identification of the mapping parameters and some current approaches for handling the various situations that require a mapping.

NASA Technical Reports Server

Structure identification in relational data

Author: Dechter Rina
Pearl Judea
Publication venue: eScholarship, University of California
Publication date: 08/07/1992

This paper presents several investigations into the prospects for identifying meaningful structures in empirical data, namely, structures permitting effective organization of the data to meet requirements of future queries. We propose a general framework whereby the notion of identifiability is given a precise formal definition similar to that of learnability. Using this framework, we then explore whether a tractable procedure exists for deciding whether a given relation is decomposable into a constraint network or a CNF theory with desirable topology and, if the answer is positive, for identifying the desired decomposition. Finally, we address the problem of expressing a given relation as a Horn theory and, if this is impossible, finding the best k-Horn approximation to the given relation. We show that both problems can be solved in time polynomial in the length of the data.

eScholarship - University of California

Relational Algebra for In-Database Process Mining

Author: Dijkman Remco
Gao Juntao
Grefen Paul
ter Hofstede Arthur
Publication date: 26/06/2017

The execution logs used for process mining in practice are often obtained by querying an operational database and storing the result in a flat file. Consequently, the data processing power of the database system can no longer be used for this information, which constrains flexibility in the definition of mining patterns and limits execution performance when mining large logs. Enabling process mining directly on a database, instead of via intermediate storage in a flat file, therefore provides additional flexibility and efficiency. To help facilitate in-database process mining, this paper formally defines a database operator that extracts the 'directly follows' relation from an operational database. This operator can be used both to do in-database process mining and to flexibly evaluate process-mining-related queries, such as: "which employee most frequently changes the 'amount' attribute of a case from one task to the next?" We define the operator using the well-known relational algebra that forms the formal underpinning of relational databases. We formally prove equivalence properties of the operator that are useful for query optimization and present time-complexity properties of the operator. In doing so, this paper formally defines the relational algebraic elements of a 'directly follows' operator that are required to implement such an operator in a DBMS.

arXiv.org e-Print Archive

Pure OAI Repository
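
As a rough illustration of the 'directly follows' relation mentioned in the abstract above, the sketch below counts, for each case, how often one activity is immediately followed by another. It is not the paper's relational-algebra operator: the function name `directly_follows`, the `(case_id, activity, timestamp)` tuple layout, and the sample events are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count (a, b) pairs where activity b immediately follows a in a case.

    `events` is an iterable of (case_id, activity, timestamp) tuples.
    This layout is an assumption for illustration only.
    """
    # Group events by case so that ordering is only compared within a case.
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((timestamp, activity))

    counts = Counter()
    for trace in by_case.values():
        # Order each case by timestamp; ties keep their input order.
        trace.sort(key=lambda event: event[0])
        # Pair every event with its immediate successor in the same case.
        for (_, current), (_, following) in zip(trace, trace[1:]):
            counts[(current, following)] += 1
    return counts


# Hypothetical event log with two cases.
log = [
    ("c1", "register", 1),
    ("c1", "check", 2),
    ("c1", "pay", 3),
    ("c2", "register", 1),
    ("c2", "pay", 2),
]
relation = directly_follows(log)
```

In a database setting the same grouping and ordering would be expressed in the query language itself, which is the motivation the abstract gives for defining the operator in relational algebra.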

Reasoning about Independence in Probabilistic Models of Relational Data

Author: Jensen David
Maier Marc
Marazopoulou Katerina
Publication date: 06/01/2014

We extend the theory of d-separation to cases in which data instances are not independent and identically distributed. We show that applying the rules of d-separation directly to the structure of probabilistic models of relational data inaccurately infers conditional independence. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models. We provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models, and we present empirical results that demonstrate effectiveness. Comment: 61 pages, substantial revisions to formalisms, theory, and related work.

arXiv.org e-Print Archive

CiteSeerX

Datalog and Constraint Satisfaction with Infinite Templates

Author: Bodirsky Manuel
Dalmau Víctor
Publication date: 15/04/2012

On finite structures, there is a well-known connection between the expressive power of Datalog, finite variable logics, the existential pebble game, and bounded hypertree duality. We study this connection for infinite structures. This has applications for constraint satisfaction with infinite templates. If the template Gamma is omega-categorical, we present various equivalent characterizations of those Gamma such that the constraint satisfaction problem (CSP) for Gamma can be solved by a Datalog program. We also show that CSP(Gamma) can be solved in polynomial time for arbitrary omega-categorical structures Gamma if the input is restricted to instances of bounded treewidth. Finally, we characterize those omega-categorical templates whose CSP has Datalog width 1, and those whose CSP has strict Datalog width k. Comment: 28 pages. This is an extended long version of a conference paper that appeared at STACS'06. In the third version on arXiv we have revised the presentation again and added a section that relates our results to formalizations of CSPs using relation algebra.

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

UPF Digital Repository

HAL-Polytechnique