    Some issues in data model mapping

    Numerous data models have been reported in the literature since the early 1970s. They have been used as database interfaces and as conceptual design tools. Mapping between schemas, whether expressed in the same data model or in different models, is of both theoretical and practical interest. This paper addresses some of the issues involved in such a mapping. Of special interest are the identification of the mapping parameters and some current approaches for handling the various situations that require a mapping.

    Relational Algebra for In-Database Process Mining

    The execution logs that are used for process mining in practice are often obtained by querying an operational database and storing the result in a flat file. Consequently, the data processing power of the database system can no longer be applied to this information, which constrains flexibility in the definition of mining patterns and limits execution performance when mining large logs. Enabling process mining directly on a database, instead of via intermediate storage in a flat file, therefore provides additional flexibility and efficiency. To help facilitate this ideal of in-database process mining, this paper formally defines a database operator that extracts the 'directly follows' relation from an operational database. This operator can be used both for in-database process mining and to flexibly evaluate process-mining-related queries, such as: "which employee most frequently changes the 'amount' attribute of a case from one task to the next". We define the operator using the well-known relational algebra that forms the formal underpinning of relational databases. We formally prove equivalence properties of the operator that are useful for query optimization and present time-complexity properties of the operator. In doing so, this paper formally defines the relational algebraic elements of a 'directly follows' operator that are required to implement such an operator in a DBMS.
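To make the 'directly follows' relation concrete, here is a minimal in-memory sketch. It is not the paper's relational algebra operator: the event-log layout `(case_id, activity, timestamp)`, the sample rows, and the function name `directly_follows` are all assumptions chosen for illustration, and the paper's point is precisely that this computation should happen inside the DBMS rather than in application code like this.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Hypothetical event log rows: (case_id, activity, timestamp).
log = [
    ("c1", "register", 1),
    ("c1", "check", 2),
    ("c1", "pay", 3),
    ("c2", "register", 1),
    ("c2", "pay", 2),
]

def directly_follows(events):
    """Count pairs (a, b) where b is the next event after a within the same case."""
    counts = Counter()
    # Sort by case, then by timestamp, so each case's events are contiguous and ordered.
    ordered = sorted(events, key=itemgetter(0, 2))
    for _, group in groupby(ordered, key=itemgetter(0)):
        activities = [activity for _, activity, _ in group]
        # Pair each activity with its immediate successor in the case.
        counts.update(zip(activities, activities[1:]))
    return counts

df = directly_follows(log)
```

In this sketch, `df` maps each ordered activity pair to the number of cases in which the second activity immediately follows the first. A database-side version would typically express the same idea as a self-join on the case identifier with a "no event in between" condition.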

    Reasoning about Independence in Probabilistic Models of Relational Data

    We extend the theory of d-separation to cases in which data instances are not independent and identically distributed. We show that applying the rules of d-separation directly to the structure of probabilistic models of relational data inaccurately infers conditional independence. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models. We provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models, and we present empirical results that demonstrate effectiveness. Comment: 61 pages, substantial revisions to formalisms, theory, and related wor

    Datalog and Constraint Satisfaction with Infinite Templates

    On finite structures, there is a well-known connection between the expressive power of Datalog, finite variable logics, the existential pebble game, and bounded hypertree duality. We study this connection for infinite structures. This has applications for constraint satisfaction with infinite templates. If the template Gamma is omega-categorical, we present various equivalent characterizations of those Gamma such that the constraint satisfaction problem (CSP) for Gamma can be solved by a Datalog program. We also show that CSP(Gamma) can be solved in polynomial time for arbitrary omega-categorical structures Gamma if the input is restricted to instances of bounded treewidth. Finally, we characterize those omega-categorical templates whose CSP has Datalog width 1, and those whose CSP has strict Datalog width k. Comment: 28 pages. This is an extended long version of a conference paper that appeared at STACS'06. In the third arXiv version we have revised the presentation again and added a section that relates our results to formalizations of CSPs using relation algebra.