
    Explicit probabilistic models for databases and networks

    Recent work in data mining and related areas has highlighted the importance of statistically assessing data mining results. Crucial to this endeavour is the choice of a non-trivial null model for the data, against which the patterns found can be contrasted. The most influential null models proposed so far are defined in terms of invariants of the null distribution. Such null models can be used by computation-intensive randomization approaches to estimate the statistical significance of data mining results. Here, we introduce a methodology for constructing non-trivial probabilistic models based on the maximum entropy (MaxEnt) principle. We show how MaxEnt models allow for the natural incorporation of prior information. Furthermore, they satisfy a number of desirable properties of previously introduced randomization approaches. Lastly, they have the benefit that they can be represented explicitly. We argue that our approach can be used for a variety of data types. For concreteness, however, we demonstrate it for databases and networks. Comment: Submitte
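
    As a rough illustration of what an explicitly representable MaxEnt model can look like, the sketch below assumes a binary database whose prior information consists of expected row and column sums. That choice of constraints is an assumption made for this example; the abstract does not say which constraints are used. Under it, the MaxEnt distribution factorizes into independent cells with P(cell (i, j) = 1) = 1 / (1 + exp(-(lam_i + mu_j))). The function name `fit_maxent_margins` and the plain gradient-ascent fit are hypothetical choices for the sketch, not the authors' algorithm.

```python
import math


def cell_probs(lam, mu):
    """P(cell (i, j) = 1) for every cell, given row and column parameters."""
    return [[1.0 / (1.0 + math.exp(-(l + m))) for m in mu] for l in lam]


def fit_maxent_margins(row_sums, col_sums, steps=5000, lr=0.1):
    """Fit lam and mu so that expected row/column sums match the targets.

    Assumption: every target lies strictly between 0 and the row/column
    length and the targets are jointly feasible; otherwise the parameters
    diverge instead of converging.
    """
    lam = [0.0] * len(row_sums)
    mu = [0.0] * len(col_sums)
    for _ in range(steps):
        probs = cell_probs(lam, mu)
        # Gradient of the concave dual: target margin minus expected margin.
        for i, target in enumerate(row_sums):
            lam[i] += lr * (target - sum(probs[i]))
        for j, target in enumerate(col_sums):
            mu[j] += lr * (target - sum(row[j] for row in probs))
    return cell_probs(lam, mu)


# Toy 3x3 binary database with row sums (2, 1, 1) and column sums (2, 1, 1).
P = fit_maxent_margins([2, 1, 1], [2, 1, 1])
```

    The returned matrix `P` is the explicit model: the probability of any candidate database is a product of per-cell terms, so no randomization run is needed to evaluate it.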

    A cookbook for temporal conceptual data modelling with description logic

    We design temporal description logics suitable for reasoning about temporal conceptual data models and investigate their computational complexity. Our formalisms are based on DL-Lite logics with three types of concept inclusions (ranging from atomic concept inclusions and disjointness to the full Booleans), as well as cardinality constraints and role inclusions. In the temporal dimension, they capture future and past temporal operators on concepts, flexible and rigid roles, the operators 'always' and 'some time' on roles, data assertions for particular moments of time and global concept inclusions. The logics are interpreted over the Cartesian products of object domains and the flow of time (Z,<), satisfying the constant domain assumption. We prove that the most expressive of our temporal description logics (which can capture lifespan cardinalities and either qualitative or quantitative evolution constraints) are undecidable. However, by omitting some of the temporal operators on concepts/roles or by restricting the form of concept inclusions, we obtain logics whose complexity ranges between PSpace and NLogSpace. These positive results were obtained by reduction to various clausal fragments of propositional temporal logic, which opens a way to employ propositional or first-order temporal provers for reasoning about temporal data models.

    Prioritized Repairing and Consistent Query Answering in Relational Databases

    A consistent query answer in an inconsistent database is an answer obtained in every (minimal) repair. The repairs are obtained by resolving all conflicts in all possible ways. Often, however, the user is able to provide a preference on how conflicts should be resolved. We investigate here the framework of preferred consistent query answers, in which user preferences are used to narrow down the set of repairs to a set of preferred repairs. We axiomatize desirable properties of preferred repairs. We present three different families of preferred repairs and study their mutual relationships. Finally, we investigate the complexity of preferred repairing and computing preferred consistent query answers. Comment: Accepted to the special SUM'08 issue of AMA
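
    The definition of a consistent query answer can be made concrete with a small sketch. It assumes the only integrity constraint is a key constraint, so a repair keeps exactly one tuple from each group of tuples that share a key. The names `repairs`, `consistent_answers`, the `EMP` relation and the `is_preferred` filter are all hypothetical; in particular, the filter is a simple stand-in for user preferences and is not one of the three families of preferred repairs the abstract mentions.

```python
from itertools import product


def repairs(rows, key):
    """Yield every repair: one tuple chosen from each group sharing a key."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    for choice in product(*groups.values()):
        yield list(choice)


def consistent_answers(rows, key, query, is_preferred=lambda repair: True):
    """Return the answers that `query` gives in every (preferred) repair."""
    result = None
    for repair in repairs(rows, key):
        if not is_preferred(repair):
            continue  # the preference narrows the set of repairs considered
        answers = set(query(repair))
        result = answers if result is None else result & answers
    return result if result is not None else set()


# Hypothetical relation EMP(name, dept) with key `name`; alice is in conflict.
EMP = [("alice", "sales"), ("alice", "hr"), ("bob", "it")]
by_name = lambda row: row[0]

names = consistent_answers(EMP, by_name, lambda r: {n for n, _ in r})
tuples = consistent_answers(EMP, by_name, lambda r: set(r))
preferred = consistent_answers(
    EMP, by_name, lambda r: set(r),
    is_preferred=lambda r: ("alice", "sales") in r,
)
```

    Here `names` contains both employees, because each appears in every repair, while `tuples` keeps only bob's row, since alice's department differs between repairs. With the preference applied, only the repair that places alice in sales remains, so her row becomes a consistent answer as well.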