    A Uniform Dependency Language for Improving Data Quality

    Identifying statistical dependence in genomic sequences via mutual information estimates

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, we identify correlations between different parts of the maize zmSRp32 gene, where we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's Combined DNA Index System (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling.
    Comment: Preliminary version. Final version in EURASIP Journal on Bioinformatics and Systems Biology. See http://www.hindawi.com/journals/bsb
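
To make the idea in the abstract above concrete, here is a minimal sketch of a plug-in (empirical frequency) estimate of mutual information between two aligned symbol sequences, followed by a threshold test. This is not the paper's method: the function names `mutual_information` and `is_dependent` and the `threshold` parameter are hypothetical, and the abstract does not specify the form of the paper's threshold function.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from two aligned sequences.

    Illustrative only: probabilities are raw empirical frequencies,
    with no bias correction for short sequences.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must be aligned and of equal length")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)            # marginal counts of X
    count_y = Counter(ys)            # marginal counts of Y
    count_xy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def is_dependent(xs, ys, threshold):
    """Hypothetical significance test: flag the pair if MI exceeds threshold."""
    return mutual_information(xs, ys) > threshold
```

Two identical sequences over four equally frequent nucleotides give 2 bits, the maximum for a four-letter alphabet, while sequences whose symbol pairs are uniformly mixed give 0. Plug-in estimates are biased upward on short sequences, so in practice a threshold would need to account for sequence length.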

    Reasoning about Record Matching Rules

    Guided Interaction Exploration in Artifact-centric Process Models

    Artifact-centric process models aim to describe complex processes as a collection of interacting artifacts. Recent developments in process mining allow for the discovery of such models. However, the focus is often on the representation of the individual artifacts rather than their interactions. Based on event data, we can automatically discover composite state machines representing artifact-centric processes. Moreover, we provide ways of visualizing and quantifying interactions among different artifacts. For example, we are able to highlight strongly correlated behaviours in different artifacts. The approach has been fully implemented as a ProM plug-in; the CSM Miner provides an interactive artifact-centric process discovery tool focussing on interactions. The approach has been evaluated using real-life data sets, including the personal loan and overdraft process of a Dutch financial institution.
    Comment: 10 pages, 4 figures, to be published in proceedings of the 19th IEEE Conference on Business Informatics, CBI 201

    Domino: exploring mobile collaborative software adaptation

    Social Proximity Applications (SPAs) are a promising new area for ubicomp software that exploits the everyday changes in the proximity of mobile users. While a number of applications facilitate simple file sharing between co-present users, this paper explores opportunities for recommending and sharing software between users. We describe an architecture that allows the recommendation of new system components from systems with similar histories of use. Software components and usage histories are exchanged between mobile users who are in proximity with each other. We apply this architecture in a mobile strategy game in which players adapt and upgrade their game using components from other players, progressing through the game by sharing tools and history. More broadly, we discuss the general application of this technique as well as the security and privacy challenges to such an approach.