78 research outputs found

    Genetic algorithm learning as a robust approach to RNA editing site prediction

    BACKGROUND: RNA editing is one of several post-transcriptional modifications that may contribute to organismal complexity in the face of limited gene complement in a genome. One form, known as C → U editing, appears to exist in a wide range of organisms, but most instances of this form of RNA editing have been discovered serendipitously. With the large amount of genomic and transcriptomic data now available, a computational analysis could provide a more rapid means of identifying novel sites of C → U RNA editing. Previous efforts have had some success but also some limitations. We present a computational method for identifying C → U RNA editing sites in genomic sequences that is both robust and generalizable. We evaluate its potential use on the best data set available for these purposes: C → U editing sites in plant mitochondrial genomes. RESULTS: Our method is derived from a machine learning approach known as a genetic algorithm. REGAL (RNA Editing site prediction by Genetic Algorithm Learning) is 87% accurate when tested on three mitochondrial genomes, with an overall sensitivity of 82% and an overall specificity of 91%. REGAL's performance significantly improves on other ab initio approaches to predicting RNA editing sites in this data set. REGAL's sensitivity is comparable to, and its specificity higher than, that of approaches that rely on sequence homology, and it has the advantage that strong sequence conservation is not required for reliable prediction of edit sites. CONCLUSION: Our results suggest that ab initio methods can generate robust classifiers of putative edit sites, and we highlight the value of combinatorial approaches as embodied by genetic algorithms. We present REGAL as one approach with the potential to be generalized to other organisms exhibiting C → U RNA editing.
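The accuracy, sensitivity, and specificity figures quoted in the abstract above are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are computed; the function name `classification_metrics` and the example counts are hypothetical, chosen for illustration, and are not taken from the REGAL paper or its data.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp: edited sites predicted as edited
    fn: edited sites predicted as unedited
    tn: unedited sites predicted as unedited
    fp: unedited sites predicted as edited
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true edit sites recovered
        "specificity": tn / (tn + fp),  # fraction of non-edited sites rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=82, fp=9, tn=91, fn=18)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy depends on the balance between edited and unedited sites in the test set, so it cannot in general be derived from sensitivity and specificity alone.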

    The ethics of uncertainty for data subjects

    Modern health data practices come with many practical uncertainties. In this paper, I argue that data subjects’ trust in the institutions and organizations that control their data, and their ability to know their own moral obligations in relation to their data, are undermined by significant uncertainties regarding the what, how, and who of mass data collection and analysis. I conclude by considering how proposals for managing situations of high uncertainty might be applied to this problem. These proposals emphasize increasing organizational flexibility, knowledge, and capacity, and reducing hazard.

    The polarity coincidence correlator: Significance testing and other issues


    Models for strawberry inflorescence data

    The flowers of strawberry plants grow on very variable branched structures called inflorescences, in which each branch gives rise to 0, 1, or 2 offspring branches. We extend previous modeling of the number of strawberry flowers at each individual level in the inflorescence structure conditional on the number of strawberry flowers at the previous level. We consider a range of logistic regression models, including models that incorporate inflorescence effects and random effects. The models can be used to summarize the overall structure of any particular variety and to indicate the main differences between varieties. For the data analyzed in the article, we show that models based on convolutions of correlated Bernoulli random variables outperform binomial regression models.
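To illustrate how a sum of correlated Bernoulli variables differs from a binomial count, the sketch below gives the distribution of the number of offspring (0, 1, or 2) when a branch has two potential offspring positions, each filled with probability `p`, and the two indicators have correlation `rho`. This is a textbook construction under those stated assumptions, not the article's fitted model; the function name `offspring_pmf` is hypothetical.

```python
def offspring_pmf(p: float, rho: float) -> tuple:
    """P(0), P(1), P(2) offspring for two exchangeable Bernoulli(p) indicators.

    The indicators have correlation rho, so their covariance is
    rho * p * (1 - p). Setting rho = 0 recovers the Binomial(2, p) pmf.
    """
    cov = rho * p * (1.0 - p)
    p2 = p * p + cov                   # both positions filled
    p1 = 2.0 * (p * (1.0 - p) - cov)   # exactly one position filled
    p0 = (1.0 - p) ** 2 + cov          # neither position filled
    if min(p0, p1, p2) < 0.0:
        raise ValueError("rho is not attainable for this value of p")
    return (p0, p1, p2)


# Independent case versus a positively correlated case (illustrative values).
print(offspring_pmf(0.5, 0.0))
print(offspring_pmf(0.5, 0.4))
```

Positive correlation moves probability mass away from exactly one offspring and toward 0 or 2, a pattern a binomial model with the same mean cannot reproduce.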