    Correlation structures in applied probability

    This thesis examines consequences of correlation structure in three areas of applied probability: mathematical population genetics, birth processes, and "exchangeable" measures on distributive lattices. The first three chapters concern probabilistic models in genetics. We first generalize the Moran model to allow more than one individual to reproduce per generation and investigate the effect of this on the behaviour of the model. The conclusion is that while things apparently happen faster, the basic properties are the same. This model also serves to unify conventional neutral theory, as it links the Moran model to the Wright-Fisher model. We then consider aspects of the neutral theory. Commonly a neutral model is supposed in which successive generations behave independently. This may well be unrealistic. Here we take the Moran model and adapt it to allow for correlations in offspring numbers between generations. An analysis of the model shows that the conditional distribution of allele frequencies is unchanged, although the expected number of alleles present decreases. Similar results are also obtained when correlation is introduced to the more general model with more than one reproducer per generation. In each case the approach involves a detailed study of the genealogy of the models. Next we consider the effect of correlation in Markov birth processes. We show that if the birth rates form a super(sub)-linear sequence then the sizes of its families are positively (negatively) correlated. From this we prove a conjecture of Faddy, which says that if the birth rates of a process X(t) are super(sub)-linear then the variance ratio V(t), defined as Var X(t) / (E X(t) [E X(t)/X(0) - 1]), is greater than (less than) 1. Finally we study correlation inequalities. The FKG inequality is a well-known result giving sufficient conditions for positive correlations in probability measures on distributive lattices. There are few analogous results concerning negative correlation. We give sufficient conditions for a particular form of negative correlation when the underlying distributions possess a certain exchangeability property.
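The variance ratio V(t) quoted in the abstract can be explored numerically. The following is a minimal sketch, not the thesis's method: it assumes a pure birth process whose state-n birth rate is supplied as a function, and integrates the forward equations with explicit Euler steps on a truncated state space. The function names, the truncation level `n_max`, and the step count are all illustrative assumptions.

```python
from typing import Callable, Tuple


def birth_process_moments(rate: Callable[[int], float], x0: int, t: float,
                          n_max: int = 200, steps: int = 2000) -> Tuple[float, float]:
    """Approximate the mean and variance of X(t) for a pure birth process.

    Integrates dp_n/dt = rate(n-1) * p_{n-1} - rate(n) * p_n with explicit
    Euler steps on the truncated state space x0..n_max (x0 must be >= 1).
    Probability mass that would leave n_max is simply lost, so n_max has to
    be large enough for that mass to be negligible.
    """
    lam = [rate(n) for n in range(n_max + 1)]
    p = [0.0] * (n_max + 1)
    p[x0] = 1.0
    dt = t / steps
    for _ in range(steps):
        # Sweep downwards so p[n - 1] still holds the previous step's value.
        for n in range(n_max, x0 - 1, -1):
            p[n] += dt * (lam[n - 1] * p[n - 1] - lam[n] * p[n])
    mean = sum(n * p[n] for n in range(x0, n_max + 1))
    second = sum(n * n * p[n] for n in range(x0, n_max + 1))
    return mean, second - mean * mean


def variance_ratio(rate: Callable[[int], float], x0: int, t: float) -> float:
    """V(t) = Var X(t) / (E X(t) * (E X(t) / X(0) - 1)), as in the abstract."""
    mean, var = birth_process_moments(rate, x0, t)
    return var / (mean * (mean / x0 - 1.0))


if __name__ == "__main__":
    # Linear rates (a Yule process) and a constant rate (a Poisson process).
    print(variance_ratio(lambda n: float(n), x0=5, t=0.5))
    print(variance_ratio(lambda n: 3.0, x0=5, t=0.5))
```

With linear rates the ratio should come out close to 1, which is the boundary case of the conjecture. With a constant rate c the exact value is X(0) / (X(0) + ct), which is below 1, so the constant-rate run above should land near 5/6.5. Rates that grow much faster than linearly can push mass past `n_max` (or explode in finite time), so this sketch is not reliable for them without a larger truncation and a shorter time horizon.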

    Fast "coalescent" simulation

    BACKGROUND: The amount of genome-wide molecular data is increasing rapidly, as is interest in developing methods appropriate for such data. There is a consequent increasing need for methods that are able to efficiently simulate such data. In this paper we implement the sequentially Markovian coalescent algorithm described by McVean and Cardin and present a further modification to that algorithm which slightly improves the closeness of the approximation to the full coalescent model. The algorithm ignores a class of recombination events known to affect the behavior of the genealogy of the sample, but which do not appear to affect the behavior of generated samples to any substantial degree. RESULTS: We show that our software is able to simulate large chromosomal regions, such as those appropriate in a consideration of genome-wide data, in a way that is several orders of magnitude faster than existing coalescent algorithms. CONCLUSION: This algorithm provides a useful resource for those needing to simulate large quantities of data for chromosomal-length regions using an approach that is much more efficient than traditional coalescent models.

    Threshold Response to Stochasticity in Morphogenesis

    During development of biological organisms, multiple complex structures are formed. In many instances, these structures need to exhibit a high degree of order to be functional, although many of their constituents are intrinsically stochastic. Hence, it has been suggested that biological robustness ultimately must rely on complex gene regulatory networks and clean-up mechanisms. Here we explore developmental processes that have evolved inherent robustness against stochasticity. In the context of the Drosophila eye disc, multiple optical units, ommatidia, develop into crystal-like patterns. During the larva-to-pupa stage of metamorphosis, the centers of the ommatidia are specified initially through the diffusion of morphogens, followed by the specification of R8 cells. Establishing the R8 cell is crucial in setting up the geometric, and functional, relationships of cells within an ommatidium and among neighboring ommatidia. Here we study a mathematical model of these spatio-temporal processes in the presence of stochasticity, defining and applying measures that quantify order within the resulting spatial patterns. We observe a universal sigmoidal response to increasing transcriptional noise. Ordered patterns persist up to a threshold noise level in the model parameters. As the noise is increased past this threshold, a point of no return, the ordered patterns rapidly become disordered. Such robustness in development allows for the accumulation of genetic variation without any observable changes in phenotype. We argue that the observed sigmoidal dependence introduces robustness, allowing sizable amounts of genetic variation and transcriptional noise to be tolerated in natural populations without resulting in phenotype variation.

    Modeling measurement error in tumor characterization studies

    Background: Etiologic studies of cancer increasingly use molecular features such as gene expression, DNA methylation and sequence mutation to subclassify the cancer type. In large population-based studies, the tumor tissues available for study are archival specimens that provide variable amounts of amplifiable DNA for molecular analysis. As molecular features measured from small amounts of tumor DNA are inherently noisy, we propose a novel approach to improve statistical efficiency when comparing groups of samples. We illustrate the phenomenon using the MethyLight technology, applying our proposed analysis to compare MLH1 DNA methylation levels in males and females studied in the Colon Cancer Family Registry. Results: We introduce two methods for computing empirical weights to model heteroscedasticity that is caused by sampling variable quantities of DNA for molecular analysis. In a simulation study, we show that using these weights in a linear regression model is more powerful for identifying differentially methylated loci than standard regression analysis. The increase in power depends on the underlying relationship between variation in outcome measure and input DNA quantity in the study samples. Conclusions: Tumor characteristics measured from small amounts of tumor DNA are inherently noisy. We propose a statistical analysis that accounts for the measurement error due to sampling variation of the molecular feature and show how it can improve the power to detect differential characteristics between patient groups.
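As a rough illustration of how per-sample weights enter a linear regression, the sketch below fits a weighted least-squares line in plain Python. It does not reproduce either of the paper's two empirical weighting methods, which the abstract does not describe. The function name `weighted_linear_fit` and the idea of passing in weights that grow with input DNA quantity are assumptions made for the example.

```python
from typing import Sequence, Tuple


def weighted_linear_fit(x: Sequence[float], y: Sequence[float],
                        w: Sequence[float]) -> Tuple[float, float]:
    """Weighted least squares for y = intercept + slope * x.

    Minimises sum_i w[i] * (y[i] - intercept - slope * x[i]) ** 2.
    Larger weights mark observations assumed to be less noisy
    (hypothetically, samples with more input DNA).
    """
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y and w must have the same length (at least 2)")
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if s_xx == 0:
        raise ValueError("x has no weighted variation; slope is undefined")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Made-up toy data: x is a 0/1 group indicator, y a methylation-like score.
    group = [0.0, 0.0, 1.0, 1.0]
    score = [1.0, 3.0, 5.0, 9.0]
    weight = [1.0, 3.0, 1.0, 1.0]
    print(weighted_linear_fit(group, score, weight))
```

When x is a 0/1 group indicator, the fitted slope is the difference between the two groups' weighted means, so down-weighting noisy low-DNA samples directly changes the estimated group difference.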

    Copy number variation in the Framingham Heart Study

    In this paper we test for association between copy number variation and diabetes in a subset of individuals from the Framingham Heart Study. We used the 500k SNP data and called copy number variation using two algorithms: the genome alteration detection algorithm of Pique-Regi et al. and the software Golden Helix. We then tested for association between copy number and diabetes using a gene-based analysis. Our results show little evidence of association between copy number and diabetes status. Furthermore, our results indicate a relatively poor level of agreement between copy number calls resulting from the two programs. We then examined potential causes for this difference in results and the implications for future studies.