
    Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization

    We consider a family of models describing the evolution under selection of a population whose dynamics can be related to the propagation of noisy traveling waves. For one particular model, which we shall call the exponential model, the properties of the traveling wave front can be calculated exactly, as can the statistics of the genealogy of the population. One striking result is that, for this particular model, the genealogical trees have the same statistics as the trees of replicas in the Parisi mean-field theory of spin glasses. We also find that in the exponential model the coalescence times along these trees grow like the logarithm of the population size. A phenomenological picture of the propagation of wave fronts that we introduced in a previous work, as well as our numerical data, suggests that these statistics remain valid for a larger class of models, while the coalescence times grow like the cube of the logarithm of the population size.
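
    As a rough illustration of the kind of model discussed above, the sketch below simulates a simplified branching-selection model (each individual produces a few offspring displaced by exponential jumps, and only the N rightmost survive) while recording parent indices, then measures the mean pairwise coalescence time of the final population. The offspring rule, parameter values, and population sizes are illustrative assumptions, not the paper's exact exponential model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ancestry(N, k_offspring=3, generations=2000):
    """Toy branching-selection model: each of N individuals produces
    k_offspring offspring displaced by Exp(1) jumps; the N rightmost
    offspring survive.  Parent indices are recorded so that pairwise
    coalescence times can be measured on the final population."""
    x = np.zeros(N)                       # positions (log-fitness)
    ancestry = []                         # parent index of each survivor, per generation
    for _ in range(generations):
        parents = np.repeat(np.arange(N), k_offspring)
        offspring = x[parents] + rng.exponential(1.0, size=N * k_offspring)
        keep = np.argsort(offspring)[-N:]             # selection: keep the N rightmost
        x = offspring[keep]
        ancestry.append(parents[keep])
    return ancestry

def mean_pairwise_coalescence(ancestry, n_pairs=200):
    """Follow random pairs of lineages backwards in time until they merge."""
    N = len(ancestry[-1])
    rng_local = np.random.default_rng(1)
    times = []
    for _ in range(n_pairs):
        i, j = rng_local.choice(N, size=2, replace=False)
        t = 0
        for parents in reversed(ancestry):
            if i == j:
                break
            i, j = parents[i], parents[j]
            t += 1
        times.append(t)
    return np.mean(times)

for N in (50, 100, 200, 400):
    anc = simulate_ancestry(N)
    print(N, mean_pairwise_coalescence(anc))
```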

    The Time Machine: A Simulation Approach for Stochastic Trees

    In this paper we consider a simulation technique for stochastic trees. One of the most important tasks in computational genetics is the calculation, and subsequent maximization, of the likelihood function associated with such models, which is typically carried out using importance sampling (IS) and sequential Monte Carlo (SMC) techniques. The approach proceeds by simulating the tree backward in time, from the observed data to a most recent common ancestor (MRCA). In many cases, however, the computational time and the variance of the estimators are too high for standard approaches to be useful. In this paper we propose to stop the simulation before the MRCA is reached, which yields biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view, and results from simulation studies are given to examine the balance between loss of accuracy, savings in computing time, and variance reduction.
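
    A minimal sketch of the general idea of stopping a backward-in-time simulation early (not the paper's estimator): here the "likelihood" is the probability of observing a given number of mutations on a Kingman coalescent tree, estimated by Monte Carlo either by simulating all the way to the MRCA or by stopping once a chosen number of lineages remain and replacing the untracked branch length with its expectation. The coalescent rates and Poisson mutation model are standard; the stopping rule and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

def likelihood_estimate(n, s_obs, theta, n_sims=5000, stop_at=1):
    """Monte Carlo estimate of P(S = s_obs), where S ~ Poisson(theta/2 * L)
    and L is the total branch length of a Kingman coalescent tree on n
    lineages.  If stop_at > 1, the simulation is stopped once stop_at
    lineages remain and the remaining branch length is replaced by its
    expectation, giving a cheaper but biased estimate."""
    estimates = np.empty(n_sims)
    for m in range(n_sims):
        length = 0.0
        for j in range(n, stop_at, -1):
            t_j = rng.exponential(2.0 / (j * (j - 1)))    # time while j lineages remain
            length += j * t_j
        # expected length contributed by the untracked part of the tree
        length += sum(j * 2.0 / (j * (j - 1)) for j in range(stop_at, 1, -1))
        estimates[m] = poisson.pmf(s_obs, theta / 2.0 * length)
    return estimates.mean(), estimates.std() / np.sqrt(n_sims)

# full simulation vs. stopping the backward simulation when 5 lineages remain
print(likelihood_estimate(n=20, s_obs=12, theta=2.0, stop_at=1))
print(likelihood_estimate(n=20, s_obs=12, theta=2.0, stop_at=5))
```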

    Evolution of the most recent common ancestor of a population with no selection

    We consider the evolution of a population of fixed size with no selection. The number of generations $G$ to reach the first common ancestor evolves in time. This evolution can be described by a simple Markov process which allows one to calculate several characteristics of the time dependence of $G$. We also study how $G$ is correlated to the genetic diversity.
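
    A small sketch of the quantity being tracked, under the usual neutral Wright-Fisher model with fixed population size (an assumption here; the paper's model may differ in detail): each generation every individual picks a parent uniformly at random, and G is the number of generations one must go back from the current generation before all lineages have merged into a single common ancestor. Re-evaluating G at successive time points shows how it evolves as the population advances.

```python
import numpy as np

rng = np.random.default_rng(0)

def generations_to_mrca(history):
    """history[t] gives the parent index of each individual at generation t.
    Returns the number of generations G back from the last generation at
    which all current lineages first share a single common ancestor."""
    lineages = np.arange(len(history[-1]))
    for g, parents in enumerate(reversed(history), start=1):
        lineages = np.unique(parents[lineages])
        if len(lineages) == 1:
            return g
    return None                       # MRCA older than the recorded history

N, T = 100, 2000
history = [rng.integers(0, N, size=N) for _ in range(T)]

# G re-evaluated at successive time points along the simulated history
for t in range(500, T + 1, 250):
    print(t, generations_to_mrca(history[:t]))
```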

    The cost of reducing starting RNA quantity for Illumina BeadArrays: a bead-level dilution experiment.

    BACKGROUND: The quantities of RNA demanded by microarray expression technologies place a limit on the questions they can address, and as a consequence the RNA requirements have been reduced over time as technologies have improved. In this paper we investigate the costs of reducing the starting quantity of RNA for the Illumina BeadArray platform. We do this via a dilution data set generated from two reference RNA sources that have become the standard for investigations into microarray and sequencing technologies. RESULTS: We find that the starting quantity of RNA has an effect on observed intensities despite the fact that the quantity of cRNA being hybridized remains constant. We see a loss of sensitivity when using lower quantities of RNA, but no great rise in the false positive rate. Even with 10 ng of starting RNA, the positive results are reliable, although many differentially expressed genes are missed. There is some scope for combining data from samples that have contributed differing quantities of RNA, but sample sizes should increase to compensate for the loss of signal-to-noise when using low quantities of starting RNA. CONCLUSIONS: The BeadArray platform maintains a low false discovery rate even when small amounts of starting RNA are used. In contrast, the sensitivity of the platform drops off noticeably over the same range. Thus, those conducting experiments should not opt for low quantities of starting RNA without considering the costs of doing so. The implications for experimental design, and for the integration of data from different starting quantities, are complex.

    Spike-in validation of an Illumina-specific variance-stabilizing transformation

    BACKGROUND: Variance-stabilizing techniques have been used for some time in the analysis of gene expression microarray data. A new adaptation, the variance-stabilizing transformation (VST), has recently been developed to take advantage of the unique features of Illumina BeadArrays. VST has been shown to perform well in comparison with the widely used approach of taking a log2 transformation, but has not been validated on a spike-in experiment. We apply VST to the data from a recently published spike-in experiment and compare it both to a regular log2 analysis and to a recently recommended analysis that can be applied when all raw data are available. FINDINGS: VST provides more power to detect differentially expressed genes than a log2 transformation. However, the gain in power is roughly the same as that obtained by using the raw data from an experiment and weighting observations accordingly. VST is still advantageous when large changes in expression are anticipated, while a weighted log2 approach performs better for smaller changes. CONCLUSION: VST can be recommended for summarized Illumina data regardless of which Illumina pre-processing options have been used. However, using the raw data is still encouraged whenever possible.
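
    To illustrate what a variance-stabilizing transformation does in this setting, the sketch below compares how the spread of simulated intensities depends on the mean under log2 versus a generic generalized-log (glog) transform. The glog is a stand-in, not the Illumina-specific VST evaluated in the paper, and the additive-plus-multiplicative noise model and its parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def glog(x, c):
    """Generalized-log transform: behaves like log2 for large x but stays
    finite and roughly linear near zero, which stabilizes the variance
    contributed by additive background noise."""
    return np.log2(x + np.sqrt(x ** 2 + c ** 2))

# additive + multiplicative error model often used for array intensities (assumed)
true_means = np.array([50, 200, 1000, 5000, 20000], dtype=float)
c_additive = 100.0        # scale of additive background noise
sd_mult = 0.1             # multiplicative (proportional) noise

for mu in true_means:
    x = mu * np.exp(rng.normal(0, sd_mult, 10000)) + rng.normal(0, c_additive, 10000)
    x = np.clip(x, 1e-3, None)              # avoid log of non-positive values
    print(f"mean {mu:>7.0f}   sd(log2) = {np.log2(x).std():.3f}"
          f"   sd(glog) = {glog(x, 2 * c_additive).std():.3f}")
```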

    Numbers of mutations to different types of colorectal cancer

    BACKGROUND: The numbers of oncogenic mutations required for transformation are uncertain but may be inferred from how cancer frequencies increase with aging: cancers requiring more mutations will tend to appear later in life. This type of approach may be confounded by biologic heterogeneity, because different cancer subtypes may require different numbers of mutations. For example, a sporadic cancer should require at least one more somatic mutation than its hereditary counterpart. METHODS: To better estimate the numbers of mutations before transformation, 1,022 colorectal cancers were classified with respect to microsatellite instability (MSI) and germline DNA mismatch repair mutations characteristic of hereditary nonpolyposis colorectal cancer (HNPCC). MSI- cancers were also classified with respect to clinical stage. Ages at cancer and a Bayesian algorithm were used to estimate the number of oncogenic mutations required for transformation for each cancer subtype. RESULTS: Ages at MSI+ cancers were consistent with five or six oncogenic mutations for hereditary (HNPCC) cancers, and seven or eight mutations for their sporadic counterparts. Ages at cancer were consistent with seven mutations for sporadic MSI- cancers, and were similar (six to eight mutations) regardless of clinical cancer stage. CONCLUSION: Different biologic subtypes of colorectal cancer appear to require different numbers of oncogenic mutations before transformation. Sporadic MSI+ cancers may require more than a single additional somatic alteration compared to hereditary MSI+ cancers, because the epigenetic inactivation of MLH1 commonly observed in sporadic MSI+ cancers may be a multistep process. Interestingly, the estimated numbers of MSI- cancer mutations were similar (six to eight) regardless of clinical cancer stage, suggesting that a propensity to spread or metastasize does not require additional mutations after transformation. Estimates of oncogenic mutation numbers may help explain some of the biology underlying different cancer subtypes.
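
    A toy version of the inference idea, not the paper's algorithm: if transformation requires n rate-limiting mutations, each arriving at an (assumed) constant rate, the age at cancer is roughly Gamma(n, rate)-distributed, and a posterior over n can be computed on a grid from observed ages. The rate, the flat prior, and the simulated ages below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)

# simulated ages at diagnosis for a subtype that (in this toy) needs 7 hits
true_n, rate = 7, 0.12                        # ~0.12 rate-limiting events per year (assumed)
ages = rng.gamma(shape=true_n, scale=1.0 / rate, size=300)

n_grid = np.arange(1, 16)                     # candidate numbers of mutations
log_post = np.array([gamma.logpdf(ages, a=n, scale=1.0 / rate).sum() for n in n_grid])
log_post -= log_post.max()                    # flat prior over n_grid
post = np.exp(log_post) / np.exp(log_post).sum()

for n, p in zip(n_grid, post):
    if p > 0.01:
        print(f"n = {n:2d}   posterior = {p:.3f}")
```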

    Noisy traveling waves: effect of selection on genealogies

    For a family of models of populations evolving under selection, which can be described by noisy traveling wave equations, the coalescence times along the genealogical tree scale like $\log^\alpha N$, where $N$ is the size of the population, in contrast with neutral models for which they scale like $N$. An argument relating this time scale to the diffusion constant of the noisy traveling wave leads to a prediction for $\alpha$ which agrees with our simulations. An exactly soluble case gives trees with statistics identical to those predicted for mean-field spin glasses in Parisi's theory.
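
    A sketch of how an exponent like $\alpha$ could be read off from simulation output: regress log of the coalescence time against log log N. The (N, T) pairs below are synthetic, generated from an assumed $(\log N)^3$ law purely to demonstrate the fitting step; real values would come from simulations such as the one sketched earlier for the exponential model, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic (N, T_coal) pairs drawn around T ~ (log N)^3, for illustration only
N_values = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
T_values = 3.0 * np.log(N_values) ** 3 * np.exp(rng.normal(0, 0.05, N_values.size))

# fit T = A * (log N)^alpha  <=>  log T = log A + alpha * log(log N)
alpha, logA = np.polyfit(np.log(np.log(N_values)), np.log(T_values), 1)
print(f"fitted alpha = {alpha:.2f}")          # close to 3 for this synthetic input
```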

    Transcriptional dynamics elicited by a short pulse of notch activation involves feed-forward regulation by E(spl)/Hes genes.

    Dynamic activity of signaling pathways, such as Notch, is vital to achieve correct development and homeostasis. However, most studies assess output many hours or days after initiation of signaling, once the outcome has been consolidated. Here we analyze genome-wide changes in transcript levels, binding of the Notch pathway transcription factor CSL [Suppressor of Hairless, Su(H), in Drosophila], and RNA Polymerase II (Pol II) immediately following a short pulse of Notch stimulation. A total of 154 genes showed significant differential expression (DE) over time, and their expression profiles stratified into 14 clusters based on the timing, magnitude, and direction of DE. E(spl) genes were the most rapidly upregulated, with Su(H), Pol II, and transcript levels increasing within 5-10 minutes. Other genes had a more delayed response, the timing of which was largely unaffected by more prolonged Notch activation. Neither Su(H) binding nor poised Pol II could fully explain the differences between profiles. Instead, our data indicate that regulatory interactions, driven by the early-responding E(spl)bHLH genes, are required. Proposed cross-regulatory relationships were validated in vivo and in cell culture, supporting the view that feed-forward repression by E(spl)bHLH/Hes shapes the response of late-responding genes. Based on these data, we propose a model in which Hes genes are responsible for co-ordinating the Notch response of a wide spectrum of other targets, explaining the critical functions these key regulators play in many developmental and disease contexts.
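
    The stratification of expression profiles by timing, magnitude, and direction of DE can be illustrated with a generic k-means sketch on synthetic time-course data. The data, number of clusters, and time points below are assumptions for illustration; the paper's own clustering of the 154 DE genes may use different features and methods.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# synthetic log2 fold-change profiles over a short time course (minutes),
# standing in for the measured response of DE genes after a Notch pulse
timepoints = [0, 5, 10, 20, 30, 60]
early_up = np.outer(np.ones(40), [0, 2.0, 2.5, 2.0, 1.0, 0.5])
late_up  = np.outer(np.ones(40), [0, 0.2, 0.5, 1.5, 2.0, 2.0])
down     = np.outer(np.ones(40), [0, -0.2, -0.8, -1.5, -1.5, -1.0])
profiles = np.vstack([early_up, late_up, down]) + rng.normal(0, 0.3, (120, 6))

# cluster genes by the shape of their response over time
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(profiles)
for k in range(3):
    mean_profile = profiles[km.labels_ == k].mean(axis=0)
    print(f"cluster {k}: " + "  ".join(f"{v:+.1f}" for v in mean_profile))
```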

    Non-linear regression models for Approximate Bayesian Computation

    Approximate Bayesian inference on the basis of summary statistics is well suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, methods based on rejection suffer from the curse of dimensionality as the number of summary statistics increases. Here we propose a machine-learning approach to the estimation of the posterior density that introduces two innovations: the new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves the estimate using importance sampling. The new algorithm is compared to state-of-the-art approximate Bayesian methods and achieves a considerable reduction of the computational burden in two examples of inference, in statistical genetics and in a queueing model.
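
    A compact sketch of regression-adjusted ABC in the spirit described above: rejection on summary statistics followed by a regression correction of the accepted parameters. For brevity the adjustment below is local-linear (Beaumont-style) rather than the nonlinear heteroscedastic regression with importance sampling proposed in the paper; the toy model (estimating a normal mean from its sample mean and standard deviation), the prior, and the acceptance numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy problem: infer mu from data summarized by (sample mean, sample sd)
def summaries(data):
    return np.array([data.mean(), data.std()])

obs = rng.normal(2.0, 1.0, 50)
s_obs = summaries(obs)

# 1. rejection ABC: simulate from the prior, keep the parameters whose
#    summaries fall closest to the observed summaries
n_sims, keep = 20000, 500
mu_prior = rng.uniform(-10, 10, n_sims)
sim_stats = np.array([summaries(rng.normal(mu, 1.0, 50)) for mu in mu_prior])
dist = np.linalg.norm(sim_stats - s_obs, axis=1)
accepted = np.argsort(dist)[:keep]

# 2. regression adjustment: regress accepted mu on (summaries - s_obs), then
#    shift each accepted mu to its predicted value at the observed summaries
X = np.column_stack([np.ones(keep), sim_stats[accepted] - s_obs])
beta, *_ = np.linalg.lstsq(X, mu_prior[accepted], rcond=None)
mu_adjusted = mu_prior[accepted] - (sim_stats[accepted] - s_obs) @ beta[1:]

print("rejection only :", mu_prior[accepted].mean(), mu_prior[accepted].std())
print("adjusted       :", mu_adjusted.mean(), mu_adjusted.std())
```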