20 research outputs found

    Mutanalyst, an online tool for assessing the mutational spectrum of epPCR libraries with poor sampling

    BACKGROUND: Assessing library diversity is an important control step in a directed evolution experiment. To do this, a limited number of colonies from a test library are sequenced and tested. In the case of an error-prone PCR library, the spectrum of the identified mutations (the proportions of mutations of a specific nucleobase to another) is calculated, enabling the user to make more informed predictions on library diversity and coverage. However, the calculations of the mutational spectrum are severely affected by the limited sample sizes. RESULTS: Here an online program, called Mutanalyst, is presented, which not only automates the calculations but also estimates the errors involved. Specifically, the errors are calculated thanks to the complementarity of DNA, which means that a mutation has a complementary mutation on the other strand. Additionally, when determining the mean number of mutations per sequence, it does so by fitting to a Poisson distribution, which is more robust than calculating the average in light of the small sample size. CONCLUSION: As a result of the added measures that take small sample size into account, the user can better assess whether the library is satisfactory or whether error-prone PCR conditions should be adjusted. The program is available at www.mutanalyst.com.
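
The Poisson fit mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that the fit is a least-squares match between the observed frequencies of clones carrying 0, 1, 2, ... mutations and the Poisson probabilities, found by a plain grid search. The function names and the sample counts are hypothetical, and this is not necessarily how Mutanalyst itself performs the fit.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k mutations under a Poisson law with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def fit_poisson_mean(mutation_counts, step=0.01, max_lam=20.0):
    """Least-squares fit of the Poisson mean to a histogram of mutation counts.

    mutation_counts: number of mutations found in each sequenced clone.
    Only the bins 0..max(mutation_counts) are compared; the tail is ignored.
    A grid search is used (an assumption of this sketch) so that no
    unimodality of the error function has to be assumed.
    """
    n = len(mutation_counts)
    max_k = max(mutation_counts)
    observed = [mutation_counts.count(k) / n for k in range(max_k + 1)]

    def squared_error(lam):
        return sum((observed[k] - poisson_pmf(k, lam)) ** 2
                   for k in range(max_k + 1))

    candidates = [step * i for i in range(1, int(max_lam / step) + 1)]
    return min(candidates, key=squared_error)


# Hypothetical data: mutations counted in ten sequenced clones.
counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]
print("fitted mean:", fit_poisson_mean(counts))
print("plain average:", sum(counts) / len(counts))
```

With so few clones the fitted mean and the plain average can differ noticeably, which is the small-sample effect the abstract refers to.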

    Harmonic moments of non homogeneous branching processes

    We study the harmonic moments of Galton-Watson processes, possibly non homogeneous, with positive values. Good estimates of these are needed to compute unbiased estimators for non canonical branching Markov processes, which occur, for instance, in the modeling of the polymerase chain reaction. By convexity, the ratio of the harmonic mean to the mean is at most 1. We prove that, for all square integrable branching mechanisms, this ratio lies between 1-A/k and 1-B/k for every initial population of size k greater than A. The positive constants A and B, such that B is at most A, are explicit and depend only on the generation-by-generation branching mechanisms. In particular, we do not use the distribution of the limit of the classical martingale associated to the Galton-Watson process. Thus, emphasis is put on non asymptotic bounds and on the dependence of the harmonic mean upon the size of the initial population. In the Bernoulli case, which is relevant for the modeling of the polymerase chain reaction, we prove essentially optimal bounds that are valid for every initial population. Finally, in the general case and for large enough initial populations, similar techniques yield sharp estimates of the harmonic moments of higher degrees.
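
The claim that the ratio of the harmonic mean to the mean is at most 1 and approaches 1 at rate 1/k can be checked numerically in a toy setting. The sketch below assumes a single generation of the Bernoulli case: each of k molecules is duplicated independently with probability p, so the new population is Z = k + Binomial(k, p). The function name is hypothetical, and the paper's multi-generation, non homogeneous bounds and its constants A and B are not reproduced here.

```python
from math import comb


def harmonic_to_mean_ratio(k, p):
    """Exact ratio (harmonic mean of Z) / (mean of Z) for Z = k + Binomial(k, p).

    The harmonic mean is 1 / E[1/Z]. Intended for moderate k only, because
    the binomial coefficients are converted to floats.
    """
    mean = k * (1 + p)
    inverse_moment = sum(
        comb(k, j) * p ** j * (1 - p) ** (k - j) / (k + j)
        for j in range(k + 1)
    )
    return 1.0 / (inverse_moment * mean)


# The ratio stays below 1, and k * (1 - ratio) stays bounded as k grows.
for k in (1, 10, 100):
    ratio = harmonic_to_mean_ratio(k, 0.8)
    print(k, ratio, k * (1 - ratio))
```

The printed third column gives a rough feel for the size of the 1/k correction in this one-generation toy case.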

    Confidence intervals for nonhomogeneous branching processes and polymerase chain reactions

    We extend in two directions our previous results about the sampling and the empirical measures of immortal branching Markov processes. Direct applications to molecular biology are rigorous estimates of the mutation rates of polymerase chain reactions from uniform samples of the population after the reaction. First, we consider nonhomogeneous processes, which are more adapted to real reactions. Second, recalling that the first moment estimator is analytically known only in the infinite population limit, we provide rigorous confidence intervals for this estimator that are valid for any finite population. Our bounds are explicit, nonasymptotic and valid for a wide class of nonhomogeneous branching Markov processes that we describe in detail. In the setting of polymerase chain reactions, our results imply that enlarging the size of the sample becomes useless for surprisingly small sizes. Establishing confidence intervals requires precise estimates of the second moment of random samples. The proof of these estimates is more involved than the proofs that allowed us, in a previous paper, to deal with the first moment. On the other hand, our method uses various, seemingly new, monotonicity properties of the harmonic moments of sums of exchangeable random variables. Comment: Published at http://dx.doi.org/10.1214/009117904000000775 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Evolution favors protein mutational robustness in sufficiently large populations

    BACKGROUND: An important question is whether evolution favors properties such as mutational robustness or evolvability that do not directly benefit any individual, but can influence the course of future evolution. Functionally similar proteins can differ substantially in their robustness to mutations and capacity to evolve new functions, but it has remained unclear whether any of these differences might be due to evolutionary selection for these properties. RESULTS: Here we use laboratory experiments to demonstrate that evolution favors protein mutational robustness if the evolving population is sufficiently large. We neutrally evolve cytochrome P450 proteins under identical selection pressures and mutation rates in populations of different sizes, and show that proteins from the larger and thus more polymorphic population tend towards higher mutational robustness. Proteins from the larger population also evolve greater stability, a biophysical property that is known to enhance both mutational robustness and evolvability. The excess mutational robustness and stability are well described by existing mathematical theories, and can be quantitatively related to the way that the proteins occupy their neutral network. CONCLUSIONS: Our work is the first experimental demonstration of the general tendency of evolution to favor mutational robustness and protein stability in highly polymorphic populations. We suggest that this phenomenon may contribute to the mutational robustness and evolvability of viruses and bacteria that exist in large populations.

    Mutation supply and the repeatability of selection for antibiotic resistance

    Whether evolution can be predicted is a key question in evolutionary biology. Here we set out to better understand the repeatability of evolution. We explored experimentally the effect of mutation supply and the strength of selective pressure on the repeatability of selection from standing genetic variation. Different sizes of mutant libraries of an antibiotic resistance gene, TEM-1 β-lactamase in Escherichia coli, were subjected to different antibiotic concentrations. We determined whether populations went extinct or survived, and sequenced the TEM gene of the surviving populations. The distribution of mutations per allele in our mutant libraries, generated by error-prone PCR, followed a Poisson distribution. Extinction patterns could be explained by a simple stochastic model that assumed the sampling of beneficial mutations was key for survival. In most surviving populations, alleles containing at least one known large-effect beneficial mutation were present. These genotype data also support a model which only invokes sampling effects to describe the occurrence of alleles containing large-effect driver mutations. Hence, evolution is largely predictable given cursory knowledge of mutational fitness effects, the mutation rate and population size. There were no clear trends in the repeatability of selected mutants when we considered all mutations present. However, when only known large-effect mutations were considered, the outcome of selection was less repeatable for large libraries, in contrast to expectations. Furthermore, we show experimentally that alleles carrying multiple mutations selected from large libraries confer higher resistance levels relative to alleles with only a known large-effect mutation, suggesting that the scarcity of high-resistance alleles carrying multiple mutations may contribute to the decrease in repeatability at large library sizes. Comment: 31 pages, 9 figures
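
A sampling argument of the kind described in the abstract above can be sketched as follows. This is an illustrative model under stated assumptions, not the authors' model: mutations per allele are Poisson distributed, every one of the 3 x gene_length possible single-nucleotide substitutions is equally likely (mutation bias is ignored), and a library survives if at least one allele carries one of n_beneficial large-effect substitutions. All function names and numeric values are hypothetical.

```python
import math


def p_allele_has_beneficial(mean_mutations, gene_length, n_beneficial):
    """Probability that one allele carries at least one beneficial substitution.

    Thinning a Poisson(mean_mutations) count over 3 * gene_length equally
    likely substitutions gives a Poisson count of beneficial hits.
    """
    rate = mean_mutations * n_beneficial / (3 * gene_length)
    return 1.0 - math.exp(-rate)


def p_library_survives(library_size, p_allele):
    """Probability that at least one allele in the library is beneficial."""
    return 1.0 - (1.0 - p_allele) ** library_size


# Hypothetical parameters: 2 mutations per allele on average, a 900 bp gene,
# and 5 known large-effect substitutions.
p = p_allele_has_beneficial(mean_mutations=2.0, gene_length=900, n_beneficial=5)
for size in (10, 100, 1000, 10000):
    print(size, p_library_survives(size, p))
```

Under these assumptions the survival probability depends on the library only through the product of library size and per-allele probability, which is why mutation supply drives the extinction pattern in such a model.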

    A graphical simulation model of the entire DNA process associated with the analysis of short tandem repeat loci

    The use of expert systems to interpret short tandem repeat DNA profiles in forensic, medical and ancient DNA applications is becoming increasingly prevalent as high-throughput analytical systems generate large amounts of data that are time-consuming to process. With special reference to low copy number (LCN) applications, we use a graphical model to simulate stochastic variation associated with the entire DNA process starting with extraction of sample, followed by the processing associated with the preparation of a PCR reaction mixture and PCR itself. Each part of the process is modelled with input efficiency parameters. Then, the key output parameters that define the characteristics of a DNA profile are derived, namely heterozygote balance (Hb) and the probability of allelic drop-out p(D). The model can be used to estimate the unknown efficiency parameters, such as π(extraction). ‘What-if’ scenarios can be used to improve and optimize the entire process; for example, when the aliquot forwarded to PCR is increased, the expected improvement to a given DNA profile can be reliably predicted. We demonstrate that Hb and drop-out are mainly a function of the stochastic effect of pre-PCR molecular selection. Whole genome amplification is unlikely to give any benefit over conventional PCR for LCN.
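
The pre-PCR molecular selection described above can be illustrated with a small simulation. The sketch assumes that each template copy of an allele independently survives extraction with probability p_extraction (the abstract's π(extraction)) and is then carried into the aliquot with probability p_aliquot; drop-out is taken to mean that no copy reaches the PCR. PCR efficiency and detection thresholds are ignored, and the function names and parameter values are hypothetical rather than taken from the paper.

```python
import random


def dropout_probability(n_copies, p_extraction, p_aliquot):
    """Closed-form drop-out probability: no copy survives both selection steps."""
    p_reach_pcr = p_extraction * p_aliquot
    return (1.0 - p_reach_pcr) ** n_copies


def simulate_dropout(n_copies, p_extraction, p_aliquot, trials=20000, seed=0):
    """Monte Carlo estimate of the same probability, one binomial step per stage."""
    rng = random.Random(seed)
    dropped = 0
    for _ in range(trials):
        extracted = sum(rng.random() < p_extraction for _ in range(n_copies))
        forwarded = sum(rng.random() < p_aliquot for _ in range(extracted))
        if forwarded == 0:
            dropped += 1
    return dropped / trials


# 'What-if' scenario: forward a larger aliquot to PCR (hypothetical values).
for aliquot in (0.2, 0.4, 0.8):
    print(aliquot,
          dropout_probability(5, 0.6, aliquot),
          simulate_dropout(5, 0.6, aliquot))
```

Comparing the rows shows how a what-if change to one efficiency parameter shifts the drop-out probability in this simplified setting.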