Mutanalyst, an online tool for assessing the mutational spectrum of epPCR libraries with poor sampling
BACKGROUND: Assessing library diversity is an important control step in a directed evolution experiment. To do this, a limited number of colonies from a test library are sequenced and tested. In the case of an error-prone PCR library, the spectrum of the identified mutations, i.e. the proportions of mutations from one specific nucleobase to another, is calculated, enabling the user to make more informed predictions about library diversity and coverage. However, calculations of the mutational spectrum are severely affected by limited sample sizes. RESULTS: Here an online program, called Mutanalyst, is presented, which not only automates these calculations but also estimates the errors involved. Specifically, the errors are calculated by exploiting the complementarity of DNA, which means that each mutation has a complementary mutation on the other strand. Additionally, the mean number of mutations per sequence is determined by fitting the observed counts to a Poisson distribution, which is more robust than calculating the average given the small sample size. CONCLUSION: As a result of these measures to account for small sample size, the user can better assess whether the library is satisfactory or whether the error-prone PCR conditions should be adjusted. The program is available at www.mutanalyst.com
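The abstract does not specify Mutanalyst's exact fitting procedure; the sketch below illustrates the underlying idea with a least-squares fit of a Poisson rate to the observed histogram of mutations per sequenced clone. All names, the grid-search method, and the example counts are ours, not Mutanalyst's.

```python
# Hypothetical sketch: estimate the mean mutations-per-sequence rate by
# fitting a Poisson distribution to observed counts, instead of taking
# the raw sample mean of a small sample.
import math
from collections import Counter

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def fit_poisson_lambda(counts, lam_max=10.0, step=0.001):
    """Grid-search the Poisson rate whose pmf best matches (in squared
    error) the empirical frequencies of mutation counts."""
    n = len(counts)
    freq = Counter(counts)
    ks = range(max(counts) + 1)
    best_lam, best_sse = 0.0, float("inf")
    lam = step
    while lam <= lam_max:
        sse = sum((freq.get(k, 0) / n - poisson_pmf(k, lam)) ** 2 for k in ks)
        if sse < best_sse:
            best_lam, best_sse = lam, sse
        lam += step
    return best_lam

# Example (invented data): mutation counts from 12 sequenced clones
counts = [0, 1, 2, 1, 0, 3, 1, 2, 1, 0, 2, 1]
lam_hat = fit_poisson_lambda(counts)
```

With only a dozen clones the histogram fit is less sensitive to a single outlier clone than the raw average, which is the robustness argument made in the abstract.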
Harmonic moments of non homogeneous branching processes
We study the harmonic moments of Galton-Watson processes, possibly non
homogeneous, with positive values. Good estimates of these are needed to
compute unbiased estimators for non canonical branching
Markov processes, which occur, for instance, in the modeling of the
polymerase chain reaction. By convexity, the ratio of the harmonic mean to the
mean is at most 1. We prove that, for every square integrable branching
mechanism, this ratio lies between 1-A/k and 1-B/k for every initial
population of size k greater than A. The positive constants A and B, such that
B is at most A, are explicit and depend only on the generation-by-generation
branching mechanisms. In particular, we do not use the distribution of the
limit of the classical martingale associated to the Galton-Watson process.
Thus, emphasis is put on non asymptotic bounds and on the dependence of the
harmonic mean upon the size of the initial population. In the Bernoulli case,
which is relevant for the modeling of the polymerase chain reaction, we prove
essentially optimal bounds that are valid for every initial population.
Finally, in the general case and for large enough initial populations, similar
techniques yield sharp estimates of the harmonic moments of higher degrees.
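In symbols, the bound stated in the abstract can be sketched as follows; the notation (Z_n for the population at generation n, started from Z_0 = k) is ours and is only meant to restate the claim, not reproduce the paper's derivation.

```latex
% Harmonic mean H_n = (E[1/Z_n])^{-1} versus arithmetic mean E[Z_n].
% By Jensen's inequality (x -> 1/x is convex), H_n / E[Z_n] <= 1.
% The abstract's non-asymptotic refinement, for explicit constants
% 0 < B <= A depending only on the branching mechanisms:
\[
  1 - \frac{A}{k}
  \;\le\;
  \frac{\bigl(\mathbb{E}[\,1/Z_n\,]\bigr)^{-1}}{\mathbb{E}[Z_n]}
  \;\le\;
  1 - \frac{B}{k},
  \qquad \text{for every initial population } Z_0 = k > A .
\]
```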
Confidence intervals for nonhomogeneous branching processes and polymerase chain reactions
We extend in two directions our previous results about the sampling and the
empirical measures of immortal branching Markov processes. Direct applications
to molecular biology are rigorous estimates of the mutation rates of polymerase
chain reactions from uniform samples of the population after the reaction.
First, we consider nonhomogeneous processes, which are more adapted to real
reactions. Second, recalling that the first moment estimator is analytically
known only in the infinite population limit, we provide rigorous confidence
intervals for this estimator that are valid for any finite population. Our
bounds are explicit, nonasymptotic and valid for a wide class of nonhomogeneous
branching Markov processes that we describe in detail. In the setting of
polymerase chain reactions, our results imply that enlarging the size of the
sample becomes useless for surprisingly small sizes. Establishing confidence
intervals requires precise estimates of the second moment of random samples.
The proof of these estimates is more involved than the proofs that allowed us,
in a previous paper, to deal with the first moment. On the other hand, our
method uses various, seemingly new, monotonicity properties of the harmonic
moments of sums of exchangeable random variables. Comment: Published at
http://dx.doi.org/10.1214/009117904000000775 in the Annals of Probability
(http://www.imstat.org/aop/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
Evolution favors protein mutational robustness in sufficiently large populations
BACKGROUND: An important question is whether evolution favors properties such
as mutational robustness or evolvability that do not directly benefit any
individual, but can influence the course of future evolution. Functionally
similar proteins can differ substantially in their robustness to mutations and
capacity to evolve new functions, but it has remained unclear whether any of
these differences might be due to evolutionary selection for these properties.
RESULTS: Here we use laboratory experiments to demonstrate that evolution
favors protein mutational robustness if the evolving population is sufficiently
large. We neutrally evolve cytochrome P450 proteins under identical selection
pressures and mutation rates in populations of different sizes, and show that
proteins from the larger and thus more polymorphic population tend towards
higher mutational robustness. Proteins from the larger population also evolve
greater stability, a biophysical property that is known to enhance both
mutational robustness and evolvability. The excess mutational robustness and
stability is well described by existing mathematical theories, and can be
quantitatively related to the way that the proteins occupy their neutral
network.
CONCLUSIONS: Our work is the first experimental demonstration of the general
tendency of evolution to favor mutational robustness and protein stability in
highly polymorphic populations. We suggest that this phenomenon may contribute
to the mutational robustness and evolvability of viruses and bacteria that
exist in large populations.
Mutation supply and the repeatability of selection for antibiotic resistance
Whether evolution can be predicted is a key question in evolutionary biology.
Here we set out to better understand the repeatability of evolution. We
explored experimentally the effect of mutation supply and the strength of
selective pressure on the repeatability of selection from standing genetic
variation. Different sizes of mutant libraries of an antibiotic resistance
gene, TEM-1 β-lactamase in Escherichia coli, were subjected to different
antibiotic concentrations. We determined whether populations went extinct or
survived, and sequenced the TEM gene of the surviving populations. The
distribution of mutations per allele in our mutant libraries, generated by
error-prone PCR, followed a Poisson distribution. Extinction patterns could be
explained by a simple stochastic model that assumed the sampling of beneficial
mutations was key for survival. In most surviving populations, alleles
containing at least one known large-effect beneficial mutation were present.
These genotype data also support a model which only invokes sampling effects to
describe the occurrence of alleles containing large-effect driver mutations.
Hence, evolution is largely predictable given cursory knowledge of mutational
fitness effects, the mutation rate and population size. There were no clear
trends in the repeatability of selected mutants when we considered all
mutations present. However, when only known large-effect mutations were
considered, the outcome of selection was less repeatable for large libraries, in
contrast to expectations. Furthermore, we show experimentally that alleles
carrying multiple mutations selected from large libraries confer higher
resistance levels relative to alleles with only a known large-effect mutation,
suggesting that the scarcity of high-resistance alleles carrying multiple
mutations may contribute to the decrease in repeatability at large library
sizes. Comment: 31 pages, 9 figures.
Validation of an STR peak area model
In analyzing a DNA mixture sample, the measured peak areas of alleles of STR markers amplified using the polymerase chain reaction (PCR) provide valuable information concerning the relative amounts of DNA originating from each contributor to the mixture. This information can be exploited to predict the genetic profiles of those contributors whose profiles are not known. The task is non-trivial, in part because of the need to account for the stochastic nature of peak area values. Various methods have been proposed for doing this. One recent suggestion is a probabilistic expert system model that uses gamma distributions to model the size of, and stochastic variation in, peak area values. In this paper we carry out a statistical analysis of the gamma distribution assumption, testing it against synthetic peak area values generated by an independent model that simulates the PCR amplification process. Our analysis shows that the gamma assumption works very well when allelic drop-out is not present, but performs progressively worse as drop-out becomes more prevalent, as occurs, for example, in Low Copy Template amplifications.
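The test described, comparing a gamma assumption against counts simulated from the PCR amplification process, can be sketched minimally as follows. This is our own toy branching model and a method-of-moments gamma fit, not the paper's simulator or its statistical test; all parameter values are illustrative.

```python
# Hypothetical sketch: simulate final molecule counts from a branching
# PCR model, then fit a gamma distribution by the method of moments.
import random

random.seed(1)

def pcr_counts(n0=5, cycles=12, eff=0.8, replicates=300):
    """Each molecule is duplicated with probability eff at each cycle;
    return the final molecule count for each simulated amplification."""
    finals = []
    for _ in range(replicates):
        n = n0
        for _ in range(cycles):
            n += sum(1 for _ in range(n) if random.random() < eff)
        finals.append(n)
    return finals

def gamma_moment_fit(xs):
    """Method-of-moments gamma fit: shape = mean^2/var, scale = var/mean."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean * mean / var, var / mean  # (shape, scale)

areas = pcr_counts()          # stand-in for synthetic peak areas
shape, scale = gamma_moment_fit(areas)
```

With a non-trivial starting template (n0 = 5) no run drops to zero, which is the regime where the abstract reports the gamma assumption working well; lowering n0 toward 1 introduces drop-out and the mismatch the paper documents.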
A graphical simulation model of the entire DNA process associated with the analysis of short tandem repeat loci
The use of expert systems to interpret short tandem repeat DNA profiles in forensic, medical and ancient DNA applications is becoming increasingly prevalent as high-throughput analytical systems generate large amounts of data that are time-consuming to process. With special reference to low copy number (LCN) applications, we use a graphical model to simulate the stochastic variation associated with the entire DNA process, starting with extraction of the sample, followed by the preparation of a PCR reaction mixture and PCR itself. Each part of the process is modelled with input efficiency parameters. Then, the key output parameters that define the characteristics of a DNA profile are derived, namely heterozygote balance (Hb) and the probability of allelic drop-out p(D). The model can be used to estimate the unknown efficiency parameters, such as π(extraction). ‘What-if’ scenarios can be used to improve and optimize the entire process; e.g. by increasing the aliquot forwarded to PCR, the improvement expected in a given DNA profile can be reliably predicted. We demonstrate that Hb and drop-out are mainly a function of the stochastic effect of pre-PCR molecular selection. Whole genome amplification is unlikely to give any benefit over conventional PCR for LCN.
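The pipeline described, per-stage efficiency parameters feeding into Hb and p(D), can be sketched with a toy Monte Carlo model. The stage structure (extraction thinning, aliquot thinning, branching PCR) follows the abstract, but the specific functions, parameter values, and estimators below are our assumptions, not the paper's graphical model.

```python
# Hypothetical sketch: two alleles of a heterozygous locus pass through
# extraction (prob pi_extract per molecule), aliquot selection
# (prob pi_aliquot), and branching PCR (duplication prob eff per cycle).
# Outputs: mean heterozygote balance Hb and drop-out probability p(D).
import random

random.seed(7)

def simulate_profile(n_cells=10, pi_extract=0.5, pi_aliquot=0.2,
                     cycles=10, eff=0.8, replicates=500):
    def thin(n, p):
        """Binomial thinning: each molecule survives with probability p."""
        return sum(1 for _ in range(n) if random.random() < p)

    def amplify(n):
        for _ in range(cycles):
            n += thin(n, eff)
        return n

    hb_values, dropouts = [], 0
    for _ in range(replicates):
        a = amplify(thin(thin(n_cells, pi_extract), pi_aliquot))
        b = amplify(thin(thin(n_cells, pi_extract), pi_aliquot))
        if a == 0 or b == 0:
            dropouts += 1          # allelic drop-out: one peak missing
        else:
            hb_values.append(min(a, b) / max(a, b))
    p_dropout = dropouts / replicates
    mean_hb = sum(hb_values) / len(hb_values) if hb_values else 0.0
    return mean_hb, p_dropout

mean_hb, p_dropout = simulate_profile()
```

Because PCR amplifies whatever survives the pre-PCR stages roughly multiplicatively, almost all of the variability in Hb and all of the drop-out in this toy model originates before amplification, which matches the abstract's conclusion that these quantities are mainly a function of pre-PCR molecular selection.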