31 research outputs found

    Estimation of universal and taxon-specific parameters of prokaryotic genome evolution

    <div><p>The results of our recent study on mathematical modeling of microbial genome evolution indicate that, on average, genomes of bacteria and archaea evolve in the regime of mutation-selection balance defined by positive selection coefficients associated with gene acquisition that is counteracted by the intrinsic deletion bias. This analysis was based on the strong assumption that parameters of genome evolution are universal across the diversity of bacteria and archaea, and yielded extremely low values of the selection coefficient. Here we further refine the modeling approach by taking into account evolutionary factors specific to individual groups of microbes using two independent fitting strategies, an ad hoc hard fitting scheme and a mixture model. The resulting estimate of the mean selection coefficient of <i>s</i>∼10<sup>−10</sup> associated with the gain of one gene implies that, on average, acquisition of a gene is beneficial, and that microbial genomes typically evolve under a weak selection regime that might transition to strong selection in highly abundant organisms with large effective population sizes. The apparent selective pressure towards larger genomes is balanced by the deletion bias, which is estimated to be consistently greater than unity for all analyzed groups of microbes. The estimated values of <i>s</i> are more realistic than the lower values obtained previously, indicating that global and group-specific evolutionary factors synergistically affect microbial genome evolution, which seems to be driven primarily by adaptation to existence in diverse niches.</p></div>
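As a rough illustration of the weak-to-strong selection transition mentioned in the abstract, the sketch below classifies the regime from the product of effective population size and selection coefficient. Only the value <i>s</i> ≈ 10<sup>−10</sup> comes from the abstract; the cutoff of 1 is a conventional population-genetics rule of thumb, and the function name and example population sizes are hypothetical.

```python
def selection_regime(n_e: float, s: float, threshold: float = 1.0) -> str:
    """Classify the selection regime from the product N_e * |s|.

    `threshold` is a conventional rule-of-thumb cutoff (an assumption),
    not a parameter estimated in the study.
    """
    return "strong" if n_e * abs(s) > threshold else "weak"


S_GAIN = 1e-10  # mean selection coefficient per gene gain, from the abstract

if __name__ == "__main__":
    # Hypothetical effective population sizes chosen to bracket 1/s.
    for n_e in (1e8, 1e9, 1e11):
        product = n_e * S_GAIN
        print(f"N_e = {n_e:.0e}: N_e*s = {product:.2g} -> "
              f"{selection_regime(n_e, S_GAIN)}")
```

Under these assumptions, a population would need an effective size above roughly 1/<i>s</i> = 10<sup>10</sup> before the gain of a single gene is strongly selected.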

    The Universal Molecular Clock and Universal Pacemaker models of genome evolution.

    <p>A. Under the Molecular Clock model, gene-specific evolution rates (colored lines) remain constant; at any point in time (shown as dots), the relative rates of gene evolution are also constant. B. Under the Universal Pacemaker model, gene-specific evolution rates can change arbitrarily but by the same amount across the entire genome; at any point in time, the relative rates of gene evolution remain constant.</p>
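The difference between the two models in this caption can be sketched in a few lines. The example below is a toy illustration, not the authors' implementation: it assumes that "the same amount" means a single multiplicative factor shared by all genes at each time point, and the base rates, the range of the factor, and all function names are hypothetical.

```python
import random


def molecular_clock_rates(base_rates, n_steps):
    """Molecular Clock: every gene keeps its own constant rate over time."""
    return [list(base_rates) for _ in range(n_steps)]


def universal_pacemaker_rates(base_rates, n_steps, seed=0):
    """Universal Pacemaker: at each time point all gene rates are scaled
    by one shared, arbitrarily varying genome-wide factor (assumed here
    to be multiplicative and drawn uniformly from [0.5, 2.0])."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_steps):
        pace = rng.uniform(0.5, 2.0)  # hypothetical pacemaker factor
        rates.append([r * pace for r in base_rates])
    return rates


def relative_rates(rates_at_t):
    """Rates at one time point, normalized so that they sum to 1."""
    total = sum(rates_at_t)
    return [r / total for r in rates_at_t]


if __name__ == "__main__":
    base = [0.2, 1.0, 3.5]  # hypothetical gene-specific rates
    for name, model in (("MC", molecular_clock_rates(base, 4)),
                        ("UPM", universal_pacemaker_rates(base, 4))):
        for t, rates in enumerate(model):
            absolute = [round(r, 3) for r in rates]
            relative = [round(r, 3) for r in relative_rates(rates)]
            print(name, t, absolute, relative)
```

In both models the normalized (relative) rates are identical at every time point; only under the pacemaker do the absolute rates drift.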

    Prokaryotic genome size distribution width plotted vs. genome size.

    <p>The standard deviation is taken as the proxy for the distribution width. ATGCs are indicated by circles and model fits by lines. (A) Model prediction using the deletion bias of Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e013" target="_blank">13</a>) with parameters optimized under the assumption that all three model parameters are universal [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.ref010" target="_blank">10</a>]. (B) Six model fits with the deletion bias of Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e013" target="_blank">13</a>) (fitted parameters are given in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.t001" target="_blank">Table 1</a>). In all fits, one model parameter was set as a latent variable. The model parameter that was set as a latent variable and the methodology used for fitting are indicated in the inset; fits that were visually indistinguishable are represented by the same line. H, hard fitting method; B, mixture model. (C) Same as panel B, for the deletion bias of Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e016" target="_blank">16</a>) (fitted parameters are given in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.t002" target="_blank">Table 2</a>).</p>

    Maximum and minimum equilibrium genome sizes calculated using Eq (7) with parameters fitted under the mixture model.

    <p>Latent variables and deletion bias models are indicated in the inset. The effective population size was set as <i>N</i><sub><i>e</i></sub> = 10<sup>9</sup>. For each fit, the latent variable was taken from the left tail (percentiles 1–10) or the right tail (percentiles 90–99) of the optimized distribution of the latent variable. All estimates for maximum or minimum genome sizes, based on the different choices of the latent variable, are plotted together. As a result, the same figure combines left-tail and right-tail values for different choices of <i>φ</i>. (A) For <i>φ</i> = <i>r</i>′ and <i>φ</i> = <i>λ</i> the <i>x</i> axis indicates 1 − <i>P</i>, where <i>P</i> is the percentile. (B) For <i>φ</i> = <i>s</i> and <i>φ</i> = <i>a</i> the <i>x</i> axis indicates 1 − <i>P</i>, where <i>P</i> is the percentile.</p>
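The tail sampling described in this caption can be sketched as follows. This is an illustration only: the form of the optimized latent-variable distribution is not given in this excerpt, so the example assumes a normal distribution on a log10 scale with made-up mean and standard deviation, and the helper name is hypothetical. Converting these values into genome sizes would require Eq (7), which is not reproduced here.

```python
from statistics import NormalDist


def tail_values(dist, percentiles):
    """Return the value of the latent variable at each given percentile."""
    return [dist.inv_cdf(p / 100.0) for p in percentiles]


# Hypothetical latent-variable distribution on a log10 scale.
latent = NormalDist(mu=-10.0, sigma=0.5)

left_tail = tail_values(latent, range(1, 11))     # percentiles 1-10
right_tail = tail_values(latent, range(90, 100))  # percentiles 90-99

if __name__ == "__main__":
    print("left tail: ", [round(v, 3) for v in left_tail])
    print("right tail:", [round(v, 3) for v in right_tail])
```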

    Estimation of universal and taxon-specific parameters of prokaryotic genome evolution - Fig 5

    <p><b>Fitted latent variable values under the linear deletion bias model (Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e016" target="_blank">16</a>)) for <i>φ</i> = <i>s</i> (A–C), <i>φ</i> = <i>a</i> (D–F) and <i>φ</i> = <i>b</i> (G–I).</b> The fits were obtained using the hard fitting methodology (blue) and the mixture model (orange). The fitted <i>φ</i> values for all ATGCs are plotted against the effective population size in the leftmost column. Values are indicated by markers and mean values of the distributions are indicated by dashed lines. Histograms of the fitted <i>φ</i> values are shown together with the latent variable distributions, which are indicated by solid lines. The parameters of the distributions are given in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.t002" target="_blank">Table 2</a>. Histograms obtained using the hard fitting methodology are shown in the middle column, and histograms obtained using the mixture model are shown in the rightmost column.</p>

    Comparison of the Molecular Clock and Universal Pacemaker models of genome evolution.


    Genome size and selection strength in prokaryotes.

    <p>(<b>A</b>) Mean number of genes <i>x</i> is plotted against inferred selection strength d<i>N</i>/d<i>S</i>, where each point represents one prokaryotic cluster (ATGC). Error bars represent the widths of the genome size distributions and indicate one standard deviation. (<b>B</b>) Mean number of genes is plotted against extracted effective population size <i>N</i><sub><i>e</i></sub>. A representative global trend curve of mean genome size as predicted by the model (see Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e007" target="_blank">7</a>)), where all model parameters are assumed to be global <b><i>θ</i></b> = {<i>s</i>,<i>r</i>′,<i>λ</i>}, is indicated by a red line. The approach implemented in the hard fitting methodology, where Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e007" target="_blank">7</a>) is used to set the latent variable value such that model distributions are centered around observed genome sizes, is illustrated by a dashed orange line.</p>

    Comparison of the observed and model-generated genome size distributions for 6 ATGCs that consist of at least 20 species.

    <p>Empirical genome sizes are indicated by bars and model distributions by red solid lines. For model distributions Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e006" target="_blank">6</a>) was used, together with the deletion bias of Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e016" target="_blank">16</a>). Model parameters were optimized using the mixture model method, with the linear coefficient <i>a</i> of the acquisition rate (see Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e014" target="_blank">14</a>)) as the latent variable. Optimized parameters are listed in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.t002" target="_blank">Table 2</a> and in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.s006" target="_blank">S2 Table</a>. The ATGCs are as follows (the numbers of genomes for each ATGC are indicated in parentheses): (A) ATGC0001 (109), (B) ATGC0003 (22), (C) ATGC0004 (22), (D) ATGC0014 (31), (E) ATGC0021 (45) and (F) ATGC0050 (51).</p>

    Estimation of universal and taxon-specific parameters of prokaryotic genome evolution - Fig 4

    <p><b>Fitted latent variable values under the power law deletion bias model (Eq (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.e013" target="_blank">13</a>)) for <i>φ</i> = <i>s</i> (A–C), <i>φ</i> = <i>r</i>′ (D–F) and <i>φ</i> = <i>λ</i> (G–I).</b> The fits were obtained using the hard fitting methodology (blue) and the mixture model (orange). Fitted <i>φ</i> values for all ATGCs are plotted against the effective population size in the leftmost column. The mean values of the distributions are indicated by dashed lines. Histograms of the fitted <i>φ</i> values are shown together with the latent variable distributions, which are indicated by solid lines. The distribution parameters are given in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195571#pone.0195571.t001" target="_blank">Table 1</a>. Histograms obtained using the hard fitting methodology are shown in the middle column, and histograms obtained under the mixture model are shown in the rightmost column.</p>