14 research outputs found
Additional file 3: of Comparison of discriminative motif optimization using matrix and DNA shape-based models
Table S3. Scores for the motif optimization algorithms on ChIP-seq data with small training sets. (DOCX 12 kb)
Additional file 4: of Comparison of discriminative motif optimization using matrix and DNA shape-based models
Table S4. AUPRC and AUROC differences between model pairs by TF. (XLSX 32 kb)
Additional file 1: of Comparison of discriminative motif optimization using matrix and DNA shape-based models
Table S1. Mean AUROC (and standard deviation) on ChIP-seq data. (DOCX 13 kb)
Additional file 2: of Comparison of discriminative motif optimization using matrix and DNA shape-based models
Table S2. Effect of Method for Generating Negative Sequences on Training and Testing Scores. (DOCX 13 kb)
The non-linear relationship between binding affinity and probability.
<p>A C-to-A mutation at the same position, but in two different sequence contexts, causes dramatically different changes in the binding probability of the whole sequence.</p>
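The context dependence described in this caption can be illustrated with a minimal sketch. It assumes the common two-state form P = 1 / (1 + exp(E − μ)) with energies in units of kT and an additive energy model, so the same mutation adds the same energy in both contexts. The function names, the mutation cost, and the context energies are hypothetical values chosen for illustration and are not taken from the source.

```python
import math

def binding_probability(energy, mu):
    """Probability that a site with this binding energy is bound.

    Assumed form: P = 1 / (1 + exp(E - mu)), energies in units of kT.
    """
    return 1.0 / (1.0 + math.exp(energy - mu))

# Hypothetical numbers: the same mutation adds 2 kT in either context.
MUTATION_COST = 2.0
MU = 0.0

def probability_drop(context_energy, mu=MU, cost=MUTATION_COST):
    """Change in binding probability caused by the mutation in one context."""
    before = binding_probability(context_energy, mu)
    after = binding_probability(context_energy + cost, mu)
    return before - after

# A site far below mu is nearly saturated, so the mutation barely matters.
strong_context = probability_drop(-6.0)
# A site close to mu sits on the steep part of the curve.
weak_context = probability_drop(-1.0)

print(f"drop in strong context: {strong_context:.4f}")
print(f"drop in weak context:   {weak_context:.4f}")
```

Under these assumptions the energy change is identical in both contexts, yet the probability change differs by more than an order of magnitude, because probability is a sigmoidal rather than a linear function of energy.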
The energy matrix, derived probabilistic models and corresponding logos of a typical simulation.
<p>(a) The energy logo, with the average energy for each position set to 0. (b) The energy matrix generated from simulation. (c) The information logo when <i>μ</i> = −3. (d) The probabilistic model derived from all binding sites when <i>μ</i> = −3. (Matrix elements are the frequency of each base at each position; probability × 100.) (e) The information logo when <i>μ</i> = 3. (f) The probabilistic model derived from all binding sites when <i>μ</i> = 3.</p>
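Two operations mentioned in this caption, setting the average energy of each position to 0 and tabulating base frequencies as probability × 100, can be sketched as follows. The matrix layout (one row per position, columns in A, C, G, T order), the function names, and the toy energies and sites are assumptions made for illustration only.

```python
BASES = "ACGT"

def center_energy_matrix(matrix):
    """Shift each position (row) so its four base energies average to 0."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([energy - mean for energy in row])
    return centered

def frequency_matrix(sites):
    """Per-position base frequencies as percentages (probability x 100)."""
    length = len(sites[0])
    counts = [[0] * len(BASES) for _ in range(length)]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][BASES.index(base)] += 1
    return [[100.0 * c / len(sites) for c in row] for row in counts]

# Hypothetical two-position energy matrix (rows: positions; columns: A, C, G, T).
energy = [[0.0, 2.0, 1.0, 1.0],
          [3.0, 0.0, 0.5, 0.5]]
centered = center_energy_matrix(energy)

# Hypothetical set of bound sites.
freqs = frequency_matrix(["AC", "AC", "GC", "AT"])
```

Centering does not change energy differences within a position, so it leaves the relative preference among bases untouched while making logos from different positions comparable.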
The rank correlation between the predicted and true all sequence distributions.
<p>The rank correlation between the predicted and true all sequence distributions.</p>
The macroevolution simulation suggests the possibility that the robustness of the <i>C. elegans</i> lineage arose as an adaptation to necrosis but not program failure.
<p>(A–E) Frequency distributions of (A) lineage robustness in the presence of necrosis (<i>f</i><sub>n</sub>), (B) maximum depth, (C) mean depth, (D) rare-early correlation, and (E) lineage robustness in the presence of program failure (<i>f</i><sub>p</sub>) among lineages generated from the macroevolution simulation with different intensities of selection for high <i>f</i><sub>n</sub>. The observed values from the <i>C. elegans</i> lineage are indicated by black arrows. Each distribution in each panel is based on 100 simulation replications. The number next to the color scheme shows the fraction of the most robust lineages from which the progenitor of the next evolutionary expansion of the cell lineage is randomly chosen. That is, the lower the number, the stronger the selection. (F–J) Frequency distributions of (F) lineage robustness in the presence of program failure (<i>f</i><sub>p</sub>), (G) maximum depth, (H) mean depth, (I) rare-early correlation, and (J) lineage robustness in the presence of necrosis (<i>f</i><sub>n</sub>) among lineages generated from the macroevolution simulation with different intensities of selection for high <i>f</i><sub>p</sub>. The observed values from the <i>C. elegans</i> lineage are indicated by black arrows.</p>
The tendency for rare cells to have low depths improves the robustness of the <i>C. elegans</i> lineage.
<p>(A) Positive correlation between the depth of a terminal cell and its cell type size. Spearman's rank correlation (ρ) for the original unbinned data and the associated <i>P</i>-value are presented. Error bars show one standard deviation of the depth within a cell type. Bla, blast; Epi, epithelial; Ger, germ; Gla, gland; Int, intestinal; Mus, muscle; Str, neural structural; Neu, neuron. The rare-early correlation remains strong even when the germ cells are removed (ρ = 0.515, <i>P</i><10<sup>−38</sup>). (B) Frequency distribution of the rare-early correlation coefficient from 10,000 random lineages that have the same topology as that of <i>C. elegans</i> but have their terminal cells randomly relabeled. The arrow indicates the correlation coefficient for the <i>C. elegans</i> lineage. <i>P</i>-value is the probability that a random lineage generated as above has a higher rare-early correlation than that observed in <i>C. elegans</i>. <i>Z</i>-score is the number of standard deviations by which the observed correlation deviates from the expected correlation of the random lineages with the same topology. (C–D) The stronger the rare-early correlation (ρ<sub>rare-early</sub>) in a random lineage, the higher the robustness of the lineage in the presence of (C) necrosis or (D) program failure. Although 10,000 random lineages are generated, for clarity, only 1000 are shown (grey dots). The dashed line is the linear least-squares regression of these 1000 dots. The rank correlation between ρ<sub>rare-early</sub> and robustness, as well as the associated <i>P</i>-value, are calculated from all 10,000 lineages. The <i>C. elegans</i> lineage is represented by a triangle.</p>
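The relabeling test in panel (B) can be sketched as a permutation test: terminal-cell depths (the topology) stay fixed while the cell type labels, and therefore the cell type sizes, are shuffled among the terminal cells. This is a minimal sketch under that reading of the caption and not the authors' code; the function names and the toy depths and sizes are hypothetical.

```python
import random
import statistics

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def relabeling_test(depths, type_sizes, n_random=1000, seed=0):
    """Compare the observed correlation with randomly relabeled lineages."""
    rng = random.Random(seed)
    observed = spearman(depths, type_sizes)
    shuffled = list(type_sizes)
    null = []
    for _ in range(n_random):
        rng.shuffle(shuffled)  # relabel terminal cells, topology unchanged
        null.append(spearman(depths, shuffled))
    # Fraction of random lineages with a higher correlation than observed.
    p_value = sum(r > observed for r in null) / n_random
    z_score = (observed - statistics.mean(null)) / statistics.stdev(null)
    return observed, p_value, z_score

# Hypothetical toy lineage: depth and cell type size for 8 terminal cells.
depths = [3, 4, 5, 6, 7, 8, 9, 10]
type_sizes = [1, 1, 2, 2, 5, 5, 9, 9]
observed, p_value, z_score = relabeling_test(depths, type_sizes)
```

With a finite number of random lineages, a reported P-value of 0 only bounds the true value below 1/n_random, which is why the caption's 10,000 relabelings matter for small P-values.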
Low depths of terminal cells improve the robustness of the <i>C. elegans</i> lineage to necrosis and program failure.
<p>(A) Frequency distribution of the maximum cell depth in 10,000 lineages (grey bars), which are generated by random coalescence of the terminal cells of the <i>C. elegans</i> lineage. The arrow indicates the observed maximum cell depth in the <i>C. elegans</i> lineage. <i>P</i>-value is the probability that the maximum depth of a random lineage is smaller than that of <i>C. elegans</i>. <i>Z</i>-score is the number of standard deviations by which the observation deviates from the mean of the random lineages. (B–C) Violin plot for the robustness of randomly generated lineages with defined maximum depths in the presence of (B) necrosis or (C) program failure. Each violin is essentially a horizontal histogram showing the relative probability densities of different robustness of random lineages with the indicated maximum depth. The horizontal line in each violin plot shows the mean value. The real lineage is shown by a triangle. <i>P</i>-value is the probability that the robustness of a random lineage (with the same maximum depth as that of <i>C. elegans</i>) is higher than that of <i>C. elegans</i>. <i>Z</i>-score is the number of standard deviations by which the observation deviates from the mean of the random lineages. (D) Frequency distribution of the mean terminal cell depth in 5,000 lineages (grey bars), which are generated by random coalescence of the terminal cells of the <i>C. elegans</i> lineage with the requirement that the maximum depth is the same as in <i>C. elegans</i>. The arrow indicates the observed mean depth in the <i>C. elegans</i> lineage. <i>P</i>-value is the probability that the mean depth is smaller in a random lineage than in <i>C. elegans</i> when their maximum depths are the same. (E–F) Violin plot for the robustness of randomly generated lineages with the maximum depth equal to that of <i>C. elegans</i> and defined mean depths, in the presence of (E) necrosis or (F) program failure. The real lineage is indicated by a triangle. 
<i>P</i>-value is the probability that the robustness is higher in a random lineage (with the same maximum depth and similar mean depth as those of <i>C. elegans</i>) than in <i>C. elegans</i>. <i>Z</i>-score is the number of standard deviations by which the observation deviates from the mean of the random lineages.</p>
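One plausible reading of "random coalescence of the terminal cells" is sketched below: start with every terminal cell as its own node, then repeatedly join two randomly chosen nodes under a new parent until a single root remains. The depth of a terminal cell is then the number of divisions separating it from the root. The function names, the number of terminal cells, and the observed maximum depth are hypothetical and used for illustration only; the source does not specify the procedure at this level of detail.

```python
import random

def random_coalescent_depths(n_terminal, rng):
    """Terminal-cell depths in a random binary lineage built by coalescence."""
    depths = [0] * n_terminal
    # Each cluster lists the terminal cells below one not-yet-joined node.
    clusters = [[i] for i in range(n_terminal)]
    while len(clusters) > 1:
        a = clusters.pop(rng.randrange(len(clusters)))
        b = clusters.pop(rng.randrange(len(clusters)))
        merged = a + b
        for cell in merged:
            depths[cell] += 1  # one more ancestor above every cell in the pair
        clusters.append(merged)
    return depths

def max_depth_null(n_terminal, n_random=200, seed=0):
    """Maximum depth of each of n_random random lineages."""
    rng = random.Random(seed)
    return [max(random_coalescent_depths(n_terminal, rng))
            for _ in range(n_random)]

null = max_depth_null(n_terminal=32)

# Hypothetical observed value, for illustration only.
observed_max_depth = 6
# Fraction of random lineages whose maximum depth is smaller than observed.
p_value = sum(d < observed_max_depth for d in null) / len(null)
```

Conditioning on a fixed maximum depth, as in panels (B) to (F), could be done by keeping only the random lineages whose maximum depth equals the target value, at the cost of generating many more candidates.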