
14 research outputs found

Additional file 3: of Comparison of discriminative motif optimization using matrix and DNA shape-based models

Author: Gary Stormo (4916626)
Shuxiang Ruan (603590)

Table S3. Scores for the motif optimization algorithms on ChIP-seq data with small training sets. (DOCX 12 kb)

The Francis Crick Institute

Additional file 4: of Comparison of discriminative motif optimization using matrix and DNA shape-based models

Author: Gary Stormo (4916626)
Shuxiang Ruan (603590)

Table S4. AUPRC and AUROC differences between model pairs by TF. (XLSX 32 kb)

The Francis Crick Institute

Additional file 1: of Comparison of discriminative motif optimization using matrix and DNA shape-based models

Author: Gary Stormo (4916626)
Shuxiang Ruan (603590)

Table S1. Mean AUROC (and standard deviation) on ChIP-seq data. (DOCX 13 kb)

The Francis Crick Institute

Additional file 2: of Comparison of discriminative motif optimization using matrix and DNA shape-based models

Author: Gary Stormo (4916626)
Shuxiang Ruan (603590)

Table S2. Effect of the method for generating negative sequences on training and testing scores. (DOCX 13 kb)

The Francis Crick Institute

The non-linear relationship between binding affinity and probability.

Author: Gary D. Stormo (9112)
Shuxiang Ruan (603590)

A C-to-A mutation at the same position, but in two different sequence contexts, causes dramatically different changes in the binding probability of the whole sequence.
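
The caption above can be illustrated with a minimal sketch, assuming a standard two-state occupancy model in which the binding probability is P = 1 / (1 + exp(E − μ)), with the energy E and the chemical potential μ in units of kT. The model form, the function names, and every numeric value below are assumptions made for illustration; none of them are taken from the listed figure.

```python
import math

def binding_probability(energy, mu):
    """Occupancy of a site under an assumed two-state model.

    Energies and mu are in units of kT; lower energy means tighter binding.
    """
    return 1.0 / (1.0 + math.exp(energy - mu))

# Hypothetical values chosen only for illustration.
MU = 0.0
MUTATION_PENALTY = 2.0                     # assumed energy cost of the C-to-A change
CONTEXTS = {"strong": -4.0, "weak": -1.0}  # assumed site energies before the mutation

def probability_drop(energy_before, penalty=MUTATION_PENALTY, mu=MU):
    """Change in binding probability caused by adding the same energy penalty."""
    before = binding_probability(energy_before, mu)
    after = binding_probability(energy_before + penalty, mu)
    return before - after

drops = {name: probability_drop(e) for name, e in CONTEXTS.items()}
for name, drop in drops.items():
    print(f"{name} context: binding probability falls by {drop:.3f}")
```

In this toy setup the same energy penalty costs the weak context far more binding probability than the strong one, because the strong site sits on the saturated part of the curve.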

The Francis Crick Institute

The energy matrix, derived probabilistic models and corresponding logos of a typical simulation.

Author: Gary D. Stormo (9112)
Shuxiang Ruan (603590)

(a) The energy logo, with the average energy for each position set to 0. (b) The energy matrix generated from simulation. (c) The information logo when μ = −3. (d) The probabilistic model derived from all binding sites when μ = −3. (Matrix elements are the frequency of each base at each position, expressed as probability × 100.) (e) The information logo when μ = 3. (f) The probabilistic model derived from all binding sites when μ = 3.
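
One way to see how a single energy matrix can yield different probabilistic models at different values of μ is the sketch below. It assumes an additive energy matrix, the two-state occupancy P = 1 / (1 + exp(E − μ)), and a uniform background over all sequences; the 3-position matrix and the helper name `site_frequencies` are hypothetical and are not the simulation behind the figure.

```python
import itertools
import math

BASES = "ACGT"

def site_frequencies(energy_matrix, mu):
    """Base frequencies per position among bound sites.

    energy_matrix: list of {base: energy} dicts, one per position (units of kT).
    Every possible sequence is weighted by its assumed binding probability.
    """
    length = len(energy_matrix)
    weights = [{b: 0.0 for b in BASES} for _ in range(length)]
    total = 0.0
    for seq in itertools.product(BASES, repeat=length):
        energy = sum(energy_matrix[i][b] for i, b in enumerate(seq))
        p_bound = 1.0 / (1.0 + math.exp(energy - mu))
        total += p_bound
        for i, b in enumerate(seq):
            weights[i][b] += p_bound
    return [{b: w[b] / total for b in BASES} for w in weights]

def toy_column(preferred):
    """Hypothetical column whose energies average to 0, as in an energy logo."""
    return {b: (-1.5 if b == preferred else 0.5) for b in BASES}

ENERGY_MATRIX = [toy_column("A"), toy_column("C"), toy_column("G")]

low_mu = site_frequencies(ENERGY_MATRIX, mu=-3.0)
high_mu = site_frequencies(ENERGY_MATRIX, mu=3.0)
for label, model in (("mu = -3", low_mu), ("mu = 3", high_mu)):
    row = "  ".join(f"{b}:{100 * model[0][b]:.0f}" for b in BASES)
    print(f"{label}, position 1 (probability x 100): {row}")
```

In this toy example the preferred base is much more enriched at μ = −3 than at μ = 3, where most sequences are bound with high probability and the frequencies approach the uniform background.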

The Francis Crick Institute

The rank correlation between the predicted and true all sequence distributions.

Author: Gary D. Stormo (9112)
Shuxiang Ruan (603590)

The rank correlation between the predicted and true all sequence distributions.

The Francis Crick Institute

The macroevolution simulation suggests the possibility that the robustness of the C. elegans lineage arose as an adaptation to necrosis but not program failure.

Author: Jian-Rong Yang (174590)
Jianzhi Zhang (14669)
Shuxiang Ruan (603590)

(A–E) Frequency distributions of (A) lineage robustness in the presence of necrosis (f_n), (B) maximum depth, (C) mean depth, (D) rare-early correlation, and (E) lineage robustness in the presence of program failure (f_p) among lineages generated from the macroevolution simulation with different intensities of selection for high f_n. The observed values from the C. elegans lineage are indicated by black arrows. Each distribution in each panel is based on 100 simulation replications. The number next to the color scheme shows the fraction of most robust lineages from which the progenitor of the next evolutionary expansion of the cell lineage is randomly chosen. That is, the lower the number, the stronger the selection. (F–J) Frequency distributions of (F) lineage robustness in the presence of program failure (f_p), (G) maximum depth, (H) mean depth, (I) rare-early correlation, and (J) lineage robustness in the presence of necrosis (f_n) among lineages generated from the macroevolution simulation with different intensities of selection for high f_p. The observed values from the C. elegans lineage are indicated by black arrows.

The Francis Crick Institute

The tendency for rare cells to have low depths improves the robustness of the C. elegans lineage.

Author: Jian-Rong Yang (174590)
Jianzhi Zhang (14669)
Shuxiang Ruan (603590)

(A) Positive correlation between the depth of a terminal cell and its cell type size. Spearman's rank correlation (ρ) for the original unbinned data and the associated P-value are presented. Error bars show one standard deviation of the depth within a cell type. Bla, blast; Epi, epithelial; Ger, germ; Gla, gland; Int, intestinal; Mus, muscle; Str, neural structural; Neu, neuron. The rare-early correlation remains strong even when the germ cells are removed (ρ = 0.515, P < 10^−38). (B) Frequency distribution of the rare-early correlation coefficient from 10,000 random lineages that have the same topology as that of C. elegans but have their terminal cells randomly relabeled. The arrow indicates the correlation coefficient for the C. elegans lineage. P-value is the probability that a random lineage generated as above has a higher rare-early correlation than that observed in C. elegans. Z-score is the number of standard deviations by which the observed correlation deviates from the expected correlation of the random lineages with the same topology. (C–D) The stronger the rare-early correlation (ρ_rare-early) in a random lineage, the higher the robustness of the lineage in the presence of (C) necrosis or (D) program failure. Although 10,000 random lineages are generated, for clarity only 1,000 are shown (grey dots). The dashed line is the linear least-squares regression of these 1,000 dots. The rank correlation between ρ_rare-early and robustness, as well as the associated P-value, is calculated from all 10,000 lineages. The C. elegans lineage is represented by a triangle.
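
The P-value and Z-score defined in panel (B) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `compare_to_null` is hypothetical, the null values are made-up numbers rather than correlations from real random lineages, and the use of the sample standard deviation is a guess, since the caption does not say which estimator was used.

```python
import statistics

def compare_to_null(observed, null_values):
    """Empirical P-value and Z-score of an observation against a null sample.

    P-value: fraction of null values strictly higher than the observation.
    Z-score: (observed - null mean) / null sample standard deviation.
    """
    null_mean = statistics.mean(null_values)
    null_sd = statistics.stdev(null_values)  # assumption: sample SD (n - 1)
    p_value = sum(v > observed for v in null_values) / len(null_values)
    z_score = (observed - null_mean) / null_sd
    return p_value, z_score

# Hypothetical statistic values from five "random lineages".
NULL_CORRELATIONS = [0.0, 0.1, -0.1, 0.2, -0.2]
OBSERVED_CORRELATION = 0.15

p_value, z_score = compare_to_null(OBSERVED_CORRELATION, NULL_CORRELATIONS)
print(f"P = {p_value:.2f}, Z = {z_score:.2f}")
```

With 10,000 random lineages, as in the caption, the smallest nonzero P-value this estimator can report is 1/10,000.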

The Francis Crick Institute

Low depths of terminal cells improve the robustness of the C. elegans lineage to necrosis and program failure.

Author: Jian-Rong Yang (174590)
Jianzhi Zhang (14669)
Shuxiang Ruan (603590)

(A) Frequency distribution of the maximum cell depth in 10,000 lineages (grey bars), which are generated by random coalescence of the terminal cells of the C. elegans lineage. The arrow indicates the observed maximum cell depth in the C. elegans lineage. P-value is the probability that the maximum depth of a random lineage is smaller than that of C. elegans. Z-score is the number of standard deviations by which the observation deviates from the mean of the random lineages. (B–C) Violin plot for the robustness of randomly generated lineages with defined maximum depths in the presence of (B) necrosis or (C) program failure. Each violin is essentially a horizontal histogram showing the relative probability densities of different robustness values of random lineages with the indicated maximum depth. The horizontal line in each violin plot shows the mean value. The real lineage is shown by a triangle. P-value is the probability that the robustness of a random lineage (with the same maximum depth as that of C. elegans) is higher than that of C. elegans. Z-score is the number of standard deviations by which the observation deviates from the mean of the random lineages. (D) Frequency distribution of the mean terminal cell depth in 5,000 lineages (grey bars), which are generated by random coalescence of the terminal cells of the C. elegans lineage with the requirement that the maximum depth is the same as in C. elegans. The arrow indicates the observed mean depth in the C. elegans lineage. P-value is the probability that the mean depth is smaller in a random lineage than in C. elegans when their maximum depths are the same. (E–F) Violin plot for the robustness of randomly generated lineages with the maximum depth equal to that of C. elegans and defined mean depths, in the presence of (E) necrosis or (F) program failure. The real lineage is indicated by a triangle. P-value is the probability that the robustness is higher in a random lineage (with the same maximum depth and similar mean depth as those of C. elegans) than in C. elegans. Z-score is the number of standard deviations by which the observation deviates from the mean of the random lineages.

The Francis Crick Institute
