Science, Conscience, and Environment: Thinking About the Complex World
<p><b>A)</b> Violin plot of the response to the mTOR inhibitor Temsirolimus (IC50 values, y-axis) across cancer cell lines. The cell lines are grouped (x-axis) as wild-type PTEN (WT), shown with gray markers and a gray violin outline; cell lines with any non-synonymous mutation in PTEN (Any), shown in black; and cell lines with a mutation in one of the hotspots, shown in blue. Mutation clusters and any non-synonymous mutation features that are significantly associated with drug response (FDR<10%) are shown in red. <b>B)</b> Violin plot of the response to a PI3Kb inhibitor for mutation clusters in PIK3CA.</p>
Gene Expression Association Statistics.
<p><b>A)</b> The number of cluster features (meaning clusters in the context of a tumor type) analyzed for gene expression association, broken down by tumor type and by clustering method. A single cluster can be analyzed multiple times if it has more than 4 mutations in each of several tumor types. <b>B)</b> The number of cluster features with significant global gene expression associations (solid) and pathway gene expression associations (hatched). Two significance levels are shown for each method (1% in lighter colors, 10% in darker colors). Here a cluster feature with multiple associations is counted only once. <b>C)</b> The number of significant global gene expression associations found by each method at the two significance levels (1% in lighter colors, 10% in darker colors). Hatched bars indicate associations with lower P-values than the corresponding any non-synonymous feature for the same gene. Here cluster features with multiple associations are counted multiple times. <b>D)</b> Same as (C) but for pathway gene expression associations.</p>
Multiscale Information-Based Clustering Algorithm.
<p><b>A)</b> Pan-cancer mutation data are merged across all 23 tumor types for a single gene (PTEN). <b>B)</b> Gaussian kernel density estimates smooth these data at 28 different bandwidths or scales (a limited selection is shown for clarity). <b>C)</b> Each kernel density estimate is used to seed a multivariate mixture model of normal distributions plus a single uniform distribution that represents background noise. Initial guesses for the locations of the normal distributions are taken from the local maxima of the kernel density estimates. Clusters from the mixture models (blue) are merged using a greedy algorithm, resulting in a final set of multiscale clusters (red). Green clusters are duplicates of the red clusters, shown to clarify the process. Gray bars are excluded because they contain too few mutations. <b>D)</b> A mutation spectrum for PTEN. <b>E)</b> The two annotated protein domains in PTEN from PFAM.</p>
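The seeding step in panels B and C can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the function names, the integer residue grid, the toy mutation positions, and the two bandwidths are all assumptions, and the mixture-model fit and the greedy merge are left out.

```python
import math

def gaussian_kde(positions, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(positions) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions)
        for x in grid
    ]

def local_maxima(grid, density):
    """Return grid points whose density exceeds the left neighbor and is
    at least as large as the right neighbor (interior points only)."""
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if density[i - 1] < density[i] >= density[i + 1]
    ]

def seeds_by_scale(positions, bandwidths, protein_length):
    """Map each bandwidth to the candidate cluster centers found at that scale.

    These centers would serve as initial guesses for the normal components
    of a mixture model; fitting that model is not shown here.
    """
    grid = list(range(1, protein_length + 1))  # one grid point per residue (assumption)
    return {
        h: local_maxima(grid, gaussian_kde(positions, h, grid))
        for h in bandwidths
    }

# Hypothetical mutation positions forming two tight groups on a 60-residue protein.
toy_positions = [10, 10, 11, 12, 50, 51, 51, 52]
seeds = seeds_by_scale(toy_positions, bandwidths=[2, 30], protein_length=60)
```

With these toy data the narrow bandwidth separates the two groups, while the wide bandwidth smooths them into a single peak, which is why evaluating many scales yields candidate clusters of different widths.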
Differential Pathway Associations for PTEN clusters in Uterine Corpus Endometrial Carcinoma.
<p>Clusters without significant pathway associations are omitted for clarity. A false discovery rate of 1% was used to filter for significance.</p>
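The caption does not say how the false discovery rate was controlled. As one common possibility, the sketch below applies the Benjamini-Hochberg adjustment and keeps associations whose adjusted value is at most 1%. The choice of Benjamini-Hochberg, the function name, and the P-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical association P-values, filtered at a 1% false discovery rate.
toy_p_values = [0.0001, 0.04, 0.03, 0.002]
q_values = benjamini_hochberg(toy_p_values)
significant = [i for i, q in enumerate(q_values) if q <= 0.01]
```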
Clusters Highlighted in Protein Structures.
<p><b>A)</b> PIK3CA (gray) bound to PIK3R1 (orange). PIK3CA has two clusters (539–547 in green and 1043–1049 in blue) with very different global gene expression association significance levels in Breast Cancer (BRCA), as discussed in the text. <b>B)</b> Residues 30–40 of CTNNB1 (blue) bound to BTRC (gray). This region of Beta-catenin lies inside the 25–45 cluster, which contains degradation-regulating phosphorylated amino acids and is strongly associated with global gene expression changes in uterine corpus endometrial carcinoma (UCEC) and liver hepatocellular carcinoma (LIHC). Bottom bars in both plots show linear protein sequences with additional clusters in dark gray and PFAM protein domains in light gray. Mutation count histograms are shown for specific tumor types above the sequence, with green dots representing synonymous mutations, blue dots representing missense mutations, and yellow dots representing nonsense mutations. Protein images created using UCSF Chimera [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005347#pcbi.1005347.ref026" target="_blank">26</a>].</p>
Cluster Statistics Comparing Multiscale Clusters from M<sup>2</sup>C with the DBSCAN-based method OncodriveCLUST and Pfam domains.
<p><b>A)</b> Cluster length histogram. <b>B)</b> Cluster mutation count histogram using all mutation types. <b>C)</b> Histogram of coverage by the competing method and by Pfam domains. Cluster X is said to overlap cluster Y if over 50% of cluster X is covered by cluster Y. <b>D)</b> Cross-validation of mixture models: each circle shows, for a single gene, the log-likelihood that the mixture model trained on partition 1 generated the data in partition 2 (red circles). The opposite analysis, using partition 2’s mixture model to generate data from partition 1, is also shown (purple x’s).</p>
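The overlap rule in panel C is directional, which a short sketch makes concrete. The function names are hypothetical, and treating clusters as inclusive residue ranges is an assumption; the second interval in the example is invented.

```python
def covered_fraction(x, y):
    """Fraction of inclusive residue interval x = (start, end) covered by y."""
    x_start, x_end = x
    y_start, y_end = y
    length = x_end - x_start + 1
    shared = min(x_end, y_end) - max(x_start, y_start) + 1
    return max(shared, 0) / length

def overlaps(x, y, threshold=0.5):
    """True if more than `threshold` of cluster x is covered by cluster y."""
    return covered_fraction(x, y) > threshold

# A short cluster mostly inside a longer one overlaps it, but not vice versa.
short_cluster = (539, 547)
long_cluster = (540, 560)   # hypothetical interval for illustration
forward = overlaps(short_cluster, long_cluster)
reverse = overlaps(long_cluster, short_cluster)
```

Here 8 of the 9 residues of the short cluster fall inside the long one, while only 8 of the long cluster's 21 residues fall inside the short one, so the relation holds in one direction only.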
Method Illustration on PIK3CA in Breast Cancer: From top to bottom.
<p><b>A)</b> Protein domains in PIK3CA. <b>B)</b> In gray: mutation histogram showing all mutations across the various cancer types; these data were used to generate the multiscale clusters. In blue: breast cancer mutation histogram showing all non-synonymous mutations; these data were used to assign mutation features to breast cancer tumor samples. <b>C)</b> Mutation clusters identified by the multiscale information-based clustering algorithm (M<sup>2</sup>C). Gray clusters have fewer than 5 mutations in breast cancer and are excluded from downstream analysis. Green clusters are assigned to breast cancer. <b>D)</b> LRP8 gene expression levels in breast cancer, with samples grouped by mutation cluster. From left to right: “wild-type” (i.e., no non-synonymous mutations, including tumors with no mutations at all), any non-synonymous mutation feature, and the seven mutation clusters assigned to breast cancer. <b>E)</b> Pathway association P-value heatmap showing differential pathway associations between clusters. Both the LIS1 in Neuronal Migration and Development Pathway and the Reelin Signaling Pathway include LRP8.</p>
Statistical Methods Pipeline.
<p><b>A)</b> 549 genes with a total of 33507 pan-cancer mutations are run through our multiscale clustering algorithm, resulting in 1295 clusters. <b>B)</b> Clusters are assigned to 4471 tumor samples across 23 tumor types, creating a binary feature matrix. A tumor sample is said to be positive for a cluster if the tumor has any non-synonymous mutation within the cluster. <b>C)</b> The binary feature matrices are statistically compared to 2194 gene expression features, separately for each cancer type, using the Kruskal-Wallis Test. <b>D)</b> The pairwise P-values from the Kruskal-Wallis tests are combined globally and at the pathway level, across 172 pathways, using the Empirical Brown’s Method. <b>E)</b> This resulted in 546810 association P-values.</p>
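Steps B and C can be illustrated with a small standard-library sketch. It assumes clusters are inclusive residue ranges and that each test compares two groups (samples positive versus negative for one cluster feature), so the chi-square approximation has one degree of freedom. All names and data are hypothetical, and the Empirical Brown's Method step is not shown.

```python
import math

def cluster_features(mutations, clusters):
    """Build a binary feature table.

    mutations: {sample: [positions of non-synonymous mutations]}
    clusters:  {cluster name: (start, end)}, inclusive (assumption)
    Returns {sample: {cluster name: 0 or 1}}.
    """
    return {
        sample: {
            name: int(any(start <= pos <= end for pos in positions))
            for name, (start, end) in clusters.items()
        }
        for sample, positions in mutations.items()
    }

def kruskal_wallis_two_groups(a, b):
    """Tie-corrected Kruskal-Wallis H and chi-square (1 df) P-value.

    Assumes the pooled values are not all identical, otherwise the tie
    correction divides by zero.
    """
    pooled = sorted((value, group) for group, data in enumerate((a, b)) for value in data)
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += average_rank
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = 12.0 / (n * (n + 1)) * (
        rank_sums[0] ** 2 / len(a) + rank_sums[1] ** 2 / len(b)
    ) - 3.0 * (n + 1)
    h = max(h / (1.0 - tie_term / (n ** 3 - n)), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value

# Hypothetical samples, clusters, and expression values for one gene.
toy_mutations = {"s1": [545], "s2": [1047, 10], "s3": []}
toy_clusters = {"539-547": (539, 547), "1043-1049": (1043, 1049)}
features = cluster_features(toy_mutations, toy_clusters)
h_stat, p_value = kruskal_wallis_two_groups([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In the pipeline described above, the two groups passed to the test would be the expression values of one gene in samples with feature value 1 and in samples with feature value 0, repeated for every cluster and expression feature within a cancer type.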
Correlation between Clover scores and observed TF binding.
<p>Plots show the relationship between Clover scores (y axes) and ChIP-seq counts (x axes) for motifs for IRF1 (V$IRF8) and PU.1/SPI1 (V$PU1) for all eight clusters at the 0, 2 and 4 h time points. Panels A-C show Clover scores and observed counts for enriched motifs (blue diamonds), the correlation for those motifs (black line), and Clover scores and counts for motifs that are not enriched (red squares). Panels D-F show Clover scores and ChIP-seq counts for the SPI1 motif separately for each cluster (three time points for each cluster).</p>
Enrichment test results for ChIP-seq tags for specific TFs in HAc-valley regulatory elements within ±5 kbp of TSSs for genes in specific clusters.