Summarizing the test accuracy of <i>L</i><sub>0</sub> classifiers for different levels of evidence (<i>F</i><sub>1</sub>: mRNA expression, <i>F</i><sub>2</sub>: miRNA expression, <i>F</i><sub>3</sub>: DNA methylation, <i>F</i><sub>4</sub>: Protein expression, <i>F</i><sub><i>AE</i></sub>: Features from bottleneck layer of autoencoder, SVM: Support vector machine, RF: Random forest, FFNN: Feed-forward neural network).
S1 Graphical abstract
Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. It is therefore crucial to identify novel subgroups of patients with similar molecular characteristics. A better treatment strategy can be proposed when patient heterogeneity is accounted for during subgroup identification, irrespective of the tissue of origin. This work proposes a machine learning (ML) based pipeline for subgroup identification in pan-cancer. Here, mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. The projected data were then clustered to identify novel multi-omics-based subgroups. The clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value …
Mutation analysis of ML clusters.
(a) Non-silent mutations in each cluster, sorted by median. (b) Heatmap showing the average value of each mutational signature in each cluster.
Negatively enriched pathways in each subgroup obtained by GSEA analysis using hallmark geneset (FDR q≤0.05).
Figure showing the proportion of samples from each tumor type in each ML cluster.
Algorithm for computing the values of <i>α</i>, <i>β</i>, and <i>γ</i> used to obtain the prediction probabilities for the linear decision-level fused classification model.
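The algorithm itself is not reproduced in this listing. As a rough illustration only, linear decision-level fusion can be sketched as a weighted sum of per-classifier class probabilities, with the weights chosen by a grid search on held-out data. Everything below is an assumption made for illustration: that α, β, and γ weight three classifiers (labelled SVM, RF, and FFNN here), that they are non-negative and sum to 1, that a grid search is used, and the toy probabilities and function names themselves.

```python
from itertools import product


def fuse(probs, weights):
    """Weighted sum of one sample's class-probability vectors (one per classifier)."""
    n_classes = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs)) for c in range(n_classes)]


def accuracy(prob_sets, labels, weights):
    """prob_sets[k][i] is classifier k's probability vector for sample i."""
    correct = 0
    for i, y in enumerate(labels):
        fused = fuse([ps[i] for ps in prob_sets], weights)
        pred = max(range(len(fused)), key=fused.__getitem__)
        correct += int(pred == y)
    return correct / len(labels)


def search_weights(prob_sets, labels, steps=10):
    """Grid search over (alpha, beta, gamma) with alpha + beta + gamma = 1 (assumed constraint)."""
    best_w, best_acc = None, -1.0
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue  # gamma would be negative
        w = (a / steps, b / steps, (steps - a - b) / steps)
        acc = accuracy(prob_sets, labels, w)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Hypothetical validation-set probabilities for 4 samples and 2 classes.
labels = [0, 1, 0, 1]
svm = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
rf = [[0.4, 0.6], [0.3, 0.7], [0.6, 0.4], [0.4, 0.6]]
ffnn = [[0.6, 0.4], [0.45, 0.55], [0.45, 0.55], [0.3, 0.7]]

best_w, best_acc = search_weights([svm, rf, ffnn], labels)
print("weights (alpha, beta, gamma):", best_w, "accuracy:", best_acc)
```

Because the grid includes the corner weights (1, 0, 0), (0, 1, 0), and (0, 0, 1), the fused model selected this way is never less accurate on the search data than the best single classifier; whether that holds on unseen data depends on the size of the held-out set.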
Heatmap showing the distribution of LM22 immune cells in each cluster.
Steps followed in preprocessing each data type for molecular characterization.
Immune cells from LM22 gene signature enriched in each subgroup identified by Cibersort analysis.
Positively enriched pathways in each subgroup obtained by GSEA analysis using hallmark geneset (FDR q≤0.05).
