62 research outputs found
Identifying Subspace Gene Clusters from Microarray Data Using Low-Rank Representation
<div><p>Identifying subspace gene clusters from gene expression data is useful for discovering novel functional gene interactions. In this paper, we propose using low-rank representation (LRR) to identify subspace gene clusters from microarray data. LRR seeks the lowest-rank representation among all candidates that can represent the genes as linear combinations of the bases in the dataset. Clusters can be extracted from the block-diagonal representation matrix obtained using LRR, and they capture the intrinsic patterns of genes with similar functions well. Meanwhile, the LRR parameter balances the effect of noise, so the method can extract useful information from data with a high level of background noise. Compared with traditional methods, our approach can identify genes that have similar functions but dissimilar expression profiles. It can also assign one gene to several clusters. Moreover, our method is robust to noise and can identify more biologically relevant gene clusters. When applied to three public datasets, the results show that the LRR-based method is superior to existing methods for identifying subspace gene clusters.</p></div>
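To make the block-diagonal idea in the abstract concrete, here is a minimal sketch of the noise-free special case only. It relies on the known closed-form minimizer of the nuclear norm subject to X = XZ, namely Z = VVᵀ, where V holds the right singular vectors of X. The noise-balancing parameter mentioned in the abstract is not modelled. The function names `lrr_noiseless` and `components`, the synthetic data, and the connected-components step used to read clusters off the affinity matrix are illustrative assumptions, not the authors' implementation, and this simplified step cannot assign one gene to several clusters.

```python
import numpy as np


def lrr_noiseless(X, tol=1e-8):
    """Closed-form solution of min ||Z||_* s.t. X = X Z (noise-free case).

    X has shape (n_conditions, n_genes); each column is one gene.
    Returns Z = V_r V_r^T, built from the top-r right singular vectors.
    """
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))  # numerical rank of X
    v = vt[:rank].T                     # shape (n_genes, rank)
    return v @ v.T


def components(A, thresh=1e-6):
    """Label connected components of the graph with edges where A > thresh."""
    n = A.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(A[i] > thresh):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Synthetic data (an assumption for illustration): two groups of 5 genes,
# each group lying in its own 2-dimensional subspace of a 10-condition space.
rng = np.random.default_rng(0)
n_conditions, dim, per_cluster = 10, 2, 5
blocks = [
    rng.standard_normal((n_conditions, dim)) @ rng.standard_normal((dim, per_cluster))
    for _ in range(2)
]
X = np.hstack(blocks)

Z = lrr_noiseless(X)
A = np.abs(Z) + np.abs(Z.T)  # symmetric affinity between genes
labels = components(A)
print(labels)
```

With independent subspaces and no noise, the off-diagonal blocks of Z vanish up to floating-point error, which is why a simple graph search is enough here. Noisy data would call for the regularized LRR formulation and a more robust step such as spectral clustering.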
Enriched combinations of significant annotations of Molecular Function of Cluster C17: (A) pie graph, (B) bar graph.
Enriched combinations of significant annotations of Biological Process of Cluster C17: (A) pie graph, (B) bar graph.
Two heatmaps of expression values of genes from the yeast dataset analyzed by the proposed algorithm: (A) genes in Cluster C17, which show similar expression patterns in different samples; (B) genes in Cluster C14, which show different expression patterns in different samples (denoted as a and b).
The most enriched categories of modular enrichment in each gene cluster uncovered by GPCA from the yeast dataset.
<p>The columns of the table summarize the total sizes of the cluster (numbers in parentheses), the number of genes annotated in the cluster, the GO categories associated with the cluster, and the <i>P</i>-value after FDR correction.</p>
Clustering and subspace clustering of a gene expression matrix: (A) a gene cluster must contain all columns, (B) subspace clusters correspond to arbitrary subsets of rows and columns, shown here as rectangles.
Comparison of statistical significance of enriched functional categories in gene clusters uncovered by LRR and <i>K</i>-means from the yeast dataset.
<p>Only selected common significantly enriched functional categories are presented. The columns of the table summarize the GO categories associated with the cluster, the <i>P</i>-values after FDR correction by each approach, and the number of genes in the cluster that are annotated with the corresponding GO category versus the total size of the cluster (numbers in parentheses).</p>
The most enriched GO categories of modular enrichment in each gene cluster uncovered by LRR from the yeast dataset.
<p>The columns of the table summarize the total sizes of the module (numbers in parentheses), the number of genes annotated in the cluster, the GO categories associated with the cluster, and the <i>P</i>-value after FDR correction.</p>
ROC curves for synthetic data. (SNR denotes the signal-to-noise ratio).
Singular enrichment of GO (or KEGG) categories in gene clusters uncovered by LRR from the yeast dataset.
<p>Only significantly enriched functional categories (corrected <i>P</i>-value < 10<sup>−20</sup>) are presented. The columns of the table summarize the total sizes of the cluster (numbers in parentheses), the number of annotated genes in the cluster, the <i>P</i>-value after FDR correction, and the GO categories associated with the cluster.</p>
- …