The properties and sizes of the datasets used.
<p>N Seqs is the number of clusters that contained at least ten sequences.</p><p>Clusters with fewer than ten sequences were excluded from the analysis because their sample size was too small.</p>
The mean AUROC of all algorithms on all datasets using independent holdout data.
<p>This validation is unbiased.</p>
Large-scale integration of cancer microarray data identifies a robust common cancer signature-0
<p><b>Copyright information:</b></p><p>Taken from "Large-scale integration of cancer microarray data identifies a robust common cancer signature"</p><p>http://www.biomedcentral.com/1471-2105/8/275</p><p>BMC Bioinformatics 2007;8():275-275.</p><p>Published online 30 Jul 2007</p><p>PMCID:PMC1950528.</p><p></p>n) is used to illustrate the gene expression values of the signature genes in the figure. The heatmap is generated by the matrix2png software [24]. For each data set, the expression value for each gene is normalized across the samples to zero mean and one standard deviation (SD) for visualization purposes. Genes with expression levels greater than the mean are colored in red and those below the mean are colored in green. The scale indicates the number of SDs above or below the mean
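The per-gene normalization described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `zscore_rows` and `color_for` are hypothetical, the toy matrix is made up, and the use of the population SD is an assumption, since the caption does not say which SD estimator was used.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Scale each row (one gene) to zero mean and unit SD across samples."""
    normalized = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)  # assumption: population SD; the caption does not specify
        if sd == 0:
            # A constant gene carries no variation; map it to all zeros.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([(x - mu) / sd for x in row])
    return normalized


def color_for(z):
    """Red above the gene's mean, green below it, as in the caption."""
    if z > 0:
        return "red"
    if z < 0:
        return "green"
    return "neutral"


# Toy expression matrix: rows are genes, columns are samples (made-up values).
expr = [[2.0, 4.0, 6.0], [5.0, 5.0, 5.0]]
z = zscore_rows(expr)
colors = [[color_for(v) for v in row] for row in z]
```

After this scaling each value is the number of SDs above or below that gene's mean, which is what the heat map scale reports.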
The fraction of variance in dimer frequency across sequences explained by expression profile or transcription factor binding sequence set and associated F statistic P-value.
<p>For the Human Cmap data, this was assessed both for the 2,000 nucleotides upstream of the coding start site and for the intron sequences.</p>
The mean AUROC of all algorithms on all datasets based on training and testing on the same data.
<p>The optimistic bias reveals massive overfitting.</p>
Generative models are too null.
<p>Panel (a): Quantile plot of MEME E-values for approximately 15,000 random runs, with E-values excluded. The X-axis represents the E-value as reported by MEME. The Y-axis represents the quantile. For example, under our null model E-values below are reported with probability slightly more than . Panels (b) and (c): Quantile plots of LR false discovery rates, similar to the MEME E-value quantile plots, for the Beer et al. and Human Cmap datasets respectively. Panel (d): Z-score plots of A/T fraction of yeast and human intergenic sequences relative to the distribution expected under a 6th order Markov model, with the standard normal distribution (red) shown for reference.</p>
The mean holdout AUROC of the LR and ALR algorithms for non-significant (FDR > 0.05) and significant (FDR ≤ 0.05) motifs respectively.
<p>The mean holdout AUROC of the LR and ALR algorithms for non-significant (FDR &gt; 0.05) and significant (FDR &le; 0.05) motifs respectively.</p>
Merging microarray data from separate breast cancer studies provides a robust prognostic test-1
Heat map is generated using the matrix2png software [34]. There are 80 rows corresponding to the 80 gene pairs; the displayed intensities are the differences between the expression values of the two genes in each pair. The expression value for each difference is normalized across the samples to zero mean and one standard deviation (SD) for visualization purposes. Differences with expression levels greater than the mean are colored in red and those below the mean are colored in green. The scale indicates the number of SDs above or below the mean.<p><b>Copyright information:</b></p><p>Taken from "Merging microarray data from separate breast cancer studies provides a robust prognostic test"</p><p>http://www.biomedcentral.com/1471-2105/9/125</p><p>BMC Bioinformatics 2008;9():125-125.</p><p>Published online 27 Feb 2008</p><p>PMCID:PMC2409450.</p><p></p>
Merging microarray data from separate breast cancer studies provides a robust prognostic test-2
Itan patients between the good-outcome group and the poor-outcome group. The LRT is based on the integrated data in (A) and the single, Wang data set in (B). CI denotes confidence interval and the P-value is calculated by the log-rank test.<p><b>Copyright information:</b></p><p>Taken from "Merging microarray data from separate breast cancer studies provides a robust prognostic test"</p><p>http://www.biomedcentral.com/1471-2105/9/125</p><p>BMC Bioinformatics 2008;9():125-125.</p><p>Published online 27 Feb 2008</p><p>PMCID:PMC2409450.</p><p></p>
Inter-study validation and randomized cross-validation performance.
<p>The graphs show ISV and RCV results from SVM (A) and ISSAC (B). For clarity, the Study ID labels have been excluded from this visualization (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0110840#pone.0110840.s002" target="_blank">Text S1</a> for expanded versions of these plots that include the individual Study ID labels). The colored bars report sensitivities achieved on the validation study designated in the horizontal axis (e.g., the bar on the farthest left in (A) shows that 74% of ADC samples in the first ADC study are correctly classified by SVM when that study is excluded from training). The order of studies in the horizontal axis is identical for panels (A) and (B). Dashed lines represent average ISV sensitivities for each phenotype. Solid lines report corresponding ten-fold RCV sensitivities of each phenotype.</p>
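The inter-study validation loop described in this caption (hold one study out, train on the rest, report sensitivity on the held-out study) can be sketched as below. This is an illustration under stated assumptions, not the published pipeline: a nearest-centroid classifier stands in for SVM and ISSAC, all function names and the toy data are hypothetical, and each study is assumed to contain samples of a single phenotype, so the fraction classified correctly equals that study's sensitivity.

```python
def nearest_centroid_fit(samples):
    """samples: list of (features, label). Returns {label: centroid}."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def inter_study_validation(studies):
    """studies: {study_id: list of (features, label)}.

    For each study, train on all other studies and return the fraction of
    the held-out study's samples that are classified correctly.
    """
    results = {}
    for held_out, test in studies.items():
        train = [s for sid, ss in studies.items() if sid != held_out for s in ss]
        model = nearest_centroid_fit(train)
        correct = sum(nearest_centroid_predict(model, x) == y for x, y in test)
        results[held_out] = correct / len(test)
    return results


# Toy data: four hypothetical studies, two per phenotype (made-up values).
studies = {
    "adc_1": [([1.0, 0.0], "ADC"), ([0.9, 0.1], "ADC")],
    "adc_2": [([1.1, 0.2], "ADC"), ([0.1, 0.9], "ADC")],
    "scc_1": [([0.0, 1.0], "SCC"), ([0.2, 1.1], "SCC")],
    "scc_2": [([0.1, 0.8], "SCC"), ([0.0, 1.2], "SCC")],
}
results = inter_study_validation(studies)
```

Averaging the per-study values within a phenotype would give a number comparable to the dashed lines in the figure. Randomized cross-validation differs in that it pools all samples and splits them at random, so samples from the same study can appear in both training and test folds.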