10 research outputs found
Additional file 1 of Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations
Additional material as referenced in the text. (PDF 257 kb)
Relationship between weights under a linear classifier and chi-square values used in feature selection.
SVM models were constructed on 542 study samples with genotype data for a subset of 200 SNPs chosen for their association with ER+/− status, as measured by the chi-square statistic. SNP feature weights were obtained from the linear SVM model and used as an indicator of the importance of each feature for classification; SNPs with the largest absolute weights are the most important. The chi-square values used in feature selection and the SVM classifier weights are uncorrelated (Pearson’s correlation coefficient r = −0.026). SNPs with absolute weights > 0.5 are annotated with the name of the gene in which they reside or to which they lie closest.
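The two quantities compared in this caption can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 and class labels given as strings; the function names and the toy data are hypothetical, and this is not the Weka workflow used in the study. It computes a chi-square statistic per SNP from the genotype-by-class contingency table, and a Pearson correlation that could then be applied to chi-square values and classifier weights.

```python
from math import sqrt


def chi_square(genotypes, labels):
    """Pearson chi-square statistic for a genotype-by-class contingency table."""
    rows = sorted(set(genotypes))
    cols = sorted(set(labels))
    n = len(labels)
    # Observed counts for every (genotype, class) cell.
    observed = {(g, c): 0 for g in rows for c in cols}
    for g, c in zip(genotypes, labels):
        observed[(g, c)] += 1
    row_tot = {g: sum(observed[(g, c)] for c in cols) for g in rows}
    col_tot = {c: sum(observed[(g, c)] for g in rows) for c in cols}
    stat = 0.0
    for g in rows:
        for c in cols:
            expected = row_tot[g] * col_tot[c] / n
            stat += (observed[(g, c)] - expected) ** 2 / expected
    return stat


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def top_k_by_chi_square(snp_columns, labels, k):
    """Rank SNPs by chi-square statistic and keep the k strongest."""
    scores = {name: chi_square(col, labels) for name, col in snp_columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data (hypothetical): six samples, two SNPs.
labels = ["ER+", "ER+", "ER+", "ER-", "ER-", "ER-"]
snps = {
    "snp_a": [0, 0, 0, 2, 2, 2],  # genotype tracks the class exactly
    "snp_b": [0, 1, 2, 0, 1, 2],  # genotype independent of the class
}
selected, scores = top_k_by_chi_square(snps, labels, k=1)
```

Given a list of chi-square values and a matching list of weights from a fitted linear classifier, `pearson_r(chi_values, weights)` would yield the kind of coefficient reported in the caption. A value near zero means that univariate association strength does not predict a feature's weight in the multivariate model.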
ROC curves for ER+ and ER− classification using linear and RBF kernels.
ROC curves and area under the ROC curve (AUC) values provide more robust measures of classifier performance than overall classification accuracy. (A) ROC curves for ER+ classification. (B) ROC curves for ER− classification. In both panels the linear model is shown as a dashed line and the RBF kernel model as a solid line. The point on each curve corresponds to the true positive/negative and false positive/negative values obtained from 100 iterations of 10-fold cross-validation carried out on 542 samples with 200 SNP features. The ROC curve of any meaningful classifier must lie above the line y = x, which represents the case where equal proportions of cases are classified correctly and incorrectly, as would occur if class labels were assigned at random.
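The relationship between a ROC curve, its AUC, and the y = x reference line can be sketched as follows. This is an illustrative example only: it assumes binary labels (1 = positive class) and real-valued classifier scores, uses hypothetical function names, and does not reproduce the cross-validation procedure described in the caption.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points,
    sweeping the decision threshold from the highest score to the lowest."""
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = order[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(order) and order[i][0] == threshold:
            if order[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# A classifier that ranks every positive above every negative.
perfect = auc(roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))

# A classifier that gives every sample the same score carries no information:
# its ROC curve is the y = x diagonal.
uninformative = auc(roc_points([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```

A curve lying above the diagonal has an AUC greater than 0.5, which is why the AUC is a convenient single-number summary of how far a classifier departs from random assignment.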
Significant enrichment of genes in KEGG pathway identified by DAVID.
DAVID Annotation Clusters: Enriched gene ontology (GO) terms from the ER+/− classification.
Weka kernels and classification results using 200 SNPs with the strongest ER+/− association.
Replication of most significant associations from the discovery set meta-analysis in the replication samples.
Results are presented for those SNPs that remained associated in the same direction in the validation set as in the discovery set (adjusted for ER status).
Associations of SNPs with nominal replication signals with clinical characteristics associated with breast cancer in a pooled set of discovery and replication cohorts.
N-stage = metastasis to lymph node, M-stage = metastasis stage, and T-stage = tumour stage.
Manhattan plot of results from genome-wide meta-analysis of POSH stage-1 and HEBCS hazard ratios and 95% confidence intervals.
The 25 most strongly associated SNPs are highlighted in green.
Kaplan-Meier plots depicting breast cancer related survival in response to rs421379 genotypes in pooled POSH stage-1, HEBCS and POSH stage-2 samples.