Average AUC for each data set and algorithm over all <i>p</i>-value thresholds.
<p>All values are AUCs averaged over all <i>p</i>-value thresholds for each data set and algorithm. The last row shows the average AUC for each data set over all <i>p</i>-value thresholds and algorithms.</p>
Rank differences and <i>p</i>-values for pair-wise comparison of encodings.
<p>For each pair-wise comparison of encodings, the rank difference was calculated as the difference of the average ranks over all data sets and algorithms. The first encoding of each hypothesis is the better one (lower rank). <i>p</i>-values were corrected using Shaffer’s static method.</p>
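The rank-difference calculation described above can be illustrated with a minimal sketch. The AUC values, the encoding names (`enc_A`, `enc_B`, `enc_C`) and the helper functions are hypothetical and not taken from the source; the sketch assumes that a higher AUC receives a lower (better) rank and that ties share the mean rank. The Shaffer correction of the <i>p</i>-values is not shown.

```python
from itertools import combinations


def rank_row(row):
    """Rank one row of scores: highest score gets rank 1, ties share the mean rank."""
    order = sorted(range(len(row)), key=lambda i: -row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over all entries tied with the entry at position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def average_ranks(table):
    """Average the per-row ranks column by column."""
    ranked = [rank_row(row) for row in table]
    return [sum(col) / len(ranked) for col in zip(*ranked)]


def rank_differences(names, avg):
    """Return (better, worse, difference) for every pair, better treatment first."""
    result = []
    for a, b in combinations(range(len(names)), 2):
        lo, hi = (a, b) if avg[a] <= avg[b] else (b, a)
        result.append((names[lo], names[hi], avg[hi] - avg[lo]))
    return result


# Made-up AUCs: one row per data set/algorithm combination, one column per encoding.
names = ["enc_A", "enc_B", "enc_C"]
table = [
    [0.70, 0.65, 0.60],
    [0.80, 0.82, 0.75],
    [0.66, 0.66, 0.60],
    [0.90, 0.85, 0.80],
]
avg = average_ranks(table)
diffs = rank_differences(names, avg)
for better, worse, diff in diffs:
    print(f"{better} vs. {worse}: rank difference {diff:.3f}")
```

Listing the better treatment first mirrors the convention stated in the caption, so every reported difference is non-negative.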
Rank differences and <i>p</i>-values for pair-wise comparison of classification algorithms.
<p>For each pair-wise comparison of classification algorithms, the rank difference was calculated as the difference of the average ranks over all data sets. The first algorithm of each hypothesis is the better one (lower rank). <i>p</i>-values were corrected using Shaffer’s static method.</p>
Average ranks of the seven classification algorithms.
<p>The average ranks of the Friedman test for the seven different classifiers using the additive encoding (smaller values are better). The result of the Friedman test over all data sets is significant (<i>p</i> &lt; 10<sup>−15</sup> for <i>k</i> = 7, <i>n</i> = 42). The table also shows the average ranks for each data set separately, but the Friedman test is not applicable there because the number of treatments is larger than the number of problems (<i>k</i> = 7, <i>n</i> = 6).</p>
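As a rough illustration of how average ranks enter the Friedman test, the sketch below computes the textbook Friedman chi-square statistic from <i>k</i> average ranks over <i>n</i> problems. The function name and the example ranks are hypothetical, no tie correction is applied, and the source does not state which implementation was used. For real data, `scipy.stats.friedmanchisquare` computes the statistic together with a <i>p</i>-value.

```python
def friedman_statistic(avg_ranks, n):
    """Friedman chi-square statistic (no tie correction).

    avg_ranks: average rank of each of the k treatments
    n:         number of problems (blocks) the ranks were averaged over
    """
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12.0 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


# Made-up example with k = 3 treatments ranked identically on n = 4 problems.
perfect = friedman_statistic([1.0, 2.0, 3.0], n=4)
# If all treatments have the same average rank, the statistic is zero.
no_effect = friedman_statistic([2.0, 2.0, 2.0], n=4)
print(perfect, no_effect)
```

Under the null hypothesis the statistic is compared against a chi-square distribution with <i>k</i> − 1 degrees of freedom. That approximation is only reasonable when <i>n</i> is sufficiently large, which is consistent with the caption's remark about the per-data-set case.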
Comparison of classification algorithms.
<p>The seven classification algorithms compared by their rank distance over all disease data sets using the additive encoding. A connecting line between algorithms means that the null hypothesis of no difference between them could not be rejected (with <i>α</i> = 0.001).</p>
Average number of SNPs reaching the specified <i>p</i>-value threshold per data set.
<p>Average number of SNPs reaching the specified <i>p</i>-value thresholds for at least one of the tests for genome-wide association. Numbers are averaged over all 10 results of the 5 × 2 cross-validations and rounded to one decimal place.</p>
Comparison of encodings per disease data set.
<p>The three encodings compared by their rank distance over all data sets and classifiers (a) and grouped by disease data set. A connecting line between encodings means that the null hypothesis of no difference between them could not be rejected (<i>α</i> = 0.001). Only data sets for which the Friedman test rejected the null hypothesis are shown.</p>
Evaluation of EFS-based and SR-based signatures on TG-GATEs data.
<p>The ROC curves obtained from different cross-validation folds were averaged based on the thresholds for class discrimination and drawn separately for each of the six classification methods. The classifiers evaluated here were trained on features selected using our (<b>A–C</b>) EFS methodology in conjunction with (<b>A</b>) the standard gene selection methods Golub-Ratio, PAM, SVM and RFE, (<b>B</b>) the statistical inference methods t-test, Wilcoxon rank-sum test and permutation test or (<b>C</b>) all previously stated methods. (<b>D</b>) The prediction accuracy was also determined for the SR signature-based models and the corresponding ROC curves were generated as described previously.</p>
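The caption does not spell out the averaging procedure. One common reading of "averaged based on the thresholds" is threshold averaging: for each decision threshold, the ROC point of every fold is computed and the points are averaged. The sketch below follows that reading. The scores, labels, thresholds and function names are made up for illustration, and each fold is assumed to contain both classes.

```python
def roc_point(scores, labels, threshold):
    """(FPR, TPR) of one fold when samples with score >= threshold are called positive."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return fp / neg, tp / pos


def threshold_averaged_roc(folds, thresholds):
    """Average the per-fold ROC points at each threshold."""
    curve = []
    for t in thresholds:
        points = [roc_point(scores, labels, t) for scores, labels in folds]
        mean_fpr = sum(p[0] for p in points) / len(points)
        mean_tpr = sum(p[1] for p in points) / len(points)
        curve.append((t, mean_fpr, mean_tpr))
    return curve


# Two hypothetical cross-validation folds: (classifier scores, true labels).
folds = [
    ([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0]),
    ([0.7, 0.6, 0.3, 0.1], [1, 1, 0, 0]),
]
curve = threshold_averaged_roc(folds, thresholds=[0.0, 0.5, 1.0])
for t, fpr, tpr in curve:
    print(f"threshold {t:.1f}: FPR {fpr:.2f}, TPR {tpr:.2f}")
```

Averaging at fixed thresholds keeps each averaged point tied to one operating point of the classifier, whereas averaging at fixed false-positive rates would mix different thresholds across folds.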
Expression profiles of EFS signature genes.
<p>Shown is a heatmap depicting the fold-changes of the top 10 informative genes from the predicted EFS signature. Rows represent genes and columns represent treatment groups. Cell colors indicate the strength and direction of differential expression relative to the corresponding control groups (red: upregulation, green: downregulation). Treatment groups which belong to different compound classes are separated by solid vertical lines. The respective classes are indicated by the color bar on top of the heatmap.</p>
Separation and classification of compounds based on EFS and SR signature.
<p>(<b>A</b>) The dots correspond to different treatment groups and are colored according to the classes of the compounds used for treatment. Each treatment group was originally represented by a vector composed of the fold-changes of the 54 signature genes measured after 14 days of repeated dosing. To inspect the compound-specific expression profiles in a lower-dimensional space, these vectors were projected onto the first and second principal components resulting from PCA. To highlight clusters of NGCs and NCs, convex hulls were drawn around the respective compounds. The compounds WY, MP and MCT were considered as undefined, due to ambiguous outcomes of published studies. (<b>B</b>) PCA plot similar to (A), but generated on the basis of the SR signature. (<b>C</b>) The heatmaps depict the confidence of the predictions made by diverse classifiers for assessing the carcinogenic potential of GCs (AAF, DEN) and undefined compounds (MP, WY, MCT). Columns represent compounds and rows correspond to classifiers. The compound classes are indicated by the color bar on top. The discrimination between carcinogens (blue) and non-carcinogens (green) was done based on the EFS signature. (<b>D</b>) Toxicogenomics-based assessment of the carcinogenic potential of GCs and undefined compounds using diverse classifiers which incorporate the SR signature genes as predictive features.</p>
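The projection in panel (A) can be sketched as follows, assuming PCA on mean-centered fold-change vectors computed via a singular value decomposition. The data are random placeholders, the number of treatment groups (12) and the function name are hypothetical, and the source does not say whether the features were additionally scaled.

```python
import numpy as np


def project_to_two_pcs(X):
    """Project the rows of X onto the first two principal components.

    X: array of shape (n_groups, n_genes), one fold-change vector per treatment group.
    """
    Xc = X - X.mean(axis=0)  # center each gene (column) at zero
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T


# Placeholder data: 12 hypothetical treatment groups x 54 signature genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 54))
coords = project_to_two_pcs(X)
print(coords.shape)
```

The resulting two-column array gives the plotting coordinates of each treatment group. Convex hulls around a class of points could then be computed with, for example, `scipy.spatial.ConvexHull`.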