Algorithm 1: Pseudo code for computing the AUC-PR based on the continuous interpolation.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

Initially, we choose the classification threshold such that the number of true positives is equal to the total number of positives. Then we iterate as long as the number of true positives, and hence recall, is greater than . We determine the new point by choosing the next existing score as classification threshold. Unless this threshold leads to an identical number of true positives, we compute the values of , , and as defined by equation (6), and set the borders of the integration. We use these values to compute the AUC between the current points and , and proceed with the while-loop. After termination of the loop, holds the AUC-PR.
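The loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the function name `auc_pr_continuous` is hypothetical, the data are unweighted, a data point counts as predicted positive when its score is at least the threshold, and the sketch walks from low to high recall rather than from full recall downward (the area is the same). Between two neighbouring points it assumes that false positives grow linearly with true positives, which gives precision p(r) = r / (a·r + b) and a closed-form integral. Equation (6) itself is not reproduced in this listing, so the constants `a` and `b` are this sketch's own notation.

```python
import math


def auc_pr_continuous(scores, labels):
    """AUC-PR with a non-linear (continuous) interpolation between PR points.

    Assumption: labels are 0/1 and a point is predicted positive when its
    score is >= the current threshold.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive data point is required")

    # Visit each distinct score as a threshold, from highest to lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    auc = 0.0
    tp_a = fp_a = 0  # counts at the previous point A
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        tp_b, fp_b = tp_a, fp_a  # counts at the new point B
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp_b += 1
            else:
                fp_b += 1
            i += 1

        # An identical number of true positives adds no area (recall unchanged).
        if tp_b > tp_a:
            h = (fp_b - fp_a) / (tp_b - tp_a)  # false positives per true positive
            a = 1.0 + h
            b = (fp_a - h * tp_a) / total_pos
            r_a, r_b = tp_a / total_pos, tp_b / total_pos
            if math.isclose(b, 0.0, abs_tol=1e-12):
                # Precision is constant (1 / a) on this segment.
                auc += (r_b - r_a) / a
            else:
                # Integral of r / (a*r + b) from r_a to r_b.
                ratio = (a * r_b + b) / (a * r_a + b)
                auc += (r_b - r_a - (b / a) * math.log(ratio)) / a
        tp_a, fp_a = tp_b, fp_b
    return auc
```

For a perfectly separating classifier, for example `auc_pr_continuous([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])`, every segment has constant precision 1 and the result is 1.0.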

The Francis Crick Institute

PR and ROC curves and respective AUC values for weighted and unweighted data.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

Panel (a) shows a histogram of foreground weights () for all data points. The dashed line indicates the threshold used to separate foreground and background data points in the unweighted case. Panel (b) presents a histogram of classification scores. Within the bars of the histogram, we visualize the number of data points from the foreground (green) and background (red) class according to the unweighted case. Panel (c) presents classification performance using unweighted data computed from the classification scores presented in panel (b). Panel (d) visualizes the relationship between classification scores and weights for the hypothetical good, permuted, and bad classifiers. All three orderings of classification scores share the same underlying distribution as shown in panel (b). Panel (e) shows the clearly distinguishable classification performance of the three classifiers as measured by ROC and PR curves using weighted data. The corresponding AUC values are listed in panel (f).

The Francis Crick Institute

Precision recall curves for data set with 100 data points and class ratio 1 to 4.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

The blue and the red curves indicate estimators of the best and the worst curve, respectively. The gray curves represent 1,000 PR curves based on random score-based classifications, which are also summarized by the green boxplots. The pink dashed line indicates the level of the class ratio .
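The random-classifier baseline in this caption can be approximated with a short simulation. The sketch below is an assumption-laden illustration, not the authors' procedure: it reads "class ratio 1 to 4" as 20 foreground and 80 background points, summarizes each random ranking by the step-wise average precision instead of a full PR curve, and uses the hypothetical names `average_precision` and `random_baseline`.

```python
import random


def average_precision(labels_ranked):
    """Step-wise average precision for 0/1 labels sorted by decreasing score."""
    tp = 0
    total = 0.0
    for k, label in enumerate(labels_ranked, start=1):
        if label:
            tp += 1
            total += tp / k  # precision at the rank of each positive
    return total / tp


def random_baseline(n_pos=20, n_neg=80, n_runs=1000, seed=0):
    """Mean average precision over random orderings of the labels."""
    rng = random.Random(seed)
    labels = [1] * n_pos + [0] * n_neg
    values = []
    for _ in range(n_runs):
        rng.shuffle(labels)  # a random ranking stands in for random scores
        values.append(average_precision(labels))
    return sum(values) / n_runs


mean_ap = random_baseline()
print(f"mean average precision over random rankings: {mean_ap:.3f}")
```

With these assumed class sizes, the mean lies near the foreground fraction of 0.2, and for a finite sample the step-wise estimate tends to sit somewhat above it.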

The Francis Crick Institute

Mean results for AUC-ROC and AUC-PR on PBM data sets using unweighted or weighted test data.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

The team name and the ranking are depicted on the abscissa, while the mean result for AUC-ROC and AUC-PR is depicted on the ordinate. Teams are displayed in the order of the original ranking of Weirauch et al. [27].

The Francis Crick Institute

Comparison of PR curves using unweighted and weighted test data for one exemplary data set (11) of [27].

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

In panel (a), we plot the predicted log-intensity values of classifiers A, D, and E against the measured log-intensity values. Panel (b) visualizes the class border in the unweighted case (red line) and the weights of the foreground class () in the weighted case. In panel (c), we show the PR curves of the three classifiers using unweighted (left) and weighted (right) test data.

The Francis Crick Institute

Differences of AUC-PR between the interpolations for varying size of the foreground data set.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

Panel (a) depicts the results for 10 bins, equivalent to at most 10 different classification scores, whereas panel (b) depicts the results for 1,000 bins.

The Francis Crick Institute

Comparison of ranking classifiers by AUC-PR using unweighted and weighted test data for query 29 from [18].

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

The AUC-PR for unweighted test data is depicted in black, whereas the AUC-PR for weighted test data is depicted in red.

The Francis Crick Institute

Comparison of AUC-PR values for different classification thresholds.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

In panel (a), we consider unweighted test data and plot the AUC-PR values for a threshold of mean intensity plus four times standard deviation (ordinate) against the AUC-PR values for a threshold of mean intensity plus four times standard deviation (abscissa). In panel (b), we consider weighted test data and plot the AUC-PR values in analogy to panel (a). We find a substantially greater Pearson correlation between the AUC-PR values for the different thresholds for weighted data compared to unweighted data.

The Francis Crick Institute

Binary confusion matrix.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

The confusion matrix can be computed for weighted and unweighted data. For unweighted data, each data point contributes with a weight of one, whereas for weighted data, each data point contributes with its specific weight for the given class.
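A small Python sketch can make the two cases concrete. It rests on assumptions that this caption does not state: each data point carries a foreground weight `w` in [0, 1] and a background weight of `1 - w`, a point is predicted positive when its score is at least the threshold, and the function name `confusion_matrix`, the scores, the weights, and the threshold are all hypothetical illustration values. Unweighted data then correspond to weights of exactly 0 or 1.

```python
def confusion_matrix(scores, fg_weights, threshold):
    """Weighted binary confusion matrix.

    Assumption: fg_weights[i] is the foreground weight of point i and
    1 - fg_weights[i] is its background weight.
    """
    tp = fp = fn = tn = 0.0
    for score, w in zip(scores, fg_weights):
        if score >= threshold:  # predicted foreground
            tp += w
            fp += 1.0 - w
        else:  # predicted background
            fn += w
            tn += 1.0 - w
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


scores = [2.0, 1.8, 1.2, 0.4]
hard_labels = [1.0, 0.0, 1.0, 0.0]    # unweighted: each point counts once
soft_weights = [0.75, 0.5, 0.25, 0.0]  # weighted: points split across classes

unweighted = confusion_matrix(scores, hard_labels, threshold=1.5)
weighted = confusion_matrix(scores, soft_weights, threshold=1.5)
print(unweighted)
print(weighted)
```

In both cases the four entries sum to the number of data points, since every point distributes a total weight of one across the two classes.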

The Francis Crick Institute

Classification for unweighted and weighted data.

Author: Ivo Grosse (19487)
Jan Grau (19484)
Jens Keilwagen (230589)

The entries of a confusion matrix have been calculated for a classification threshold of 1.5. In the case of unweighted data, the class label is if and otherwise .

The Francis Crick Institute