Surrogate regret bounds for generalized classification performance metrics
We consider optimization of generalized performance metrics for binary
classification by means of surrogate losses. We focus on a class of metrics,
which are linear-fractional functions of the false positive and false negative
rates (examples of which include the F-measure, Jaccard similarity
coefficient, AM measure, and many others). Our analysis concerns the following
two-step procedure. First, a real-valued function f is learned by minimizing
a surrogate loss for binary classification on the training sample. It is
assumed that the surrogate loss is a strongly proper composite loss function
(examples of which include logistic loss, squared-error loss, exponential loss,
etc.). Then, given f, a threshold is tuned on a separate
validation sample, by direct optimization of the target performance metric. We
show that the regret of the resulting classifier (obtained by thresholding
f at the tuned threshold), measured with respect to the target metric, is
upper-bounded by the regret of f measured with respect to the surrogate loss.
We also extend our results to cover multilabel classification and provide
regret bounds for micro- and macro-averaging measures. Our findings are further
analyzed in a computational study on both synthetic and real data sets.
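The two-step procedure can be sketched in a few lines. This is a minimal illustration, not the paper's experimental code: the synthetic data, the gradient-descent fit, and the F1 tuning loop are all assumptions made for the example. Step 1 minimizes the logistic loss (a strongly proper composite surrogate) to obtain a real-valued scorer f; step 2 tunes the threshold on a held-out validation sample by directly maximizing the target metric, here the F-measure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary data (illustrative setup): the label depends
# on the first feature plus Gaussian noise.
n = 2000
X = rng.normal(size=(n, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(float)
train, val = slice(0, 1000), slice(1000, 2000)

# Step 1: learn a real-valued scorer f by minimizing the logistic
# (strongly proper composite) surrogate loss via gradient descent.
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X[train] @ w)))
    w -= 0.1 * X[train].T @ (p - y[train]) / 1000

f_val = X[val] @ w  # scores of f on the separate validation sample

# Step 2: tune the threshold by direct optimization of the target
# metric (here F1, a linear-fractional function of FP and FN rates).
def f1_at(thresh):
    pred = (f_val >= thresh).astype(float)
    tp = np.sum(pred * y[val])
    fp = np.sum(pred * (1 - y[val]))
    fn = np.sum((1 - pred) * y[val])
    return 2 * tp / (2 * tp + fp + fn) if tp > 0 else 0.0

best_thresh = max(np.unique(f_val), key=f1_at)
print(f"tuned threshold: {best_thresh:.3f}, validation F1: {f1_at(best_thresh):.3f}")
```

The regret bound says the F1 regret of the thresholded classifier is controlled by the surrogate-loss regret of f, so driving down the logistic loss in step 1 is, up to the bound, enough for the target metric after the cheap threshold search in step 2.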
RankSEG: A Consistent Ranking-based Framework for Segmentation
Segmentation has emerged as a fundamental task in computer vision and
natural language processing: it assigns a label to every pixel/feature to
extract regions of interest from an image/text. To evaluate the performance of
segmentation, the Dice and IoU metrics are used to measure the degree of
overlap between the ground truth and the predicted segmentation. In this paper,
we establish a theoretical foundation of segmentation with respect to the
Dice/IoU metrics, including the Bayes rule and Dice-/IoU-calibration, analogous
to classification-calibration or Fisher consistency in classification. We prove
that the existing thresholding-based framework with most operating losses is
not consistent with respect to the Dice/IoU metrics, and thus may lead to a
suboptimal solution. To address this pitfall, we propose a novel consistent
ranking-based framework, namely RankDice/RankIoU, inspired by plug-in rules of
the Bayes segmentation rule. Three numerical algorithms with GPU parallel
execution are developed to implement the proposed framework in large-scale and
high-dimensional segmentation. We study statistical properties of the proposed
framework: we show that it is Dice-/IoU-calibrated, and we establish its
excess risk bounds and rate of convergence. The numerical effectiveness of
RankDice/mRankDice is demonstrated in various simulated examples and
the fine-annotated Cityscapes, Pascal VOC, and Kvasir-SEG datasets with
state-of-the-art deep learning architectures.
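The gap between thresholding and ranking can be seen in a toy sketch. This is not the paper's RankDice algorithm: `rank_select` below uses a crude mean-field approximation of the expected Dice (keep the top-k pixels maximizing 2·Σ_top-k p / (Σ p + k)), and the calibrated synthetic posteriors are an assumed setup chosen to put the problem in a low-prevalence regime, where thresholding at 0.5 is suboptimal for Dice.

```python
import numpy as np

def dice(pred, truth):
    # Dice coefficient: 2|A ∩ B| / (|A| + |B|)
    s = pred.sum() + truth.sum()
    return 2 * np.sum(pred & truth) / s if s else 1.0

def iou(pred, truth):
    # IoU: |A ∩ B| / |A ∪ B|
    union = np.sum(pred | truth)
    return np.sum(pred & truth) / union if union else 1.0

def rank_select(probs):
    # Ranking-based selection (simplified mean-field sketch of the
    # plug-in idea): sort pixels by predicted probability and keep
    # the top-k maximizing an approximation of the expected Dice,
    #   2 * sum_{top-k} p / (sum p + k).
    order = np.argsort(-probs)
    csum = np.cumsum(probs[order])
    ks = np.arange(1, probs.size + 1)
    scores = 2 * csum / (probs.sum() + ks)
    k = int(ks[np.argmax(scores)])
    mask = np.zeros(probs.size, dtype=bool)
    mask[order[:k]] = True
    return mask

# Calibrated synthetic posteriors with low foreground prevalence.
rng = np.random.default_rng(1)
probs = 0.6 * rng.random(10_000) ** 5
truth = rng.random(10_000) < probs

thresh_mask = probs >= 0.5
rank_mask = rank_select(probs)
print(f"thresholding: Dice={dice(thresh_mask, truth):.3f}, IoU={iou(thresh_mask, truth):.3f}")
print(f"ranking:      Dice={dice(rank_mask, truth):.3f}, IoU={iou(rank_mask, truth):.3f}")
```

Because foreground pixels are rare, the 0.5 threshold keeps too few of them, while the ranking rule adapts the effective cutoff to the metric, which is the intuition behind the Dice-/IoU-calibration results above.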