Designing Consistent and Convex Surrogates for General Prediction Tasks
Supervised machine learning algorithms are often predicated on the minimization of loss functions that measure the error of a given prediction against a ground-truth label. The choice of loss function to minimize corresponds to a summary statistic of the underlying data distribution that is learned in this process. Historically, loss function design has been ad hoc and often yields losses that are not statistically consistent with respect to the target prediction task. This work focuses on the design of losses that are simultaneously convex, consistent with respect to a target prediction task, and efficient in the dimension of the prediction space. We provide frameworks to construct such losses in both discrete prediction and continuous estimation settings, as well as tools to lower bound the prediction dimension for certain classes of consistent convex losses. We apply our results throughout to understand prediction tasks such as high-confidence classification, top-k prediction, variance estimation, conditional value at risk, and ratios of expectations.
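To make the consistency question concrete (an illustrative example of mine, not taken from the paper), the hinge loss is a classic convex surrogate for the 0-1 loss in binary classification: minimizing its expected value over a scalar score recovers the Bayes-optimal label. A minimal Python sketch:

```python
# Illustrative sketch, not the paper's framework: the hinge loss as a
# consistent convex surrogate for the 0-1 loss in binary classification.
# Minimizing the expected hinge loss over a scalar score f recovers the
# Bayes-optimal label sign(2p - 1), where p = P(Y = +1).
import numpy as np

def expected_hinge(f: float, p: float) -> float:
    """Expected hinge loss of score f when P(Y = +1) = p."""
    return p * max(0.0, 1.0 - f) + (1.0 - p) * max(0.0, 1.0 + f)

for p in (0.2, 0.7, 0.9):
    scores = np.linspace(-2.0, 2.0, 2001)
    best = scores[np.argmin([expected_hinge(f, p) for f in scores])]
    print(f"p={p:.1f}  minimizing score={best:+.2f}  Bayes label={np.sign(2*p - 1):+.0f}")
```

The minimizing score always shares the sign of 2p - 1, which is the kind of consistency the abstract refers to, while the surrogate itself remains convex and easy to optimize.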
Jaccard Metric Losses: Optimizing the Jaccard Index with Soft Labels
IoU losses are surrogates that directly optimize the Jaccard index. In
semantic segmentation, incorporating an IoU loss into the training objective
has been shown to yield better Jaccard index scores than optimizing
pixel-wise losses such as the cross-entropy loss alone. The most notable IoU
losses are the soft Jaccard loss and the Lovász-Softmax loss. However, these
losses are incompatible with soft labels, which are ubiquitous in machine
learning. In this paper, we propose Jaccard metric losses (JMLs), which
are identical to the soft Jaccard loss in a standard setting with hard labels,
but are compatible with soft labels. With JMLs, we study two of the most
popular use cases of soft labels: label smoothing and knowledge distillation.
With a variety of architectures, our experiments show significant improvements
over the cross-entropy loss on three semantic segmentation datasets
(Cityscapes, PASCAL VOC and DeepGlobe Land), and our simple approach
outperforms state-of-the-art knowledge distillation methods by a large margin.
Code is available at https://github.com/zifuwanggg/JDTLosses. (Submitted to ICML 2023.)
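For reference, here is a minimal sketch of the standard soft Jaccard loss mentioned above (an illustration of that baseline, not the paper's JML construction): the set cardinalities in the Jaccard index are replaced by sums of predicted probabilities.

```python
# Minimal numpy sketch of the standard soft Jaccard (IoU) loss for one
# class; an illustration, not the paper's JML code. Predictions are
# assumed to be probabilities in [0, 1]. JMLs coincide with this loss
# for hard labels but remain well-defined for soft labels.
import numpy as np

def soft_jaccard_loss(probs: np.ndarray, labels: np.ndarray,
                      eps: float = 1e-7) -> float:
    """1 - soft IoU between a probability map and a label map."""
    intersection = np.sum(probs * labels)
    union = np.sum(probs) + np.sum(labels) - intersection
    return 1.0 - (intersection + eps) / (union + eps)

# With hard labels, the loss is one minus the soft IoU of the prediction.
probs = np.array([0.9, 0.8, 0.1, 0.2])
labels = np.array([1.0, 1.0, 0.0, 0.0])
print(round(soft_jaccard_loss(probs, labels), 3))  # ~0.261
```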
Rank-based Decomposable Losses in Machine Learning: A Survey
Recent works have revealed an essential paradigm in loss function design that
distinguishes individual losses from aggregate losses. The individual loss
measures the quality of the model on a single sample, while the aggregate
loss combines individual losses/scores over the training samples. Both share
a common procedure that aggregates a set of individual values into a single
numerical value. The ranking order reflects the most fundamental relation
among individual values in designing losses. In addition, decomposability,
whereby a loss can be decomposed into an ensemble of individual terms, is an
important property for organizing losses/scores. This survey provides a
systematic and comprehensive review of rank-based decomposable losses in
machine learning. Specifically, we provide a new taxonomy of loss functions
that follows the perspectives of aggregate loss and individual loss. We
identify the aggregators that form such losses, which are examples of set
functions. We organize the rank-based decomposable losses into eight
categories. Following these categories, we review the literature on rank-based
aggregate losses and rank-based individual losses. We describe general formulas
for these losses and connect them with existing research topics. We also
suggest future research directions spanning unexplored, remaining, and emerging
issues in rank-based decomposable losses. (Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence, TPAMI.)
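As a quick illustration of the aggregate-loss perspective described above (my own example, not drawn from the survey), the average loss, the maximum loss, and the average top-k loss are all rank-based aggregators applied to the same vector of individual per-sample losses:

```python
# Illustrative sketch (not from the survey): three rank-based aggregate
# losses built from one vector of individual per-sample losses. They
# differ only in how the sorted individual values are aggregated.
import numpy as np

def average_loss(losses: np.ndarray) -> float:
    return float(np.mean(losses))

def max_loss(losses: np.ndarray) -> float:
    return float(np.max(losses))

def average_top_k_loss(losses: np.ndarray, k: int) -> float:
    """Mean of the k largest individual losses (a rank-based aggregator)."""
    return float(np.mean(np.sort(losses)[-k:]))

individual = np.array([0.1, 0.4, 0.05, 2.0, 0.3])        # per-sample losses
print(round(average_loss(individual), 3))                 # 0.57 (k = n case)
print(round(max_loss(individual), 3))                     # 2.0  (k = 1 case)
print(round(average_top_k_loss(individual, k=2), 3))      # 1.2
```

The average top-k loss interpolates between the average loss (k = n) and the maximum loss (k = 1), which is one way rank-based aggregators trade off average-case and worst-case behavior.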