Jaccard Metric Losses: Optimizing the Jaccard Index with Soft Labels
IoU losses are surrogates that directly optimize the Jaccard index. In
semantic segmentation, adding an IoU loss to the training objective has been
shown to yield better Jaccard index scores than optimizing pixel-wise losses
such as the cross-entropy loss alone. The most notable IoU losses are the
soft Jaccard loss and the Lovász-Softmax loss. However, these losses are
incompatible with soft labels, which are ubiquitous in
machine learning. In this paper, we propose Jaccard metric losses (JMLs), which
are identical to the soft Jaccard loss in a standard setting with hard labels,
but are compatible with soft labels. With JMLs, we study two of the most
popular use cases of soft labels: label smoothing and knowledge distillation.
With a variety of architectures, our experiments show significant improvements
over the cross-entropy loss on three semantic segmentation datasets
(Cityscapes, PASCAL VOC and DeepGlobe Land), and our simple approach
outperforms state-of-the-art knowledge distillation methods by a large margin.
Code is available at:
\href{https://github.com/zifuwanggg/JDTLosses}{https://github.com/zifuwanggg/JDTLosses}.
Comment: Submitted to ICML 2023.
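A minimal sketch of one form such a loss could take. The abstract does not give the formula, so the L1-based expression below is an assumption (and the function name is illustrative): it computes a Jaccard score from L1 norms, which coincides with the usual soft Jaccard ratio (intersection over union) whenever the target labels are hard 0/1 values, yet remains well defined when the targets are soft probabilities.

```python
def jaccard_metric_loss(pred, target, eps=1e-7):
    """Illustrative L1-based Jaccard loss (not the paper's released code).

    pred, target: flat sequences of per-pixel probabilities in [0, 1].
    With hard (0/1) targets this reduces to 1 - intersection/union,
    i.e. the standard soft Jaccard loss; soft targets are also accepted.
    """
    l1_pred = sum(pred)
    l1_target = sum(target)
    l1_diff = sum(abs(p - t) for p, t in zip(pred, target))
    jaccard = (l1_pred + l1_target - l1_diff) / (l1_pred + l1_target + l1_diff + eps)
    return 1.0 - jaccard
```

For hard targets, `l1_pred + l1_target - l1_diff` equals twice the soft intersection and the denominator equals twice the soft union, so the ratio matches the classic soft Jaccard index; the eps term only guards against an all-empty prediction and target.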
Fast Single-Class Classification and the Principle of Logit Separation
We consider neural network training in applications with many possible
classes, where the test-time task is binary classification: deciding whether
a given example belongs to a specific class, and the class of interest may
differ each time the classifier is applied. For
instance, this is the case for real-time image search. We define the Single
Logit Classification (SLC) task: training the network so that at test-time, it
would be possible to accurately identify whether the example belongs to a given
class in a computationally efficient manner, based only on the output logit for
this class. We propose a natural principle, the Principle of Logit Separation,
as a guideline for choosing and designing losses suitable for the SLC. We show
that the cross-entropy loss function is not aligned with the Principle of Logit
Separation. In contrast, there are known loss functions, as well as novel batch
loss functions that we propose, which are aligned with this principle. In
total, we study seven loss functions. Our experiments show that indeed in
almost all cases, losses that are aligned with the Principle of Logit
Separation obtain at least 20% relative accuracy improvement in the SLC task
compared to losses that are not aligned with it, and sometimes considerably
more. Furthermore, we show that fast SLC does not cause any drop in binary
classification accuracy, compared to standard classification in which all
logits are computed, and yields a speedup which grows with the number of
classes. For instance, we demonstrate a 10x speedup when the number of classes
is 400,000. TensorFlow code for optimizing the new batch losses is publicly
available at https://github.com/cruvadom/Logit_Separation.
Comment: Published as a conference paper at ICDM 201
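The source of the speedup can be sketched concretely. With a linear output layer, computing all K logits costs O(Kd) per example, while the SLC setup needs only the one dot product for the class of interest, O(d). The snippet below is an illustration of that idea, not the paper's code; the sigmoid-plus-threshold decision rule and all names are assumptions.

```python
import math

def slc_score(features, weight_c, bias_c):
    """Test-time single-logit check (illustrative, not the paper's code).

    Computes only the logit for the class of interest from its weight row
    and bias, then squashes it with a sigmoid -- O(d) work regardless of
    how many classes the network was trained with.
    """
    logit = sum(f * w for f, w in zip(features, weight_c)) + bias_c
    return 1.0 / (1.0 + math.exp(-logit))

def slc_predict(features, weight_c, bias_c, threshold=0.5):
    """Binary decision: does the example belong to the class of interest?"""
    return slc_score(features, weight_c, bias_c) >= threshold
```

This only yields a meaningful probability if training made each logit individually calibrated, which is exactly what the Principle of Logit Separation asks of the loss: standard cross-entropy only constrains logits relative to one another, so a single logit in isolation carries no reliable signal.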