Class Proportion Estimation with Application to Multiclass Anomaly Rejection
This work addresses two classification problems that fall under the heading
of domain adaptation, wherein the distributions of training and testing
examples differ. The first problem studied is that of class proportion
estimation, which is the problem of estimating the class proportions in an
unlabeled testing data set given labeled examples of each class. Compared to
previous work on this problem, our approach has the novel feature that it does
not require labeled training data from one of the classes. This property allows
us to address the second domain adaptation problem, namely, multiclass anomaly
rejection. Here, the goal is to design a classifier that has the option of
assigning a "reject" label, indicating that the instance did not arise from a
class present in the training data. We establish consistent learning strategies
for both of these domain adaptation problems, which to our knowledge are the
first of their kind. We also implement the class proportion estimation
technique and demonstrate its performance on several benchmark data sets.
Comment: Accepted to AISTATS 2014. 15 pages. 2 figure
Convex Calibration Dimension for Multiclass Loss Matrices
We study consistency properties of surrogate loss functions for general
multiclass learning problems, defined by a general multiclass loss matrix. We
extend the notion of classification calibration, which has been studied for
binary and multiclass 0-1 classification problems (and for certain other
specific learning problems), to the general multiclass setting, and derive
necessary and sufficient conditions for a surrogate loss to be calibrated with
respect to a loss matrix in this setting. We then introduce the notion of
convex calibration dimension of a multiclass loss matrix, which measures the
smallest 'size' of a prediction space in which it is possible to design a
convex surrogate that is calibrated with respect to the loss matrix. We derive
both upper and lower bounds on this quantity, and use these results to analyze
various loss matrices. In particular, we apply our framework to study various
subset ranking losses, and use the convex calibration dimension as a tool to
show both the existence and non-existence of various types of convex calibrated
surrogates for these losses. Our results strengthen recent results of Duchi et
al. (2010) and Calauzenes et al. (2012) on the non-existence of certain types
of convex calibrated surrogates in subset ranking. We anticipate the convex
calibration dimension may prove to be a useful tool in the study and design of
surrogate losses for general multiclass learning problems.
Comment: Accepted to JMLR, pending editin
Least Ambiguous Set-Valued Classifiers with Bounded Error Levels
In most classification tasks there are observations that are ambiguous and
therefore difficult to correctly label. Set-valued classifiers output sets of
plausible labels rather than a single label, thereby giving a more appropriate
and informative treatment to the labeling of ambiguous instances. We introduce
a framework for multiclass set-valued classification, where the classifiers
guarantee user-defined levels of coverage or confidence (the probability that
the true label is contained in the set) while minimizing the ambiguity (the
expected size of the output). We first derive oracle classifiers assuming the
true distribution to be known. We show that the oracle classifiers are obtained
from level sets of the functions that define the conditional probability of
each class. Then we develop estimators with good asymptotic and finite sample
properties. The proposed estimators build on existing single-label classifiers.
The optimal classifier can sometimes output the empty set, but we provide two
solutions to fix this issue that are suitable for various practical needs.
Comment: Final version to be published in the Journal of the American
Statistical Association at
https://www.tandfonline.com/doi/abs/10.1080/01621459.2017.1395341?journalCode=uasa2
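
The level-set idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes class-probability estimates from some existing single-label classifier are already available, and the function names `fit_threshold` and `predict_sets` are hypothetical. One threshold is chosen on held-out data so that at least a fraction 1 - alpha of the true labels clear it, and each prediction set is the level set of the estimated probabilities at that threshold.

```python
import numpy as np

def fit_threshold(probs_cal, labels_cal, alpha):
    """Choose t so that at least (1 - alpha) of held-out true labels have p_y(x) >= t.

    probs_cal: (n, K) array of estimated class probabilities (assumed given).
    labels_cal: length-n integer array of true labels in {0, ..., K-1}.
    """
    n = len(labels_cal)
    # Estimated probability assigned to the true class of each held-out point.
    true_class_probs = probs_cal[np.arange(n), labels_cal]
    # Dropping the k smallest scores leaves n - k >= (1 - alpha) * n covered points.
    k = int(np.floor(alpha * n))
    return np.sort(true_class_probs)[k]

def predict_sets(probs, t):
    """Return, for each row, the labels whose estimated probability is at least t."""
    return [np.flatnonzero(row >= t).tolist() for row in probs]

# Toy data, invented purely for illustration.
probs_cal = np.array([
    [0.9, 0.1, 0.0],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.4, 0.4, 0.2],
    [0.1, 0.2, 0.7],
])
labels_cal = np.array([0, 0, 1, 1, 2])

t = fit_threshold(probs_cal, labels_cal, alpha=0.2)
sets = predict_sets(probs_cal, t)
coverage = np.mean([y in s for y, s in zip(labels_cal, sets)])
ambiguity = np.mean([len(s) for s in sets])
```

On this toy data the fourth point receives an empty set, because no class reaches the threshold. That is the empty-set behavior the abstract mentions. A lower threshold trades larger sets (more ambiguity) for higher coverage.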