AdaBoost is well known for its simplicity and effectiveness in classification,
but it suffers from overfitting when the class-conditional distributions
overlap significantly. Moreover, it is very sensitive to label noise. This
article tackles both limitations simultaneously by optimizing a modified loss
function (i.e., the conditional risk). The proposed
approach has two advantages. (1) It directly takes label uncertainty into
account through an associated label confidence. (2) It
introduces a "trustworthiness" measure on training samples via the Bayesian
risk rule, and hence the resulting classifier tends to have better
finite-sample performance than the original AdaBoost when there is a
large overlap between class-conditional distributions. Theoretical properties
of the proposed method are investigated. Extensive experimental results using
synthetic data and real-world data sets from the UCI Machine Learning Repository
are provided. The empirical study shows that the proposed method is highly
competitive in prediction accuracy and robustness when compared with the
original AdaBoost and several existing robust AdaBoost algorithms.

Comment: 14 pages, 2 figures and 5 tables