Projected Estimators for Robust Semi-supervised Classification
For semi-supervised techniques to be applied safely in practice, we at least
want methods to outperform their supervised counterparts. We study this
question for classification using the well-known quadratic surrogate loss
function. Using a projection of the supervised estimate onto a set of
constraints imposed by the unlabeled data, we find we can safely improve over
the supervised solution in terms of this quadratic loss. Unlike other
approaches to semi-supervised learning, the procedure does not rely on
assumptions that are not intrinsic to the classifier at hand. It is
theoretically demonstrated that, measured on the labeled and unlabeled training
data, this semi-supervised procedure never gives a higher quadratic loss than
the supervised alternative. To our knowledge, this is the first approach that
offers such strong, albeit conservative, guarantees for improvement over the
supervised solution. The characteristics of our approach are explicated using
benchmark datasets to further understand the similarities and differences
between the quadratic loss criterion used in the theoretical results and the
classification accuracy often considered in practice.
Comment: 13 pages, 2 figures, 1 table
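The projection step lends itself to a compact illustration. The sketch below is a hedged reconstruction, not the authors' exact procedure: it assumes a least-squares (quadratic-loss) classifier, encodes the constraint set from the unlabeled data as soft labels in [-1, 1], and uses a plain Euclidean projection computed by projected gradient descent. The function name projected_estimator and all parameters are illustrative.

```python
# Minimal sketch: project the supervised least-squares solution onto the
# set of solutions obtainable under some soft labeling of the unlabeled
# points. Euclidean projection is an assumption made for simplicity.
import numpy as np

def projected_estimator(X_lab, y_lab, X_unl, n_steps=500):
    X_all = np.vstack([X_lab, X_unl])
    A = np.linalg.pinv(X_all)                 # maps responses to coefficients
    A_lab, A_unl = A[:, :len(y_lab)], A[:, len(y_lab):]
    w_sup = np.linalg.pinv(X_lab) @ y_lab     # supervised estimate
    q = np.zeros(X_unl.shape[0])              # soft labels in [-1, 1]
    lr = 0.5 / (np.linalg.norm(A_unl, 2) ** 2 + 1e-12)  # safe step size
    for _ in range(n_steps):
        # w(q) = A_lab @ y_lab + A_unl @ q is linear in q, so the squared
        # distance to w_sup is a convex quadratic in q.
        grad = 2 * A_unl.T @ (A_lab @ y_lab + A_unl @ q - w_sup)
        q = np.clip(q - lr * grad, -1.0, 1.0)  # projected gradient step
    return A_lab @ y_lab + A_unl @ q           # projected estimator
```

In the paper itself the projection is taken with respect to a data-dependent inner product, and the guarantee concerns the quadratic loss on the labeled and unlabeled training data; the Euclidean version above only mimics the mechanics.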
Discussion on the paper: Hypotheses testing by convex optimization by Goldenshluger, Juditsky and Nemirovski
We briefly discuss some interesting questions related to the paper
"Hypotheses testing by convex optimization" by Goldenshluger, Juditsky and
Nemirovski.
Comment: To appear in the EJ
Residual Weighted Learning for Estimating Individualized Treatment Rules
Personalized medicine has received increasing attention among statisticians,
computer scientists, and clinical practitioners. A major component of
personalized medicine is the estimation of individualized treatment rules
(ITRs). Recently, Zhao et al. (2012) proposed outcome weighted learning (OWL)
to construct ITRs that directly optimize the clinical outcome. Although OWL
opens the door to introducing machine learning techniques to optimal treatment
regimes, its performance in finite samples leaves room for improvement. In this
article, we propose a general framework, called Residual Weighted Learning
(RWL), to improve finite-sample performance. Unlike OWL, which weights
misclassification errors by
clinical outcomes, RWL weights these errors by residuals of the outcome from a
regression fit on clinical covariates excluding treatment assignment. We
utilize the smoothed ramp loss function in RWL, and provide a difference of
convex (d.c.) algorithm to solve the corresponding non-convex optimization
problem. By estimating residuals with linear models or generalized linear
models, RWL can effectively deal with different types of outcomes, such as
continuous, binary and count outcomes. We also propose variable selection
methods for linear and nonlinear rules to further improve performance. We show
that the resulting estimator of the treatment rule is
consistent. We further obtain a rate of convergence for the difference between
the expected outcome using the estimated ITR and that of the optimal treatment
rule. The performance of the proposed RWL methods is illustrated in simulation
studies and in an analysis of cystic fibrosis clinical trial data.
Comment: 48 pages, 3 figures
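The core residual-weighting step admits a short sketch. The Python fragment below is a hedged illustration under assumptions not in the abstract: a randomized trial with a known, constant propensity score, and an off-the-shelf weighted hinge loss (scikit-learn's LinearSVC) substituted for the paper's smoothed ramp loss and d.c. algorithm. The name rwl_sketch and its arguments are hypothetical.

```python
# Sketch of residual weighting for ITR estimation; the convex hinge loss
# stands in for the paper's smoothed ramp loss.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import LinearSVC

def rwl_sketch(X, A, Y, propensity=0.5):
    """X: covariates (n, d); A: treatment in {-1, +1}; Y: outcome."""
    # Residuals of the outcome from a regression on covariates only;
    # treatment assignment is deliberately excluded from this fit.
    resid = Y - LinearRegression().fit(X, Y).predict(X)
    # A negative residual flips which arm the point "votes" for; the
    # weight is the residual magnitude over the propensity score.
    labels = A * np.sign(resid)
    weights = np.abs(resid) / propensity
    keep = resid != 0                      # drop zero-residual ties
    clf = LinearSVC().fit(X[keep], labels[keep], sample_weight=weights[keep])
    return clf                             # treat iff clf.predict(x) == +1
```

In the paper, the smoothed ramp loss bounds the influence of extreme residuals and the d.c. algorithm handles the resulting non-convexity; the convex surrogate above gives up that robustness but keeps the weighting scheme visible.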