Hypothesis pruning maximizes the hypothesis updates in active learning to
find the desired unlabeled data. An inherent assumption is that this learning
manner can derive those updates into the optimal hypothesis. However, its
convergence may not be guaranteed well if those incremental updates are
negative and disordered. In this paper, we introduce a black-box teaching
hypothesis $h^{\mathcal{T}}$ employing a tighter slack term
$\left(1+F^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$ to replace
the typical $2\Delta_t$ for pruning. Theoretically, we prove that, under the
guidance of this teaching hypothesis, the learner can converge into tighter
generalization error and label complexity bounds than those non-educated
learners who do not receive any guidance from a teacher: 1) the generalization
error upper bound can be reduced from $R(h^*)+4\Delta_{T-1}$ to approximately
$R(h^{\mathcal{T}})+2\Delta_{T-1}$, and 2) the label complexity upper bound can
be decreased from $4\theta\left(TR(h^*)+2O(\sqrt{T})\right)$ to
approximately $2\theta\left(2TR(h^{\mathcal{T}})+3O(\sqrt{T})\right)$. To be
strict with our assumption, self-improvement of teaching is first proposed for
the case where $h^{\mathcal{T}}$ only loosely approximates $h^*$. In contrast
to learning, we further consider two teaching scenarios: teaching a white-box
learner and a black-box learner.
Experiments verify this idea and show better generalization performance than
fundamental active learning strategies such as IWAL and IWAL-D.
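To make the slack-term replacement concrete, the following is a minimal sketch of IWAL-style hypothesis pruning, not the paper's actual algorithm: hypotheses, the empirical errors, and the disagreement measure `disagreement` (standing in for $F^{\mathcal{T}}$, here taken as the fraction of points where a hypothesis disagrees with the teaching hypothesis) are all illustrative assumptions. Since $F^{\mathcal{T}}(\widehat{h}_t)\le 1$, the teaching slack $(1+F^{\mathcal{T}}(\widehat{h}_t))\Delta_t$ is never looser than the typical $2\Delta_t$, so the teaching-guided pruning keeps a subset of the hypotheses the typical rule keeps.

```python
def disagreement(h, h_teacher, xs):
    # Stand-in for F^T(h): fraction of points where h disagrees
    # with the teaching hypothesis (an assumption for illustration).
    return sum(h(x) != h_teacher(x) for x in xs) / len(xs)

def prune(hypotheses, xs, ys, delta_t, h_teacher=None):
    # Empirical error of each candidate hypothesis on the labeled data.
    errs = [sum(h(x) != y for x, y in zip(xs, ys)) / len(xs)
            for h in hypotheses]
    best = min(errs)
    kept = []
    for h, err in zip(hypotheses, errs):
        if h_teacher is None:
            slack = 2 * delta_t                      # typical slack term
        else:
            # teaching slack term (1 + F^T(h)) * delta_t, never looser
            slack = (1 + disagreement(h, h_teacher, xs)) * delta_t
        if err <= best + slack:
            kept.append(h)
    return kept
```

For example, with threshold classifiers on $[0,1]$ and a teaching hypothesis close to the labeling rule, the teaching slack prunes the far-off thresholds that the typical $2\Delta_t$ slack would still retain.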