Meta-learning algorithms for active learning are emerging as a promising
paradigm for learning the ``best'' active learning strategy. However, current
learning-based active learning approaches still require a substantial amount of
training data for their meta-learning models to generalize. This is
contrary to the nature of active learning which typically starts with a small
number of labeled samples. Without large amounts of labeled samples for
training, meta-learning models inevitably perform poorly (e.g., they become
unstable and overfit). In this paper, we tackle
these issues by proposing a novel learning-based active learning framework,
called Learning To Sample (LTS). This framework has two key components: a
sampling model and a boosting model, which learn from each other iteratively so
that each improves the other's performance. Within this framework,
the sampling model incorporates uncertainty sampling and diversity sampling
into a unified process for optimization, enabling us to actively select the
most representative and informative samples based on an optimized integration
of uncertainty and diversity. To evaluate the effectiveness of the LTS
framework, we have conducted extensive experiments on three different
classification tasks: image classification, salary level prediction, and entity
resolution. The experimental results show that our LTS framework significantly
outperforms all the baselines when the label budget is limited, especially for
datasets with highly imbalanced classes. In addition, our LTS framework
can effectively tackle the cold start problem occurring in many existing active
learning approaches.

Comment: Accepted by ICDM'1