Hierarchical Subquery Evaluation for Active Learning on a Graph
To train good supervised and semi-supervised object classifiers, it is
critical that we not waste the time of the human experts who are providing the
training labels. Existing active learning strategies can have uneven
performance, being efficient on some datasets but wasteful on others, or
inconsistent just between runs on the same dataset. We propose perplexity-based
graph construction and a new hierarchical subquery evaluation algorithm to
combat this variability, and to release the potential of Expected Error
Reduction.
Under some specific circumstances, Expected Error Reduction has been one of
the strongest-performing informativeness criteria for active learning. Until
now, it has also been prohibitively costly to compute for sizeable datasets. We
demonstrate our highly practical algorithm, comparing it to other active
learning measures on classification datasets that vary in sparsity,
dimensionality, and size. Our algorithm is consistent over multiple runs and
achieves high accuracy, while querying the human expert for labels at a
frequency that matches their desired time budget.
Comment: CVPR 2014
Black-Box Batch Active Learning for Regression
Batch active learning is a popular approach for efficiently training machine
learning models on large, initially unlabeled datasets by repeatedly acquiring
labels for batches of data points. However, many recent batch active learning
methods are white-box approaches and are often limited to differentiable
parametric models: they score unlabeled points using acquisition functions
based on model embeddings or first- and second-order derivatives. In this
paper, we propose black-box batch active learning for regression tasks as an
extension of white-box approaches. Crucially, our method only relies on model
predictions. This approach is compatible with a wide range of machine learning
models, including regular and Bayesian deep learning models and
non-differentiable models such as random forests. It is rooted in Bayesian
principles and utilizes recent kernel-based approaches. This allows us to
extend a wide range of existing state-of-the-art white-box batch active
learning methods (BADGE, BAIT, LCMD) to black-box models. We demonstrate the
effectiveness of our approach through extensive experimental evaluations on
regression datasets, achieving surprisingly strong performance compared to
white-box approaches for deep learning models.
Comment: 12 pages + 11 pages appendix