Obtaining a large-scale labeled object detection dataset can be costly and
time-consuming, as it involves annotating images with bounding boxes and class
labels. Thus, some specialized active learning methods have been proposed to
reduce the cost by selecting either coarse-grained samples or fine-grained
instances from unlabeled data for labeling. However, the former approaches
suffer from redundant labeling, while the latter methods generally lead to
training instability and sampling bias. To address these challenges, we propose
a novel approach called Multi-scale Region-based Active Learning (MuRAL) for
object detection. MuRAL identifies informative regions of various scales to
reduce annotation costs for well-learned objects and improve training
performance. The informative region score is designed to consider both the
predicted confidence of instances and the distribution of each object category,
enabling our method to focus more on difficult-to-detect classes. Moreover,
MuRAL employs a scale-aware selection strategy that ensures diverse regions are
selected from different scales for labeling and downstream finetuning, which
enhances training stability. Our proposed method surpasses all existing
coarse-grained and fine-grained baselines on the Cityscapes and MS COCO
datasets and demonstrates significant improvement on difficult categories.
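
To make the two ideas above concrete, the following is a minimal illustrative sketch, not the MuRAL implementation. It assumes a region score that sums, over the instances predicted in a region, the product of an uncertainty term (one minus predicted confidence) and a rarity term (one minus the class's relative frequency), and it assumes a simple round-robin over scales as the scale-aware selection. The function names, the scoring formula, the dictionary layout, and the round-robin rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def region_score(instances, class_freq):
    """Hypothetical informativeness score for one region.

    instances: list of (class_name, confidence) pairs predicted in the region.
    class_freq: dict mapping class_name to its relative frequency (assumed known).
    Low-confidence instances of rare classes contribute the most.
    """
    score = 0.0
    for cls, conf in instances:
        uncertainty = 1.0 - conf
        rarity = 1.0 - class_freq.get(cls, 0.0)
        score += uncertainty * rarity
    return score


def select_regions(regions, class_freq, budget):
    """Pick up to `budget` region ids, cycling over scales (assumed strategy).

    regions: list of dicts with keys "id", "scale", and "instances".
    Within each scale, regions are taken in order of decreasing score.
    """
    by_scale = defaultdict(list)
    for r in regions:
        score = region_score(r["instances"], class_freq)
        by_scale[r["scale"]].append((score, r["id"]))
    for ranked in by_scale.values():
        ranked.sort(reverse=True)  # highest score first

    scales = sorted(by_scale)
    selected = []
    turn = 0
    while len(selected) < budget and any(by_scale[s] for s in scales):
        scale = scales[turn % len(scales)]
        if by_scale[scale]:
            selected.append(by_scale[scale].pop(0)[1])
        turn += 1
    return selected
```

Under these assumptions, cycling over scales keeps any single region size from dominating the labeled set, while the rarity term shifts the budget toward classes that are hard to detect.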