In this work, we propose a pragmatic method that reduces the annotation cost
for structured label spaces using active learning. Our approach leverages
partial annotation, which reduces labeling costs for structured outputs by
selecting only the most informative sub-structures for annotation. We also
utilize self-training to incorporate the current model's automatic predictions
as pseudo-labels for unannotated sub-structures. A key challenge in
effectively combining partial annotation with self-training to reduce
annotation cost is determining which sub-structures to select for labeling. To
address this challenge, we adopt an error estimator to adaptively decide the
partial selection ratio according to the current model's capability. In
evaluations spanning four structured prediction tasks, we show that our
combination of partial annotation and self-training using an adaptive selection
ratio reduces annotation cost over strong full-annotation baselines under a
fair comparison scheme that takes reading time into consideration.

Comment: Findings of EMNLP 202
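
As a rough illustration of how an adaptive selection ratio might split one structured output between human annotation and self-training, consider the minimal Python sketch below. It is not the estimator used in this work: the names `SubStructure`, `estimate_error_rate`, and `split_for_annotation` are hypothetical, and the error estimate is assumed to be simply the mean of one minus the model's confidence. The least confident sub-structures are sent for annotation, and the rest keep the model's predictions as pseudo-labels.

```python
import math
from dataclasses import dataclass


@dataclass
class SubStructure:
    index: int         # position of the sub-structure (e.g. a token)
    prediction: str    # the current model's predicted label
    confidence: float  # model probability assigned to that prediction


def estimate_error_rate(items):
    """Assumed stand-in estimator: mean of (1 - confidence)."""
    if not items:
        return 0.0
    return sum(1.0 - s.confidence for s in items) / len(items)


def split_for_annotation(items):
    """Pick the selection ratio from the estimated error rate, then split."""
    ratio = estimate_error_rate(items)
    k = math.ceil(ratio * len(items))
    # Least confident sub-structures first (sorted() is stable).
    ranked = sorted(items, key=lambda s: s.confidence)
    to_annotate = ranked[:k]
    # Everything not selected keeps the model's prediction as a pseudo-label.
    pseudo_labels = {s.index: s.prediction for s in ranked[k:]}
    return ratio, to_annotate, pseudo_labels


items = [
    SubStructure(0, "B-PER", 0.75),
    SubStructure(1, "O", 0.5),
    SubStructure(2, "B-LOC", 0.25),
    SubStructure(3, "O", 1.0),
]
ratio, to_annotate, pseudo_labels = split_for_annotation(items)
```

Under this assumption, a weaker model (lower average confidence) yields a larger ratio and therefore more human-labeled sub-structures, while a stronger model shifts more of the structure to pseudo-labels.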