Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation
Data augmentation has proven effective in many NLU tasks, especially for
those suffering from data scarcity. In this paper, we present a powerful and
easy-to-deploy text augmentation framework, Data Boost, which augments data
through reinforcement learning guided conditional generation. We evaluate Data
Boost on three diverse text classification tasks under five different
classifier architectures. The results show that Data Boost can boost the
performance of classifiers, especially in low-resource data scenarios. For
instance, Data Boost improves F1 for the three tasks by 8.7% on average when
given only 10% of the whole data for training. We also compare Data Boost with
six prior text augmentation methods. Through human evaluations (N=178), we
confirm that Data Boost augmentation has quality comparable to the original
data with respect to readability and class consistency.
Comment: In proceedings of the 2020 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2020). Onlin