Despite steady advances in deep learning techniques in recent years, models
still need large amounts of training data to avoid overfitting. Synthetic
datasets generated with generative adversarial networks (GANs) have recently
been used to overcome this problem. Nevertheless, GAN-based methods are
usually hard to train or fail to
generate high-quality data samples. In this paper, we propose a data
augmentation technique for environmental sound classification based on the
diffusion probabilistic model, with DPM-Solver++ for fast sampling. In
addition, to
ensure the quality of the generated spectrograms, we train a top-k selection
discriminator on the dataset. According to the experimental results, the
synthesized spectrograms have similar features to the original dataset and can
significantly increase the classification accuracy of different
state-of-the-art models compared with traditional data augmentation techniques.
The code is publicly available at
https://github.com/JNAIC/DPMs-for-Audio-Data-Augmentation
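
The top-k selection step can be illustrated with a minimal sketch. It assumes
that the discriminator assigns each generated spectrogram a scalar quality
score and that the k highest-scoring samples are kept. The function name
`select_top_k` and the example scores are hypothetical and are not taken from
the released code.

```python
import heapq


def select_top_k(samples, scores, k):
    """Return the k samples with the highest scores, best first.

    `samples` stands in for generated spectrograms and `scores` for the
    discriminator outputs; both are illustrative assumptions.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank sample indices by score and keep the k largest.
    best = heapq.nlargest(k, range(len(samples)), key=lambda i: scores[i])
    return [samples[i] for i in best]


# Hypothetical usage: four generated samples with made-up scores.
generated = ["spec_a", "spec_b", "spec_c", "spec_d"]
quality = [0.2, 0.9, 0.5, 0.7]
kept = select_top_k(generated, quality, k=2)
```

Under these assumptions, only the retained samples would be added to the
training set, so low-scoring generations do not reach the classifier.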