In this study, we introduce a CycleGAN-based augmentation technique that enhances the
resilience of sound event classification (SEC) systems to device variability,
along with a new dataset to
evaluate this method. As SEC systems become increasingly common, it is crucial
that they work well with audio from diverse recording devices. Our method
addresses limited device diversity in training data by enabling unpaired
training to transform input spectrograms as if they are recorded on a different
device. Our experiments show that our approach outperforms existing methods in
generalization by 5.2% to 11.5% in weighted F1 score. Additionally, it surpasses
the current methods in adaptability across diverse recording devices by
achieving a 6.5% to 12.8% improvement in weighted F1 score.
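
To make the augmentation idea concrete, below is a minimal sketch (not the authors' implementation) of how a trained CycleGAN generator could be applied to log-mel spectrograms as a data augmentation step before classification. The generator architecture, function names, probability parameter, and spectrogram shape are illustrative assumptions; a real setup would load the generator weights obtained from unpaired CycleGAN training between two recording devices.

```python
# Sketch: device-style translation of spectrograms as augmentation.
# All names and shapes here are assumptions for illustration only.
import torch
import torch.nn as nn

class SimpleGenerator(nn.Module):
    """Toy stand-in for a CycleGAN generator mapping device-A spectrograms
    to device-B style. A real generator would typically use residual blocks."""
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def augment_batch(spec_batch, generator, p=0.5):
    """With probability p, replace each spectrogram by its device-translated
    version, so the classifier sees both original and translated views."""
    mask = torch.rand(spec_batch.size(0)) < p
    with torch.no_grad():
        translated = generator(spec_batch)
    out = spec_batch.clone()
    out[mask] = translated[mask]
    return out

if __name__ == "__main__":
    g_ab = SimpleGenerator()            # in practice: load trained CycleGAN weights
    batch = torch.randn(8, 1, 64, 128)  # (batch, channel, mel bins, frames) -- assumed shape
    augmented = augment_batch(batch, g_ab, p=0.5)
    print(augmented.shape)              # torch.Size([8, 1, 64, 128])
```

The augmented batch would then be fed to the SEC classifier during training, exposing it to spectrograms that mimic a device under-represented in the training data.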