Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification
Respiratory sound contains crucial information for the early diagnosis of
fatal lung diseases. Since the COVID-19 pandemic, there has been a growing
interest in contact-free medical care based on electronic stethoscopes. To this
end, cutting-edge deep learning models have been developed to diagnose lung
diseases; however, this remains challenging due to the scarcity of medical data.
In this study, we demonstrate that a model pretrained on large-scale visual
and audio datasets can be generalized to the respiratory sound classification
task. In addition, we introduce a straightforward Patch-Mix augmentation, which
randomly mixes patches between different samples, with Audio Spectrogram
Transformer (AST). We further propose a novel and effective Patch-Mix
Contrastive Learning to distinguish the mixed representations in the latent
space. Our method achieves state-of-the-art performance on the ICBHI dataset,
outperforming the prior leading score by 4.08%.
Comment: INTERSPEECH 2023, Code URL:
https://github.com/raymin0223/patch-mix_contrastive_learnin
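
The abstract describes Patch-Mix only as randomly mixing patches between different samples. A minimal sketch of that idea follows; it is not the authors' implementation. The function name `patch_mix`, the `mix_ratio` parameter, and the choice of one randomly assigned donor sample per input are all assumptions made for illustration, and each patch is represented by a placeholder tuple rather than a real spectrogram patch embedding.

```python
import random


def patch_mix(samples, mix_ratio, rng):
    """Swap a random subset of patches in each sample with a donor sample's patches.

    samples:   list of samples, each a list of patches of equal length
    mix_ratio: fraction of patch positions to replace (hypothetical parameter)
    rng:       a random.Random instance, for reproducibility
    """
    batch_size = len(samples)
    num_patches = len(samples[0])
    num_mixed = round(mix_ratio * num_patches)

    # Assign each sample a donor by shuffling the batch indices.
    # A sample may occasionally be paired with itself; that is left as-is here.
    partners = list(range(batch_size))
    rng.shuffle(partners)

    mixed, masks = [], []
    for i, sample in enumerate(samples):
        # Pick which patch positions are taken from the donor.
        chosen = set(rng.sample(range(num_patches), num_mixed))
        donor = samples[partners[i]]
        mixed.append([donor[j] if j in chosen else sample[j]
                      for j in range(num_patches)])
        masks.append([j in chosen for j in range(num_patches)])
    return mixed, partners, masks


# Toy batch: 4 samples of 8 patches; each patch is tagged (sample_id, position).
rng = random.Random(0)
samples = [[(i, j) for j in range(8)] for i in range(4)]
mixed, partners, masks = patch_mix(samples, 0.25, rng)
```

In this sketch a patch keeps its position when it is swapped in, so only the source sample of a patch changes. The returned `partners` and `masks` record which sample and which positions were mixed, which is the kind of bookkeeping a loss over mixed representations would need; how the paper's contrastive objective uses such information is not stated in the abstract.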