Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance
Deep generative models have emerged as a promising approach in the medical
image domain to address data scarcity. However, their use for sequential data
like respiratory sounds is less explored. In this work, we propose a
straightforward approach to augment imbalanced respiratory sound data using an
audio diffusion model as a conditional neural vocoder. We also demonstrate a
simple yet effective adversarial fine-tuning method to align features between
the synthetic and real respiratory sound samples to improve respiratory sound
classification performance. Our experimental results on the ICBHI dataset
demonstrate that the proposed adversarial fine-tuning is effective, whereas using
the conventional augmentation method alone degrades performance.
Moreover, our method outperforms the baseline by 2.24% on the ICBHI Score and
improves the accuracy of the minority classes up to 26.58%. For the
supplementary material, we provide the code at
https://github.com/kaen2891/adversarial_fine-tuning_using_generated_respiratory_sound.
Comment: accepted in NeurIPS 2023 Workshop on Deep Generative Models for Health (DGM4H)
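The core idea of the adversarial fine-tuning step (aligning the features of real and generated samples so that a domain discriminator cannot tell them apart) can be illustrated in a toy form. The following is a minimal numpy sketch under assumed simplifications: a linear encoder, a logistic domain discriminator, and Gaussian vectors standing in for audio features. It is not the paper's actual architecture, loss, or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for encoder features of real and diffusion-generated samples.
# The synthetic features start with a deliberate mean shift of 1.5 per dimension.
real = rng.normal(0.0, 1.0, size=(200, 8))
synth = rng.normal(1.5, 1.0, size=(200, 8))

X = np.vstack([real, synth])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = real, 1 = synthetic

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W = np.eye(8)                   # linear "encoder"
w = rng.normal(size=8) * 0.01   # domain-discriminator weights
b = 0.0

def feature_gap(W):
    """Distance between the mean real and mean synthetic feature vectors."""
    return np.linalg.norm((real @ W).mean(0) - (synth @ W).mean(0))

gap_before = feature_gap(W)

lr_d, lr_e = 0.5, 0.05
for _ in range(300):
    f = X @ W
    p = sigmoid(f @ w + b)
    g = (p - y) / len(y)  # per-sample binary cross-entropy gradient w.r.t. logits
    # Discriminator step: descend the cross-entropy (tell real from synthetic).
    w = w - lr_d * (f.T @ g)
    b = b - lr_d * g.sum()
    # Encoder step: ascend the same loss (gradient reversal), nudging real and
    # synthetic features toward each other so the discriminator is fooled.
    W = W + lr_e * (X.T @ (g[:, None] * w[None, :]))

gap_after = feature_gap(W)
```

At the min-max equilibrium the discriminator is reduced to guessing, which is the condition under which real and synthetic features become interchangeable for a downstream classifier; the gap between the two feature clouds shrinks over training.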
A Military Audio Dataset for Situational Awareness and Surveillance
Audio classification related to military activities is a challenging task due to high levels of background noise and the lack of suitable, publicly available datasets. To bridge this gap, this paper constructs and introduces a new military audio dataset, named MAD, suitable for training and evaluating audio classification systems. The MAD dataset is extracted from various military videos and contains 8,075 sound samples across 7 classes, totalling approximately 12 hours of audio, and exhibits distinctive characteristics not present in the academic datasets typically used for machine learning research. We present a comprehensive description of the dataset, including its acoustic statistics and examples, and conduct a sound classification study of various deep learning algorithms on MAD. We also release the source code to make it easy to build such systems. The dataset will be a valuable resource for evaluating existing algorithms and for advancing research on acoustic-based hazardous-situation surveillance systems.