Components Analysis on Audio Signal Mixtures

Abstract

This paper presents a novel multi-label noise classification algorithm that applies a convolutional neural network over a sliding window. The existing noise classification method uses a convolutional neural network, which requires input audio of a fixed duration. Time-variant networks such as a time-delay neural network or a recurrent neural network accept input of any length, but are limited to classifying a single label within a short time span. To address these shortcomings, we propose a windowing method that applies multi-label classification in overlapping time windows. For an audio stream longer than the inputs the model was trained on, the model applies a sliding window with multi-label classification to detect the classes present in each time window. The model then identifies the final classes of the input by considering the confidence score of each output label in each time window. The classification accuracy was 94.17% for single-label audio, 85.21% for two-class audio, and 86.39% on average for audio of various durations.
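The windowing procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the window length, the overlap, or how confidence scores are combined, so the names `split_into_windows`, `classify_stream`, and `toy_classifier`, the zero-padding of the last window, the per-label maximum over windows, and the 0.5 threshold are all assumptions made for illustration. The toy classifier stands in for the trained convolutional neural network.

```python
from typing import Callable, Dict, List, Sequence


def split_into_windows(samples: Sequence[float], window: int, hop: int) -> List[List[float]]:
    """Cut a stream into fixed-length windows; windows overlap when hop < window.

    The final window is zero-padded to the fixed length (an assumption).
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    windows = []
    start = 0
    while True:
        chunk = list(samples[start:start + window])
        chunk.extend([0.0] * (window - len(chunk)))  # pad a short final window
        windows.append(chunk)
        if start + window >= len(samples):
            break
        start += hop
    return windows


def classify_stream(
    samples: Sequence[float],
    classify_window: Callable[[List[float]], Dict[str, float]],
    window: int,
    hop: int,
    threshold: float = 0.5,
) -> List[str]:
    """Run a multi-label classifier on every window and merge the results.

    Assumed aggregation rule: keep a label if its highest confidence
    across all windows reaches the threshold.
    """
    per_window = [classify_window(w) for w in split_into_windows(samples, window, hop)]
    best: Dict[str, float] = {}
    for scores in per_window:
        for label, score in scores.items():
            best[label] = max(best.get(label, 0.0), score)
    return sorted(label for label, score in best.items() if score >= threshold)


def toy_classifier(chunk: List[float]) -> Dict[str, float]:
    """Hypothetical stand-in for the CNN: sample value 1.0 marks class 'a', 2.0 marks 'b'."""
    n = len(chunk)
    return {
        "a": sum(1 for x in chunk if x == 1.0) / n,
        "b": sum(1 for x in chunk if x == 2.0) / n,
    }


if __name__ == "__main__":
    stream = [1.0] * 8 + [2.0] * 8  # class 'a' followed by class 'b'
    print(classify_stream(stream, toy_classifier, window=4, hop=2))
```

Taking the per-label maximum favours classes that appear only briefly in a long stream; averaging the scores across windows would instead favour classes that persist throughout. Which rule the paper uses is not stated in the abstract.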
