Power Pooling Operators and Confidence Learning for Semi-Supervised Sound Event Detection
In recent years, the use of synthetic strongly labeled data, weakly
labeled data, and unlabeled data has drawn much research attention in
semi-supervised sound event detection (SSED). Self-training models carry out
predictions without strong annotations and then take predictions with high
probabilities as pseudo-labels for retraining. Such models have shown their
effectiveness in SSED. However, probabilities are poorly calibrated confidence
estimates, and samples with low probabilities are ignored. Hence, we introduce
a method that learns confidence explicitly and retains all data, weighting
each sample by its confidence. Additionally, linear pooling has been
considered a state-of-the-art aggregation function for SSED with weak
labeling. In this paper, we propose a power pooling function whose coefficient
can be trained automatically to achieve nonlinearity. A confidence-based
semi-supervised sound event detection (C-SSED) framework is designed to combine
confidence and power pooling. The experimental results demonstrate that
confidence is proportional to the accuracy of the predictions. The power
pooling function outperforms linear pooling in both error rate and F1 score.
In addition, the C-SSED framework achieves a relative error rate reduction of
34% compared with the baseline model.
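
The abstract does not state the pooling formula, so the following is only a minimal sketch under an assumed form: frame-level probabilities are averaged with weights equal to the probabilities raised to an exponent n. With n = 1 this reduces to the linear pooling mentioned above, and with n = 0 it reduces to mean pooling. The function name `power_pool` and the fixed float `n` are hypothetical; in the paper the coefficient is trained, which this sketch does not attempt.

```python
def power_pool(frame_probs, n):
    """Aggregate frame-level probabilities into one clip-level probability.

    Assumed form (not taken from the abstract): sum(p * p**n) / sum(p**n).
    n = 0 gives mean pooling, n = 1 gives linear pooling, and larger n
    moves the result toward the maximum frame probability.
    """
    # Each frame is weighted by its own probability raised to the power n.
    weights = [p ** n for p in frame_probs]
    total = sum(weights)
    if total == 0.0:
        # All weights vanish (every frame probability is zero and n > 0).
        return 0.0
    return sum(p * w for p, w in zip(frame_probs, weights)) / total


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.9]  # illustrative frame probabilities, not real data
    for n in (0.0, 1.0, 5.0):
        print(n, power_pool(probs, n))
```

In a trainable setting, n would be a learnable parameter updated by backpropagation together with the network weights; here it is passed in by hand to show how the exponent shifts the clip-level estimate between the mean and the maximum.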