PhD Thesis
The primary objective of a Sound Event Detection (SED) system is to detect the presence
of an acoustic event (i.e., audio tagging) and to return the onset and offset of the identified acoustic event within an audio clip (i.e., temporal localization). Such a system
can be promising in wildlife and biodiversity monitoring, surveillance, and smart-home
applications.
However, developing a system that is adept at both subtasks is not trivial. It can
be hindered by the need for a large amount of strongly labeled data, where the event tags
and the corresponding onsets and offsets are known with certainty. This is a limiting factor
as strongly labeled data is challenging to collect and is prone to annotation errors due to
the ambiguity in the perception of onsets and offsets.
In this thesis, we propose to address the lack of strongly labeled data by using pseudo
strongly labeled data, where the event tags are known with certainty while the corresponding onsets and offsets are estimated. Although Nonnegative Matrix Factorization can be
used directly for SED, it does so with limited accuracy; we show that it can nevertheless be a useful tool
for pseudo labeling. We further show that pseudo strongly labeled data estimated using
our proposed methods can improve the accuracy of a SED system developed using deep
learning approaches.
Subsequent work then focuses on improving a SED system as a whole rather than a
single subtask. This leads to the proposal of a novel student-teacher training framework
that incorporates a noise-robust loss function, a new cyclic training scheme, an improved
depthwise separable convolution, a triple instance-level temporal pooling approach, and an
improved Transformer encoding layer. Together with synthetic strongly labeled data and a
large corpus of unlabeled data, we show that a SED system developed using our proposed
method is capable of producing state-of-the-art performance.