Due to the high variation in the application requirements of sound event
detection (SED) systems, it is not sufficient to evaluate systems only in a
single operating mode. Therefore, the community recently adopted the polyphonic
sound detection score (PSDS) as an evaluation metric, which is the normalized
area under the PSD receiver operating characteristic (PSD-ROC). It summarizes
the system performance over a range of operating modes resulting from varying
the decision threshold that is used to translate the system output scores into
a binary detection output. Hence, it provides a more complete picture of the
overall system behavior and is less biased by specific threshold tuning.
However, besides the decision threshold, the post-processing can also be
changed to enter another operating mode. In this paper we propose the
post-processing independent PSDS (piPSDS) as a generalization of the PSDS.
Here, the post-processing independent PSD-ROC includes operating points from
varying post-processings with varying decision thresholds. Thus, it summarizes
even more operating modes of an SED system and allows for system comparison
without the need to implement a post-processing and without a bias due to
different post-processings. While piPSDS can in principle combine different
types of post-processing, as a first step we present here median filter
independent PSDS (miPSDS) results for this year's DCASE Challenge Task4a
systems. Source code is publicly available in our sed_scores_eval package
(https://github.com/fgnt/sed_scores_eval).

Comment: submitted to DCASE Workshop 202
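
The idea of pooling operating points across post-processings can be illustrated with a minimal sketch. It is not the sed_scores_eval implementation and it is not the full PSDS definition: it assumes each post-processing (here, a hypothetical median filter length) has already produced a list of (false positive rate, true positive rate) operating points from a threshold sweep, treats the ROC as a staircase, and normalizes the area by an assumed maximum false positive rate. The function names `staircase_auc` and `pooled_auc` and all numbers are illustrative assumptions.

```python
def staircase_auc(points, max_fpr):
    """Normalized area under the staircase ROC spanned by (fpr, tpr) points.

    At each false positive rate x, the staircase takes the best true positive
    rate among all operating points with fpr <= x.
    """
    pts = sorted(p for p in points if p[0] <= max_fpr)
    area, prev_fpr, best_tpr = 0.0, 0.0, 0.0
    for fpr, tpr in pts:
        # Accumulate the area of the step that ends at this operating point.
        area += (fpr - prev_fpr) * best_tpr
        prev_fpr = fpr
        best_tpr = max(best_tpr, tpr)
    # Extend the last step up to the maximum false positive rate.
    area += (max_fpr - prev_fpr) * best_tpr
    return area / max_fpr


def pooled_auc(curves, max_fpr):
    """Area under the ROC built from the operating points of all curves.

    `curves` maps a post-processing setting (assumed here to be a median
    filter length) to its list of (fpr, tpr) operating points.
    """
    pooled = [p for curve in curves.values() for p in curve]
    return staircase_auc(pooled, max_fpr)


# Made-up operating points for two hypothetical median filter lengths.
curves = {
    3: [(10, 0.2), (50, 0.5)],
    11: [(20, 0.4), (80, 0.6)],
}
per_filter = {k: staircase_auc(v, max_fpr=100) for k, v in curves.items()}
combined = pooled_auc(curves, max_fpr=100)
```

In this toy setting the pooled score can never fall below the best single-filter score, since every operating point of each individual curve is also part of the pooled set.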