End-to-End Classification of Reverberant Rooms using DNNs
Reverberation is present in our workplaces, our homes and even in places
designed as auditoria, such as concert halls and theatres. This work
investigates how deep learning can use the effect of reverberation on speech to
classify a recording according to the room in which it was recorded.
Approaches previously taken in the literature for the task relied on handpicked
acoustic parameters as features used by classifiers. Estimating the values of
these parameters from reverberant speech involves estimation errors, inevitably
impacting the classification accuracy. This paper shows how deep neural
networks (DNNs) can perform the classification in an end-to-end fashion, that
is, by operating directly on
reverberant speech. Based on the above, a method for the training of
generalisable DNN classifiers and a DNN architecture for the task are proposed.
The relationship between feature maps derived by DNNs and acoustic parameters
that describe known properties of reverberation is also studied. The
experiments use acoustic impulse responses (AIRs) measured in 7 real rooms. The
classification accuracy of DNNs is compared between the case of having access
to the AIRs and the case of having access only to the reverberant speech
recorded in the same rooms. The experiments show that with access to the AIRs a
DNN achieves an accuracy of 99.1% and with access only to reverberant speech,
the proposed DNN achieves an accuracy of 86.9%. The experiments replicate the
testing procedure used in previous work, which relied on handpicked acoustic
parameters, allowing the direct evaluation of the benefit of using deep
learning.
Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language
Processing
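
The abstract contrasts access to the AIRs with access only to reverberant speech. The two are commonly related by modelling reverberant speech as the convolution of dry speech with a room's AIR. The sketch below illustrates that relation only; it is not the paper's pipeline. The function names, the exponentially decaying noise used as a toy AIR, and the parameter values (`fs`, `t60`, `length_s`) are all assumptions made for illustration, whereas the paper uses AIRs measured in real rooms.

```python
import numpy as np


def synthetic_air(fs=16000, t60=0.5, length_s=1.0, seed=0):
    """Toy AIR: white noise under an exponential decay envelope.

    Hypothetical stand-in for a measured AIR, for illustration only.
    """
    rng = np.random.default_rng(seed)
    n = int(fs * length_s)
    t = np.arange(n) / fs
    # Amplitude falls by 60 dB (a factor of 1e-3) after t60 seconds.
    envelope = 10.0 ** (-3.0 * t / t60)
    air = rng.standard_normal(n) * envelope
    air[0] = 1.0  # direct-path component
    return air


def reverberate(speech, air):
    """Model reverberant speech as dry speech convolved with an AIR."""
    return np.convolve(speech, air, mode="full")


# Example: one second of placeholder "speech" (random samples, assumed data).
fs = 16000
dry = np.random.default_rng(1).standard_normal(fs)
wet = reverberate(dry, synthetic_air(fs=fs, t60=0.5))
```

Under this model, a classifier with access to the AIR sees the room response directly, while a classifier given only `wet` must separate the room's effect from the speech content.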