4 research outputs found

    Early vs Late Fusion in Binaural Sound Source Localisation using CNN

    In Binaural Sound Source Localisation, two representations of the signals contain useful cues for localisation: the time/phase frequency spectrum and the magnitude frequency spectrum. This typically leads to two-branch CNN architectures being employed to achieve localisation. This paper compares the performance of models which employ early and late fusion of these two branches, finding only negligible differences and thus concluding that this is an unimportant consideration in the design of such systems.

    Joint Direction and Proximity Classification of Overlapping Sound Events from Binaural Audio

    Sound source proximity and distance estimation are of great interest in many practical applications, since they provide significant information for acoustic scene analysis. As both tasks share complementary qualities, ensuring efficient interaction between these two is crucial for a complete picture of an aural environment. In this paper, we aim to investigate several ways of performing joint proximity and direction estimation from binaural recordings, both defined as coarse classification problems based on Deep Neural Networks (DNNs). Considering the limitations of binaural audio, we propose two methods of splitting the sphere into angular areas in order to obtain a set of directional classes. For each method we study different model types to acquire information about the direction-of-arrival (DoA). Finally, we propose various ways of combining the proximity and direction estimation problems into a joint task providing temporal information about the onsets and offsets of the appearing sources. Experiments are performed for a synthetic reverberant binaural dataset consisting of up to two overlapping sound events.

    The Effect of Noise Reduction Upon Voiceprint Integrity

    Audio evidence is often full of noise, and it may be advantageous to apply noise reduction in order to discern dialogue or other sounds; however, this risks damaging any audio cues used in voiceprint analysis. This paper seeks to assess the impact of noise-reduction systems by applying noise reduction to a small sample and analysing a contained voiceprint. It was found that in adverse conditions and/or by application of extreme parameters, enough damage to the cues can be done to make voiceprint analysis ill-advised, and this technique should be reserved for recordings with favourable signal-to-noise ratios.

    Multitask Learning of Time-Frequency CNN for Sound Source Localization
