47 research outputs found

    Real vs. acted emotional speech

    Get PDF

    Incremental perception of acted and real emotional speech

    Get PDF

    Incremental perception of acted and real emotional speech

    Get PDF

    Audiovisual Emotional Speech of Game Playing Children:Effects of Age and Culture

    Get PDF

    End-to-end Recurrent Denoising Autoencoder Embeddings for Speaker Identification

    Get PDF
    Speech 'in-the-wild' is a handicap for speaker recognition systems due to the variability induced by real-life conditions, such as environmental noise and emotions in the speaker. Taking advantage of representation learning, on this paper we aim to design a recurrent denoising autoencoder that extracts robust speaker embeddings from noisy spectrograms to perform speaker identification. The end-to-end proposed architecture uses a feedback loop to encode information regarding the speaker into low-dimensional representations extracted by a spectrogram denoising autoencoder. We employ data augmentation techniques by additively corrupting clean speech with real life environmental noise and make use of a database with real stressed speech. We prove that the joint optimization of both the denoiser and the speaker identification module outperforms independent optimization of both modules under stress and noise distortions as well as hand-crafted features.Comment: 8 pages + 2 of references + 5 of images. Submitted on Monday 20th of July to Elsevier Signal Processing Short Communication
    corecore