22 research outputs found

Robust speech recognition with spectrogram factorisation

Author: Hurmalainen Antti
Publication venue: Tampere University of Technology
Publication date: 01/01/2014

Communication by speech is intrinsic to humans. Since the breakthrough of mobile devices and wireless communication, digital transmission of speech has become ubiquitous. Similarly, the distribution and storage of audio and video data have increased rapidly. However, despite being technically capable of recording and processing audio signals, only a fraction of digital systems and services are actually able to work with spoken input, that is, to operate on the lexical content of speech. One persistent obstacle to the practical deployment of automatic speech recognition systems is inadequate robustness against noise and other interference, which regularly corrupt signals recorded in real-world environments. Speech and diverse noises are both complex signals, which are not trivially separable. Despite decades of research and a multitude of different approaches, the problem has not been solved to a sufficient extent. In particular, the mathematically ill-posed problem of separating multiple sources from a single-channel input requires advanced models and algorithms to be solvable. One promising path is using a composite model of long-context atoms to represent a mixture of non-stationary sources based on their spectro-temporal behaviour. Algorithms derived from the family of non-negative matrix factorisations have been applied to such problems to separate and recognise individual sources like speech. This thesis describes a set of tools developed for non-negative modelling of audio spectrograms, especially involving speech and real-world noise sources. An overview of the complete framework is provided, starting from model and feature definitions, advancing to factorisation algorithms, and finally describing different routes for separation, enhancement, and recognition tasks. Current issues and their potential solutions are discussed both theoretically and from a practical point of view.
The included publications describe factorisation-based recognition systems, which have been evaluated on publicly available speech corpora in order to determine the efficiency of various separation and recognition algorithms. Several variants and system combinations that have been proposed in the literature are also discussed. The work covers a broad span of factorisation-based system components, which together aim at providing a practically viable solution to robust processing and recognition of speech in everyday situations.

Trepo - Institutional Repository of Tampere University

Detection, Separation and Recognition of Speech From Continuous Signals Using Spectral Factorisation

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas

Publication in the conference proceedings of EUSIPCO, Bucharest, Romania, 201

ZENODO

Modelling non-stationary noise with spectral factorisation in automatic speech recognition

Author: Antti Hurmalainen
Jort F. Gemmeke
Tuomas Virtanen
Publication venue: 'Elsevier BV'

Crossref

Robust speech recognition with spectrogram factorisation

Author: Hurmalainen Antti
Publication venue: Tampere University of Technology
Publication date: 01/01/2014

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

State-based labelling for a sparse representation of speech and its application to robust speech recognition

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas
Publication venue: ISCA-INT SPEECH COMMUNICATION ASSOC
Publication date: 01/01/2010

status: published

Lirias

HMM-regularization for NMF-based noise robust ASR

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas
Publication date: 01/01/2013

Gemmeke J.F., Hurmalainen A., Virtanen T., "HMM-regularization for NMF-based noise robust ASR", Proceedings 2nd international workshop on machine listening in multisource environments - CHiME 2013 (in conjunction with ICASSP 2013), pp. 47-52, June 1, 2013, Vancouver, Canada.

status: published

Lirias

Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2011

status: published

Lirias

Modelling non-stationary noise with spectral factorisation in automatic speech recognition

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas
Publication venue: 'Elsevier BV'
Publication date: 01/05/2013

Hurmalainen A., Gemmeke J.F., Virtanen T., "Modelling non-stationary noise with spectral factorisation in automatic speech recognition", Computer speech and language, vol. 27, no. 3, pp. 763-779, May 2013.

status: published

Lirias

Compact long context spectral factorisation models for noise robust recognition of medium vocabulary speech

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas
Publication date: 01/01/2013

Hurmalainen A., Gemmeke J.F., Virtanen T., "Compact long context spectral factorisation models for noise robust recognition of medium vocabulary speech", Proceedings 2nd international workshop on machine listening in multisource environments - CHiME 2013 (in conjunction with ICASSP 2013), pp. 13-18, June 1, 2013, Vancouver, Canada.

status: published

Lirias

Non-negative matrix deconvolution in noise robust speech recognition

Author: Gemmeke Jort
Hurmalainen Antti
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

status: published

Lirias