3 research outputs found

Time-Frequency Masking: Linking Blind Source Separation and Robust Speech Recognition

Authors: Marco Kühne, Roberto Togneri, Sven Nordholm
Publication venue: 'IntechOpen'
Publication date: 01/01/2008

IntechOpen

Crossref

espace@Curtin

Towards a Hardware Realization of Time-Frequency Source Separation of Speech

Author: Naomi Harte
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

Peer-reviewed. This paper presents preliminary work on a hardware implementation of a source separation algorithm employing time-frequency masking methods. DUET (Degenerate Unmixing Estimation Technique) has previously been shown to achieve excellent source separation in real time in software. The current work is a move towards a hardware realization of DUET that will allow integration of the algorithm into consumer devices. Initial stages involve investigating the performance of DUET when implemented in fixed-point arithmetic and considering algorithmic changes to make DUET more amenable to implementation on a DSP processor. Performance is compared for floating-point and fixed-point implementations. A weighted K-means clustering algorithm is presented as an alternative to gradient descent methods for peak tracking and is demonstrated to achieve excellent performance without adversely affecting computational load. Preliminary performance figures are given for an implementation on a TMS320VC5510 DSK.

Irish Universities

Speech Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and applications that operate in real-world environments, such as mobile communication services and smart homes.

Directory of Open Access Books (DOAB)