Search CORE

5 research outputs found

On Usable Speech Detection by Linear Multi-Scale Decomposition for Speaker Identification

Author: Ben Braiek Ezzedine
Ben Slimane Amel
Ghezaiel Wajdi
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/12/2016

Usable speech is a novel concept for processing co-channel speech data. It aims to extract minimally corrupted speech that is considered useful for various speech processing systems. In this paper, we are interested in co-channel speaker identification (SID). We employ a newly proposed usable speech extraction method based on the pitch information obtained from linear multi-scale decomposition by discrete wavelet transform. The idea is to retain the speech segments in which only one pitch is detected and remove the others. The detected usable speech is then used as input to a speaker identification system. The system is evaluated on co-channel speech, and results show a significant improvement across various Target-to-Interferer Ratios (TIR) for the speaker identification system.
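The retention rule described in this abstract (keep segments with exactly one detected pitch) can be illustrated with a minimal sketch. It assumes a pitch detector has already produced a list of pitch candidates per frame; the abstract's wavelet-based detector is not reproduced here, and the function name `usable_segments` and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def usable_segments(pitches_per_frame: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, with exactly one pitch per frame.

    pitches_per_frame is an assumed input: one list of detected pitch values (Hz)
    per analysis frame, produced by some upstream pitch detector.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, pitches in enumerate(pitches_per_frame):
        usable = len(pitches) == 1  # one pitch: treated as a single dominant speaker
        if usable and start is None:
            start = i  # open a new usable run
        elif not usable and start is not None:
            segments.append((start, i))  # close the current run
            start = None
    if start is not None:
        segments.append((start, len(pitches_per_frame)))
    return segments


# Hypothetical detector output: frames 2 and 3 have two pitches and none, respectively.
frames = [[120.0], [118.5], [121.0, 210.0], [], [205.0], [207.5]]
print(usable_segments(frames))  # [(0, 2), (4, 6)]
```

Only the frames inside the returned ranges would be passed on to the speaker identification stage.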

IAES journal

Crossref

Institute of Advanced Engineering and Science

Overlapped-Speech Detection with Applications to Driver Assessment for In-Vehicle Active Safety Systems

Author: Amardeep Sathyanarayana
John H L Hansen
Navid Shokouhi
Seyed Omid Sadjadi
Publication date: 03/04/2020

In this study we propose a system for overlapped-speech detection. Spectral harmonicity and envelope features are extracted to represent overlapped and single-speaker speech using Gaussian mixture models (GMM). The system is shown to effectively discriminate the single and overlapped speech classes. We further increase the discrimination by proposing a phoneme selection scheme to generate more reliable artificial overlapped data for model training. Evaluations on artificially generated co-channel data show that the novelty in feature selection and phoneme omission results in a relative improvement of 10% in the detection accuracy compared to baseline. As an example application, we evaluate the effectiveness of overlapped-speech detection for vehicular environments and its potential in assessing driver alertness. Results indicate a good correlation between driver performance and the amount and location of overlapped-speech segments.
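As a rough illustration of the two-class GMM decision described in this abstract, the sketch below scores a single scalar feature under two one-dimensional Gaussian mixtures and picks the class with the higher log-likelihood. The scalar "harmonicity" feature, the mixture parameters, and the function names are all assumptions made for the example; the abstract does not give the actual feature definitions or model sizes.

```python
import math
from typing import Sequence, Tuple

# A 1-D mixture is represented as (weights, means, variances); an assumed layout.
Gmm = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def gmm_loglik(x: float, gmm: Gmm) -> float:
    """Log-likelihood of x under a 1-D Gaussian mixture, computed with log-sum-exp."""
    weights, means, variances = gmm
    terms = [
        math.log(w) - 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify(x: float, single: Gmm, overlapped: Gmm) -> str:
    """Pick the class whose mixture assigns the higher log-likelihood to x."""
    return "single" if gmm_loglik(x, single) >= gmm_loglik(x, overlapped) else "overlapped"


# Hypothetical models: single-speaker frames are assumed to be more harmonic.
single_model: Gmm = ([0.5, 0.5], [0.8, 0.9], [0.01, 0.01])
overlap_model: Gmm = ([1.0], [0.4], [0.02])

print(classify(0.85, single_model, overlap_model))  # single
print(classify(0.35, single_model, overlap_model))  # overlapped
```

In practice the models would be trained on multi-dimensional feature vectors, and frame-level scores would typically be accumulated over a segment before a decision is made.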

CiteSeerX

Hierarchical methods for large population speaker identification using telephone speech

Author: Lerato Lerato
Publication venue: Department of Electrical Engineering
Publication date: 01/01/2003

This study focuses on speaker identification. Several problems such as acoustic noise, channel noise, speaker variability, a large population of known speakers within the system and many others limit good SiD performance. The SiD system extracts speaker-specific features from the digitised speech signal for accurate identification. These feature sets are clustered to form the speaker template known as a speaker model. As the number of speakers enrolling into the system gets larger, more models accumulate and interspeaker confusion results. This study proposes hierarchical methods which aim to split the large population of enrolled speakers into smaller groups of model databases to minimise interspeaker confusion.
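The idea of splitting enrolled speakers into smaller groups can be sketched as a two-stage search: first pick the closest group, then pick the closest speaker model inside that group only. The abstract does not say how groups are formed or how models are scored, so the centroid representation, the Euclidean distance, and all names and values below are assumptions for illustration.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def identify(x: Vector, groups: Dict[str, dict]) -> str:
    """Two-stage lookup: nearest group centroid first, then nearest speaker in that group.

    groups maps a group name to {"centroid": vector, "speakers": {name: vector}};
    this layout is an assumed stand-in for the grouped model databases.
    """
    # Stage 1: compare against one centroid per group instead of every speaker model.
    best_group = min(groups.values(), key=lambda g: math.dist(x, g["centroid"]))
    # Stage 2: search only the speaker models inside the selected group.
    speakers = best_group["speakers"]
    return min(speakers, key=lambda name: math.dist(x, speakers[name]))


# Hypothetical two-group population with 2-D speaker models.
groups = {
    "A": {"centroid": (0.0, 0.0), "speakers": {"alice": (0.1, 0.0), "bob": (-0.2, 0.1)}},
    "B": {"centroid": (5.0, 5.0), "speakers": {"carol": (5.1, 4.9), "dave": (4.8, 5.3)}},
}

print(identify((4.9, 5.2), groups))  # dave
```

With this layout the test utterance is compared against one centroid per group plus the models in a single group, rather than against every enrolled model.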

Cape Town University OpenUCT