3 research outputs found

Significance of Vowel Onset Point Information for Speaker Verification

Authors: Pradhan Gayadhar; Prasanna S. R. Mahadeva
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 27/08/2020

This work demonstrates the significance of vowel onset point (VOP) information for speaker verification. A VOP is defined as the instant at which the onset of a vowel takes place. Vowel-like regions can be identified using VOPs. By the nature of their production, vowel-like regions have impulse-like excitation, so the impulse response of the vocal tract system is better manifested in them, and they are regions of relatively high signal-to-noise ratio (SNR). Speaker information extracted from such regions may therefore be more discriminative. As a result, better speaker modeling and more reliable testing may be possible using features extracted from vowel-like regions. This work demonstrates that, for clean and matched conditions, relatively few frames from vowel-like regions are sufficient for speaker modeling and testing. Alternatively, for degraded and mismatched conditions, vowel-like regions provide better performance ...

Interscience Research Network

Design of algorithm for segmentation of speech utterances in patients with Huntington's disease

Author: Pospíšil Jakub
Publication venue: Czech Technical University in Prague. Computing and Information Centre.
Publication date: 28/05/2015

This thesis deals with the diagnosis of hypokinetic dysarthria as an early symptom of Huntington's disease (HD). The quality of the speech apparatus is evaluated using speech diadochokinetic (DDK) tasks, based on rapid repetition of the syllables /pa/-/ta/-/ka/. The main aim is to implement an algorithm for segmenting pathological utterances of patients suffering from HD. The method assumes that speech comprising plosives, vowels, and parts without speech activity can be treated as a multimodal mixture of normal distributions of parameters such as zero-crossing rate, spectral entropy, wavelet transform, or variance of the autocorrelation function, i.e., a Gaussian Mixture Model (GMM). The parameters are classified using the GMM algorithm, which sequentially estimates the parameters of the individual classes. The outputs of the algorithm are the boundaries of the individual plosives, vowels, and parts without speech activity. The segments are then evaluated with suitable speech features, which distinguish recordings of healthy speakers from those of patients with HD.
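
The segmentation idea in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a single feature (zero-crossing rate) instead of the several parameters the abstract lists, and it fits a one-dimensional Gaussian mixture with plain expectation-maximization. All function names (`zero_crossing_rate`, `fit_gmm_1d`, `classify`, `boundaries`) are hypothetical and chosen only for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def gauss(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, k, iters=50, var_floor=1e-6):
    """Fit a k-component 1-D GMM by EM; returns (weights, means, variances).

    Assumption: means are initialised at evenly spaced quantiles of the data
    and every variance starts at the overall data variance.
    """
    n = len(values)
    ordered = sorted(values)
    means = [ordered[min(int((j + 0.5) / k * n), n - 1)] for j in range(k)]
    centre = sum(values) / n
    overall = max(sum((x - centre) ** 2 for x in values) / n, var_floor)
    variances = [overall] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances


def classify(values, model):
    """Label each value with the index of its most likely component."""
    weights, means, variances = model
    labels = []
    for x in values:
        p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
        labels.append(p.index(max(p)))
    return labels


def boundaries(labels):
    """Frame indices at which the class label changes."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
```

In use, one would compute `zero_crossing_rate` on each short frame of a recording, pass the resulting list to `fit_gmm_1d` with one component per class, and read segment boundaries from `boundaries(classify(...))`. A fuller version along the lines the abstract describes would use multi-dimensional feature vectors rather than a single value per frame.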

Digital Library of the Czech Technical University in Prague