390 research outputs found
Automatsko raspoznavanje hrvatskoga govora velikoga vokabulara (Large Vocabulary Automatic Recognition of Croatian Speech)
This paper presents the procedures used to develop a Croatian large vocabulary automatic speech recognition (LVASR) system. The proposed acoustic model is based on context-dependent triphone hidden Markov models and Croatian phonetic rules. Different acoustic and language models, developed using a large collection of Croatian speech, are discussed and compared. The paper identifies the feature vectors and acoustic modeling procedures that yield the lowest word error rates for Croatian speech. In addition, Croatian language modeling procedures are evaluated and adopted for speaker-independent spontaneous speech recognition. The experiments and results show that context-dependent acoustic modeling based on Croatian phonetic rules, combined with a parameter tying procedure, can be used for efficient Croatian large vocabulary speech recognition with word error rates below 5%.
An HMM-Based Formalism for Automatic Subword Unit Derivation and Pronunciation Generation
We propose a novel hidden Markov model (HMM) formalism for automatic derivation of subword units and pronunciation generation using only transcribed speech data. In this approach, the subword units are derived from the clustered context-dependent units in a grapheme-based system using a maximum-likelihood criterion. The subword-unit-based pronunciations are then learned in the framework of the Kullback-Leibler divergence based HMM. Automatic speech recognition (ASR) experiments on the WSJ0 English corpus show that the approach leads to a 12.7% relative reduction in word error rate compared to a grapheme-based system. Our approach can be beneficial in reducing the need for expert knowledge in the development of ASR as well as text-to-speech systems. Index Terms: automatic subword unit derivation, pronunciation generation, hidden Markov model, Kullback-Leibler divergence based hidden Markov model
Wavelet-based techniques for speech recognition
In this thesis, new wavelet-based techniques have been developed for the
extraction of features from speech signals for the purpose of automatic speech
recognition (ASR). One of the advantages of the wavelet transform over the short-time
Fourier transform (STFT) is its capability to process non-stationary signals.
Since speech signals are not strictly stationary, the wavelet transform is a better
choice for time-frequency transformation of these signals. In addition, it has
compactly supported basis functions, thereby reducing the amount of
computation compared with the STFT, where an overlapping window is needed. [Continues.]