390 research outputs found

    Automatsko raspoznavanje hrvatskoga govora velikoga vokabulara

    Get PDF
    This paper presents procedures used for development of a Croatian large vocabulary automatic speech recognition system (LVASR). The proposed acoustic model is based on context-dependent triphone hidden Markov models and Croatian phonetic rules. Different acoustic and language models, developed using a large collection of Croatian speech, are discussed and compared. The paper proposes the best feature vectors and acoustic modeling procedures using which lowest word error rates for Croatian speech are achieved. In addition, Croatian language modeling procedures are evaluated and adopted for speaker independent spontaneous speech recognition. Presented experiments and results show that the proposed approach for automatic speech recognition using context-dependent acoustic modeling based on Croatian phonetic rules and a parameter tying procedure can be used for efļ¬cient Croatian large vocabulary speech recognition with word error rates below 5%.Članak prikazuje postupke akustičkog i jezičnog modeliranja sustava za automatsko raspoznavanje hrvatskoga govora velikoga vokabulara. Predloženi akustički modeli su zasnovani na kontekstno-ovisnim skrivenim Markovljevim modelima trifona i hrvatskim fonetskim pravilima. Na hrvatskome govoru prikupljenom u korpusu su ocjenjeni i uspoređeni različiti akustički i jezični modeli. U članku su uspoređ eni i predloženi postupci za izračun vektora značajki za akustičko modeliranje kao i sam pristup akustičkome modeliranju hrvatskoga govora s kojim je postignuta najmanja mjera pogreÅ”no raspoznatih riječi. Predstavljeni su rezultati raspoznavanja spontanog hrvatskog govora neovisni o govorniku. Postignuti rezultati eksperimenata s mjerom pogreÅ”ke ispod 5% ukazuju na primjerenost predloženih postupaka za automatsko raspoznavanje hrvatskoga govora velikoga vokabulara pomoću vezanih kontekstnoovisnih akustičkih modela na osnovu hrvatskih fonetskih pravila

    An HMM-Based Formalism for Automatic Subword Unit Derivation and Pronunciation Generation

    Get PDF
    We propose a novel hidden Markov model (HMM) formalism for automatic derivation of subword units and pronunciation generation using only transcribed speech data. In this approach, the subword units are derived from the clustered context-dependent units in a grapheme based system using maximum-likelihood criterion. The subword unit based pronunciations are then learned in the framework of Kullback-Leibler divergence based HMM. The automatic speech recognition (ASR) experiments on WSJ0 English corpus show that the approach leads to 12.7 % relative reduction in word error rate compared to grapheme-based system. Our approach can be bene-ficial in reducing the need for expert knowledge in development of ASR as well as text-to-speech systems. Index Terms ā€” automatic subword unit derivation, pronuncia-tion generation, hidden Markov model, Kullback-Leibler divergence based hidden Markov model 1

    Wavelet-based techniques for speech recognition

    Get PDF
    In this thesis, new wavelet-based techniques have been developed for the extraction of features from speech signals for the purpose of automatic speech recognition (ASR). One of the advantages of the wavelet transform over the short time Fourier transform (STFT) is its capability to process non-stationary signals. Since speech signals are not strictly stationary the wavelet transform is a better choice for time-frequency transformation of these signals. In addition it has compactly supported basis functions, thereby reducing the amount of computation as opposed to STFT where an overlapping window is needed. [Continues.
    • ā€¦
    corecore