896 research outputs found

HMM-based speech synthesiser using the LF-model of the glottal source

Author: Cabral J.
Renals Steve
Richmond K.
Yamagishi J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2011

A major factor which causes a deterioration in speech quality in HMM-based speech synthesis is the use of a simple delta pulse signal to generate the excitation of voiced speech. This paper sets out a new approach to using an acoustic glottal source model in HMM-based synthesisers instead of the traditional pulse signal. The goal is to improve speech quality and to better model and transform voice characteristics. We have found that the new method decreases buzziness and also improves prosodic modelling. A perceptual evaluation has supported this finding by showing a 55.6% preference for the new system over the baseline. This improvement, while not as significant as we had initially expected, encourages us to develop the proposed speech synthesiser further.
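
As a rough illustration of the baseline excitation the abstract refers to, a delta pulse train places one unit impulse at the start of each pitch period. The sketch below is not the authors' code: the function name `pulse_train` and the 100 Hz fundamental frequency, 16 kHz sample rate, and 800-sample length are assumptions chosen only for illustration. An LF-model excitation would replace each impulse with a shaped glottal pulse, which is not shown here.

```python
def pulse_train(f0_hz, sample_rate_hz, n_samples):
    """Return a unit-impulse train with one pulse per pitch period.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Pitch period in samples, rounded to the nearest integer.
    period = round(sample_rate_hz / f0_hz)
    # Emit 1.0 at the start of every period and 0.0 elsewhere.
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


# Assumed values: 100 Hz voiced excitation at 16 kHz, 50 ms long.
excitation = pulse_train(f0_hz=100.0, sample_rate_hz=16000, n_samples=800)
```

With these assumed values the pitch period is 160 samples, so the impulses fall at samples 0, 160, 320, 480, and 640.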

CiteSeerX

Crossref

Edinburgh Research Explorer

Kalman tracking of linear predictor and harmonic noise models for noisy speech enhancement

Author: Ben Milner
Esfandiar Zavarehei
Ioannis Andrianakis
Jonathan Darch
Paul White
Qin Yan
Saeed Vaseghi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of successive speech frames and a mitigation of the processing artefact known as ‘musical noise’ or ‘musical tones’. The formant-tracking linear prediction (FTLP) model estimation consists of three stages: (a) speech pre-cleaning based on a spectral amplitude estimation, (b) formant-tracking across successive speech frames using the Viterbi method, and (c) Kalman filtering of the formant trajectories across successive speech frames. The HNM parameters for the excitation signal comprise the voiced/unvoiced decision, the fundamental frequency, the harmonics’ amplitudes and the variance of the noise component of the excitation. A frequency-domain pitch extraction method is proposed that searches for the peak signal-to-noise ratios (SNRs) at the harmonics. For each speech frame several pitch candidates are calculated. An estimate of the pitch trajectory across successive frames is obtained using a Viterbi decoder. The trajectories of the noisy excitation harmonics across successive speech frames are modeled and denoised using Kalman filters. The proposed method is used to deconstruct noisy speech, de-noise its model parameters and then reconstitute speech from its cleaned parts. Experimental evaluations show the performance gains of the formant tracking, pitch extraction and noise reduction stages.
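
Stage (c) of the abstract, Kalman filtering of a formant trajectory across frames, can be sketched with a scalar filter. This is a minimal sketch under stated assumptions, not the paper's formulation: the abstract does not give the state-space model, so a random-walk state is assumed here, and the function name `kalman_filter_track`, the variances, and the sample first-formant values are all hypothetical.

```python
def kalman_filter_track(observations, process_var, obs_var):
    """Filter a noisy per-frame track with a scalar Kalman filter.

    Assumes a random-walk state model (an assumption, not from the paper):
    the true value drifts with variance process_var between frames and is
    observed with noise variance obs_var.
    """
    estimate = observations[0]  # initialise from the first frame
    error_var = obs_var         # initial uncertainty equals observation noise
    filtered = [estimate]
    for z in observations[1:]:
        # Predict: the mean is unchanged, the uncertainty grows.
        error_var += process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = error_var / (error_var + obs_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


# Hypothetical noisy first-formant estimates in Hz, one per frame.
noisy_f1 = [500.0, 540.0, 470.0, 520.0, 490.0, 510.0]
filtered_f1 = kalman_filter_track(noisy_f1, process_var=25.0, obs_var=400.0)
```

Because the assumed observation variance is much larger than the process variance, the filter trusts its prediction more than each new frame, so the filtered track varies less from frame to frame than the noisy input.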

Crossref

Southampton (e-Prints Soton)

University of East Anglia digital repository

Glottal Spectral Separation for Speech Synthesis

Author: Cabral João P
Renals Steve
Richmond Korin
Yamagishi Junichi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2014

Edinburgh Research Explorer

Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noise

Author: King S.
Stylianou Y.
Valentini-Botinhao C.
Yamagishi J.
Publication date: 01/08/2013

Edinburgh Research Explorer

HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering

Author: Alku P.
Nurminen J.
Pulakka H.
Raitio T.
Suni A.
Vainio M.
Yamagishi J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Edinburgh Research Explorer