6,863 research outputs found

Pitch modification techniques for sampled voice

Author: Brooks Michael
Publication date: 27/06/2018

A Phase Vocoder based on Nonstationary Gabor Frames

Authors: Dörfler Monika; Ottosen Emil Solsbæk
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations imply good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients we keep the stretch factor equal to one and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF representation. We show that with just three times as many TF coefficients as signal samples, artifacts such as phasiness and transient smearing can be greatly reduced compared to the classical PV. The proposed algorithm is tested on both synthetic and real world signals and compared with state of the art algorithms in a reproducible manner.

Comment: 10 pages, 6 figures

arXiv.org e-Print Archive

VBN

Wavenet based low rate speech coding

Authors: Kleijn W. Bastiaan; Lim Felicia S. C.; Luebs Alejandro; Skoglund Jan; Stimberg Florian; Walters Thomas C.; Wang Quan
Publication date: 01/12/2017

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty. Our experiments confirm the high performance of the WaveNet based coder and show that the speech produced by the system is able to additionally perform implicit bandwidth extension and does not significantly impair recognition of the original speaker for the human listener, even when that speaker has not been used during the training of the generative model.

Comment: 5 pages, 2 figures

arXiv.org e-Print Archive

Crossref

Time-scale modification for speech coding

Author: Burazerovic D.
Publication date: 31/08/2000

Pure OAI Repository

Glottal Spectral Separation for Speech Synthesis

Authors: Cabral João P; Renals Steve; Richmond Korin; Yamagishi Junichi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2014

Edinburgh Research Explorer