Non-Parallel Voice Conversion System Using An Auto-Regressive Model

Di Martino, Joseph; Ezzine, Kadria; Frikha, Mondher

Authors: Joseph Di Martino, Kadria Ezzine, Mondher Frikha
Publication date: 22 March 2022
Publisher: 'Institute of Electrical and Electronics Engineers (IEEE)'

Abstract

Many existing voice conversion (VC) systems are attractive owing to their high performance in terms of voice quality and speaker similarity. Nevertheless, without parallel training data, some generated waveform trajectories are not yet smooth, which leads to degraded sound quality and mispronunciations in the converted speech. To address these shortcomings, this paper proposes a non-parallel VC system based on an auto-regressive model, Phonetic PosteriorGrams (PPGs), and an LPCNet vocoder to generate high-quality converted speech. The auto-regressive structure lets the system generate each output step from the acoustic features of the previous step. Because PPGs are speaker-independent, they allow speech from any unseen source speaker to be converted to a specific target speaker. We evaluate the system on any-to-one conversion between native English speakers. Objective and subjective measures show that our method outperforms the best non-parallel VC method of the Voice Conversion Challenge 2018 in terms of naturalness and speaker similarity.
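The auto-regressive idea in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' model: the abstract does not specify the network, so a single tanh layer stands in for it, and the function name `autoregressive_convert`, the weights `w_ppg`, `w_prev`, `bias`, and all numeric values are hypothetical. The LPCNet vocoder stage is not modelled.

```python
import math


def autoregressive_convert(ppgs, w_ppg, w_prev, bias):
    """Toy auto-regressive mapping from PPG frames to acoustic feature frames.

    ppgs:   list of T frames, each a list of P posterior values
    w_ppg:  P x D weights applied to the current PPG frame (hypothetical)
    w_prev: D x D weights applied to the previous output frame (hypothetical)
    bias:   list of D bias values (hypothetical)
    """
    feat_dim = len(bias)
    prev = [0.0] * feat_dim  # all-zero frame used before the first step
    outputs = []
    for ppg in ppgs:
        current = []
        for d in range(feat_dim):
            total = bias[d]
            # Contribution of the current (speaker-independent) PPG frame.
            total += sum(p * w_ppg[i][d] for i, p in enumerate(ppg))
            # Contribution of the previously generated acoustic frame.
            total += sum(y * w_prev[j][d] for j, y in enumerate(prev))
            current.append(math.tanh(total))
        outputs.append(current)
        prev = current  # feed this step's output into the next step
    return outputs


# Illustrative values only: 3 frames, 3 phonetic classes, 2 acoustic features.
ppgs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
w_ppg = [[0.5, -0.3], [0.1, 0.4], [-0.2, 0.6]]
w_prev = [[0.3, 0.0], [0.0, 0.3]]
bias = [0.0, 0.0]

features = autoregressive_convert(ppgs, w_ppg, w_prev, bias)
print(len(features), len(features[0]))  # 3 2
```

Setting `w_prev` to all zeros removes the dependence on the previous frame, so every frame after the first changes; that dependence is what makes the loop auto-regressive.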

Available Versions

INRIA a CCSD electronic archive server: oai:HAL:hal-03780125v1 (last updated 25/09/2022)
HAL Descartes: oai:HAL:hal-03780125v1 (last updated 12/10/2022)
Hal-Diderot: oai:HAL:hal-03780125v1 (last updated 23/09/2022)