
3 research outputs found

Vozeamento sintético de voz disfónica através da síntese digital de estruturas harmónicas em tempo real [Synthetic voicing of dysphonic speech through real-time digital synthesis of harmonic structures]

Author: Nelio David de Freitas Gonçalves
Publication date: 17/07/2023

Enhancement of esophageal speech using voice conversion techniques

Authors: Ben Othmane Imen, Di Martino Joseph, Ouni Kaïs
Publication venue: HAL CCSD
Publication date: 05/12/2017

This paper presents a novel approach for enhancing esophageal speech using voice conversion techniques. Esophageal speech (ES) is an alternative voice that allows a patient with no vocal cords to produce sounds after total laryngectomy; this voice has poor intelligibility and poor quality. To address this issue, we propose a speaking-aid system that enhances ES in order to make it clearer and more natural. Given the specificity of ES, in this study we propose to apply a new voice conversion technique that takes into account the particularity of the pathological vocal apparatus. We trained deep neural networks (DNNs) and Gaussian mixture models (GMMs) to predict "laryngeal" vocal tract features from esophageal speech. The converted vectors are then used to estimate the excitation cepstral coefficients and phase by a search in the target training space previously encoded as a binary tree. The resynthesized voice sounds like a laryngeal voice, i.e., it is more natural than the original ES, with an effective reconstruction of the prosodic information while retaining, and this is the highlight of our study, the characteristics of the vocal tract inherent to the source speaker. The results of voice conversion, evaluated using objective and subjective experiments, validate the proposed approach.

INRIA a CCSD electronic archive server

Modelização de filtro de trato vocal para reconstrução de voz disfónica [Vocal tract filter modelling for dysphonic voice reconstruction]

Author: Marco António da Mota Oliveira
Publication date: 06/02/2020

Analysis and modelling of the spectral envelope for the nine oral vowels of standard European Portuguese in voiced and whispered speech modes. Development of compact, speaker-oriented models, in the spectral and cepstral domains, of the nine oral vowels of standard Portuguese. Development and evaluation of a prototype algorithm for whispered vowel identification, oriented towards real-time operation.