13,951 research outputs found

    Voice Conversion

    On transforming spectral peaks in voice conversion

    This paper explores the benefits of transforming spectral peaks in voice conversion. First, in examining classic GMM-based transformation with cepstral coefficients, we show that the lack of transformed data variance ("over-smoothing") can be related to the choice of spectral parameterization. Consequently, we propose an alternative parameterization using spectral peaks. The peaks are transformed using HMMs with Gaussian state distributions. Two learning variants and a post-processing step that treats peak evolution in time are also examined. In comparing the different transformation approaches, spectral peaks are shown to offer higher inter-speaker feature correlation and yield higher transformed data variance than their cepstral coefficient counterparts.

    An introduction to statistical parametric speech synthesis

    On the use of spectral peak parameters in voice conversion

    This paper addresses the problem of low transformed data variance, or "over-smoothing," in spectral transformation for voice conversion. In examining a classic GMM-based transformation with cepstral coefficients, we show that this problem lies not only in the transformation model (as commonly assumed) but also in the choice of spectral parameterization. Consequently, we propose an alternative method for spectral transformation using spectral peaks and an HMM with Gaussian state distributions. The spectral peaks are shown to offer higher inter-speaker feature correlation and yield higher transformed data variance than their cepstral coefficient counterparts. Additionally, the accuracy of the transformed envelopes is examined.
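The link between inter-speaker feature correlation and transformed data variance mentioned in the abstract above can be illustrated with a toy model. The following is a minimal sketch, not the paper's method: it assumes a single joint Gaussian with zero means and unit variances (the one-component special case of a GMM-based mapping), for which the minimum mean-square-error conversion is `y_hat = rho * x` and its variance is `rho**2` times the target variance. The function name `simulate_oversmoothing` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_oversmoothing(rho, n=20000, seed=0):
    """Return (target variance, converted variance) for a toy conversion.

    Assumption: source x and target y are zero-mean, unit-variance,
    jointly Gaussian with correlation rho. The MMSE mapping is then
    y_hat = rho * x, the single-Gaussian case of a GMM-based mapping.
    """
    rng = random.Random(seed)
    targets = []
    converted = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Build a target feature with correlation rho to the source feature.
        y = rho * x + (1.0 - rho ** 2) ** 0.5 * noise
        targets.append(y)
        # Conditional-mean conversion of the source feature.
        converted.append(rho * x)
    return statistics.pvariance(targets), statistics.pvariance(converted)


if __name__ == "__main__":
    for rho in (0.3, 0.6, 0.9):
        target_var, converted_var = simulate_oversmoothing(rho)
        print(f"rho={rho}: target variance {target_var:.3f}, "
              f"converted variance {converted_var:.3f}")
```

Under these assumptions, the converted variance shrinks toward zero as the correlation drops, which is one way to read the abstract's claim that a parameterization with higher inter-speaker correlation yields higher transformed data variance.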

    Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning

    Typically, voice conversion is regarded as an engineering problem with limited training data. The reliance on massive amounts of data hinders the practical applicability of deep learning approaches, which have been extensively researched in recent years. On the other hand, statistical methods are effective with limited data but have difficulties in modelling complex mapping functions. This paper proposes a voice conversion method that works with limited data and is based on stochastic variational deep kernel learning (SVDKL). SVDKL combines the expressive capability of deep neural networks with the high flexibility of the Gaussian process as a Bayesian and non-parametric method. When the conventional kernel is combined with the deep neural network, it is possible to estimate non-smooth and more complex functions. Furthermore, the model's sparse variational Gaussian process solves the scalability problem and, unlike the exact Gaussian process, allows for the learning of a global mapping function for the entire acoustic space. One of the most important aspects of the proposed scheme is that the model parameters are trained using marginal likelihood optimization, which considers both data fitting and model complexity. Accounting for model complexity increases resistance to overfitting and thereby reduces the amount of training data required. To evaluate the proposed scheme, we examined the model's performance with approximately 80 seconds of training data. The results indicated that our method obtained a higher mean opinion score, smaller spectral distortion, and better preference test results than the compared methods.

    Advanced Algorithms for Satellite Communication Signal Processing

    The dissertation focuses on software-defined receivers intended for narrowband satellite communication. Satellite communication channels, including deep space links, suffer from high noise levels, typically modeled as AWGN, and from a strong Doppler shift of the signal caused by the exceptional speed of the moving object. The dissertation presents possible approaches to computationally efficient digital downconversion of narrowband signals and to carrier frequency estimation for narrowband signals distorted by a Doppler shift on the order of multiples of the signal bandwidth. The description of the proposed algorithms includes the analytical derivation of each algorithm and, where possible, an analytical assessment of its behavior. The algorithms are modeled in MATLAB Simulink, and the models are used to verify their properties by simulation. The models were also used for experimental tests on a real signal received from the PSAT satellite at the laboratory of experimental satellites at the department of radio electronics.
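As a rough illustration of carrier frequency estimation for a Doppler-shifted narrowband signal in AWGN, the sketch below searches for the strongest bin of a discrete Fourier transform. This is a generic textbook approach and not the dissertation's algorithm; in particular, the brute-force transform here is not computationally efficient. The function names `noisy_tone` and `estimate_carrier_offset`, the sample rate, and the tone frequencies are all hypothetical.

```python
import cmath
import math
import random


def noisy_tone(freq, sample_rate, n, noise_std, seed=0):
    """Complex exponential at `freq` Hz plus complex white Gaussian noise."""
    rng = random.Random(seed)
    return [
        cmath.exp(2j * math.pi * freq * t / sample_rate)
        + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        for t in range(n)
    ]


def estimate_carrier_offset(samples, sample_rate):
    """Return the frequency in Hz of the strongest DFT bin.

    Brute-force O(n^2) DFT, kept simple for illustration; the
    resolution is limited to one bin, i.e. sample_rate / n.
    """
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(n):
        acc = 0j
        for t, s in enumerate(samples):
            acc += s * cmath.exp(-2j * math.pi * k * t / n)
        power = abs(acc) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    # Bins in the upper half of the spectrum represent negative frequencies.
    if best_bin >= n // 2:
        best_bin -= n
    return best_bin * sample_rate / n


if __name__ == "__main__":
    signal = noisy_tone(freq=1000.0, sample_rate=8000.0, n=256, noise_std=1.0)
    print(estimate_carrier_offset(signal, sample_rate=8000.0))
```

A practical receiver would typically replace the inner loops with an FFT and refine the estimate beyond one-bin resolution; both refinements are outside the scope of this sketch.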