    Bootstrapping Non-Parallel Voice Conversion From Speaker-Adaptive Text-to-Speech

    Voice conversion (VC) and text-to-speech (TTS) are two tasks that share a similar objective: generating speech with a target voice. However, they are usually developed independently under vastly different frameworks. In this paper, we propose a methodology to bootstrap a VC system from a pretrained speaker-adaptive TTS model and unify the techniques as well as the interpretations of these two tasks. Moreover, by offloading the heavy data demand to the training stage of the TTS model, our VC system can be built using a small amount of target speaker speech data. It also opens up the possibility of using speech in a foreign unseen language to build the system. Our subjective evaluations show that the proposed framework is able not only to achieve competitive performance in the standard intra-language scenario but also to adapt and convert using speech utterances in an unseen language. Comment: Accepted for IEEE ASRU 201

    Multimodal Speech Synthesis Architecture for Unsupervised Speaker Adaptation

    This paper proposes a new architecture for speaker adaptation of multi-speaker neural-network speech synthesis systems, in which an unseen speaker's voice can be built using a relatively small amount of speech data without transcriptions. This is sometimes called "unsupervised speaker adaptation". More specifically, we concatenate the layers to the audio inputs when performing unsupervised speaker adaptation, while we concatenate them to the text inputs when synthesizing speech from text. Two new training schemes for the new architecture are also proposed in this paper. These training schemes are not limited to speech synthesis; other applications are also suggested. Experimental results show that the proposed model not only enables adaptation to unseen speakers using untranscribed speech but also improves the performance of multi-speaker modeling and speaker adaptation using transcribed audio files. Comment: Accepted for Interspeech 2018, Indi

    Scaling and Bias Codes for Modeling Speaker-Adaptive DNN-based Speech Synthesis Systems

    Most neural-network-based speaker-adaptive acoustic models for speech synthesis can be categorized into either layer-based or input-code approaches. Although both approaches have their own pros and cons, most existing works on speaker adaptation focus on improving one or the other. In this paper, after first systematically reviewing the common principles of neural-network-based speaker-adaptive models, we show that these approaches can be represented in a unified framework and can be generalized further. More specifically, we introduce the use of scaling and bias codes as a generalized means of speaker-adaptive transformation. By utilizing these codes, we can create a more efficient factorized speaker-adaptive model and capture the advantages of both approaches while reducing their disadvantages. The experiments show that the proposed method can improve the performance of speaker adaptation compared with speaker adaptation based on the conventional input code. Comment: Accepted for 2018 IEEE Workshop on Spoken Language Technology (SLT), Athens, Greec
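
To make the idea of scaling and bias codes concrete, here is a minimal sketch of one speaker-adapted hidden layer. It is not taken from the paper: the function names, the `1 + ...` form of the scaling term, the `tanh` activation, and all shapes are assumptions chosen for illustration. The bias code shifts the pre-activation (as an input code would), and the scaling code rescales each hidden unit.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def adapted_layer(x, W, b, scale_proj, bias_proj, scale_code, bias_code):
    """One hidden layer with a hypothetical speaker-dependent scale and shift.

    Assumed form (illustrative only):
        h = tanh((1 + scale_proj @ scale_code) * (W @ x) + b + bias_proj @ bias_code)
    W, b, scale_proj, bias_proj are shared across speakers;
    scale_code and bias_code are the small per-speaker vectors.
    """
    z = matvec(W, x)                                            # shared linear transform
    scale = [1.0 + a for a in matvec(scale_proj, scale_code)]   # per-unit scaling
    shift = matvec(bias_proj, bias_code)                        # per-unit bias shift
    return [math.tanh(s * zi + bi + c)
            for s, zi, bi, c in zip(scale, z, b, shift)]


# Toy example: 2 inputs, 2 hidden units, 1-dimensional codes.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
scale_proj = [[1.0], [0.0]]   # scaling code only affects hidden unit 0
bias_proj = [[0.0], [1.0]]    # bias code only affects hidden unit 1
x = [0.5, -0.5]

neutral = adapted_layer(x, W, b, scale_proj, bias_proj, [0.0], [0.0])
scaled = adapted_layer(x, W, b, scale_proj, bias_proj, [1.0], [0.0])
shifted = adapted_layer(x, W, b, scale_proj, bias_proj, [0.0], [0.5])
```

With both codes set to zero the layer reduces to an ordinary speaker-independent layer; setting only the bias code gives behaviour comparable to an input-code model under these assumptions.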

    Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects

    We investigated the impact of noisy linguistic features on the performance of a Japanese speech synthesis system based on a neural network that uses a WaveNet vocoder. We compared an ideal system that uses manually corrected linguistic features, including phoneme and prosodic information, in the training and test sets against a few other systems that use corrupted linguistic features. Both subjective and objective results demonstrate that corrupted linguistic features, especially those in the test set, affected the ideal system's performance to a statistically significant degree due to a mismatch between the training and test sets. Interestingly, while an utterance-level Turing test showed that listeners had a difficult time differentiating synthetic speech from natural speech, it further indicated that adding noise to the linguistic features in the training set can partially reduce the effect of the mismatch, regularize the model, and help the system perform better when the linguistic features of the test set are noisy. Comment: Accepted for Interspeech 201

    Effects of plant essential oils and their constituents on Helicobacter pylori: A Review

    Essential oils (EOs) obtained from different medicinal and aromatic plant families by steam distillation have been used in the pharmaceutical, food, and fragrance industries. Plant EOs and their broad diversity of chemical components have attracted researchers worldwide due to their human health benefits and antibacterial properties, especially their use in treating Helicobacter pylori infection. Since H. pylori is known to be responsible for various gastric and duodenal diseases such as atrophic gastritis, peptic ulcer, gastric adenocarcinoma, and mucosa-associated lymphoid tissue lymphoma, several combination antibiotic therapies have been increasingly used to enhance the eradication rate of the bacterial infection. However, in recent decades, the efficacy of these therapies has decreased significantly due to the widespread emergence of multidrug-resistant strains of H. pylori. In addition, side effects from commonly used antibiotics and recurrence of the bacterial infection have raised public health concerns globally. Therefore, this review focuses on the in vitro effects of plant EOs and their bioactive constituents on the growth, cell morphology and integrity, biofilm formation, motility, adhesion, and urease activity of H. pylori. Their inhibitory effects on the expression of genes necessary for growth and virulence factor production of the bacterial pathogen are also discussed. Further in vivo and clinical evaluations are required before plant EOs and their bioactive constituents can be applied in pharmacy or as adjuvants to the current therapies for H. pylori infection.