11 research outputs found

    Model architectures to extrapolate emotional expressions in DNN-based text-to-speech

    This paper proposes architectures that facilitate the extrapolation of emotional expressions in deep neural network (DNN)-based text-to-speech (TTS). Here, to “extrapolate emotional expressions” means to borrow emotional expressions from other speakers, so that emotional speech uttered by the target speakers need not be collected. Although DNNs have the potential to support TTS with emotional expressions, and some DNN-based TTS systems have performed satisfactorily in expressing the diversity of human speech, collecting emotional speech from target speakers is otherwise necessary and troublesome. To solve this issue, we propose architectures that train the speaker feature and the emotional feature separately and synthesize speech with any combination of speaker and emotion. The architectures are the parallel model (PM), serial model (SM), auxiliary input model (AIM), and hybrid models (PM&AIM and SM&AIM). These models are trained on emotional speech uttered by a few speakers and neutral speech uttered by many speakers. Objective evaluations in the open-emotion test provide insufficient information: they are compared with those in the closed-emotion test, but each speaker has their own manner of expressing emotion. However, subjective evaluation results indicate that the proposed models could convey emotional information to some extent. Notably, the PM can correctly convey sad and joyful emotions at a rate of >60%.

    Parametric synthesis of expressive speech

    In this thesis, methods for expressive speech synthesis using parametric approaches are presented. It is shown that deep neural networks achieve better results than synthesis based on hidden Markov models. Three new methods for the synthesis of expressive speech using deep neural networks are proposed: style codes, model re-training, and a shared hidden layer architecture. The best results are achieved with the style code method. A new method for emotion/style transplantation based on the shared hidden layer architecture is also proposed, and it is shown to outperform the reference method from the literature.

    Expressive Multilingual Speech Synthesizer

    The aim of this thesis is to investigate the possibility of synthesizing speech in the voice of a speaker in a language that the speaker has never spoken. Multilingual models are created, both for languages whose speech databases are annotated using the same conventions and for languages whose databases are annotated using different conventions, which includes Serbian. In terms of the quality of synthesized speech, some models even surpass standard models trained on speech material in a single language. Besides an architecture for multilingual models, a method for adapting such a model to the data of a new speaker is proposed. The proposed adaptation method enables fast and simple production of new voices, while preserving the ability to synthesize speech in any language supported by the model, regardless of the new speaker’s original language.

    Smoking and Second Hand Smoking in Adolescents with Chronic Kidney Disease: A Report from the Chronic Kidney Disease in Children (CKiD) Cohort Study

    The goal of this study was to determine the prevalence of smoking and second hand smoking (SHS) in adolescents with CKD and their relationship to baseline parameters at enrollment in CKiD, an observational cohort study of 600 children (aged 1-16 yrs) with Schwartz-estimated GFR of 30-90 ml/min/1.73 m². A total of 239 adolescents had self-report survey data on smoking and SHS exposure: 21 (9%) subjects had “ever” smoked a cigarette. Among them, 4 were current and 17 were former smokers. Hypertension was more prevalent in those who had “ever” smoked a cigarette (42%) than in non-smokers (9%), p<0.01. Among 218 non-smokers, 130 (59%) were male and 142 (65%) were Caucasian; 60 (28%) reported SHS exposure compared to 158 (72%) with no exposure. Non-smoker adolescents with SHS exposure were compared to those without SHS exposure. There were no racial, age, or gender differences between the two groups. Baseline creatinine, diastolic hypertension, C-reactive protein, lipid profile, GFR, and hemoglobin were not statistically different. A significantly higher protein to creatinine ratio (0.90 vs. 0.53, p<0.01) was observed in those exposed to SHS compared to those not exposed. Exposed adolescents were heavier than non-exposed adolescents (85th percentile vs. 55th percentile for BMI, p<0.01). Uncontrolled casual systolic hypertension was twice as prevalent among those exposed to SHS (16%) as among those not exposed (7%), though the difference was not statistically significant (p=0.07). Adjusted multivariate regression analysis [OR (95% CI)] showed that increased protein to creatinine ratio [1.34 (1.03, 1.75)] and higher BMI [1.14 (1.02, 1.29)] were independently associated with exposure to SHS among non-smoker adolescents. These results reveal that among adolescents with CKD, cigarette use is low and SHS is highly prevalent. The association of smoking with hypertension and of SHS with increased proteinuria suggests a possible role of these factors in CKD progression and cardiovascular outcomes.

    Annual Reports of the Department of the Interior for the fiscal year ended June 30, 1897; Annual Report of the Commissioner of Education, 1897.

    Annual Report of the Sec. of Interior. 16 Nov. HD 5, 55-2, v12-22, 8978p. [3640-3650] Indian affairs; annual report of the Gen. Land Office (Serial 3640); annual report of the CIA (Serial 3641); etc

    Report of the Secretary of the Interior; being part of the message and documents communicated to the two Houses of Congress at the beginning of the second session of the Fifty-fourth Congress; Annual Report of the Commissioner of Education, 1896.

    Annual Report of the Sec. of Interior. 3 Dec. HD 5, 54-2, v12-19, 7567p. [3488-3495] Indian Affairs; annual report of the Gen. Land Office (Serial 3488); annual report of the CIA (Serial 3489); etc