2 research outputs found

    Expressive Speech Synthesis for Critical Situations

    The presence of appropriate acoustic cues of affective features in synthesized speech can be a prerequisite for the recipient's proper evaluation of a message's semantic content. In recent work, the authors have focused on expressive speech synthesis capable of generating natural-sounding synthetic speech at various levels of arousal. Automatic information and warning systems can be used to inform, warn, instruct and navigate people in dangerous, critical situations, and to increase the effectiveness of crisis management and rescue operations. One of the activities within the EU SF project CRISIS was called "Extremely expressive (hyper-expressive) speech synthesis for urgent warning messages generation". It was aimed at the research and development of speech synthesizers with high naturalness and intelligibility, capable of generating messages with various expressive loads. The synthesizers will be applicable to generating public alert and warning messages in case of fires, floods, state security threats, etc. Early warning in such situations is made possible by fire and flood spread forecasting, the modeling of which is covered by other activities of the CRISIS project. The most important component needed to build the synthesizer is the expressive speech database. An original method is proposed for creating such a database. The current version of the expressive speech database is introduced, and the first experiments with expressive synthesizers developed with this database are presented and discussed.

    Parametric synthesis of expressive speech

    In this thesis, methods for expressive speech synthesis using parametric approaches are presented. It is shown that deep neural networks achieve better results than synthesis based on hidden Markov models. Three new methods for the synthesis of expressive speech using deep neural networks are presented: style codes, model re-training, and a shared hidden layer architecture. It is shown that the best results are achieved with the style code method. A new method for emotion/style transplantation based on the shared hidden layer architecture is also proposed. The proposed method was rated better than the reference method from the literature.