
    The operation of an Emergency Medical Department in a county hospital from a logistic point of view

    The purpose of this article is to give an overview of current emergency medical care through the example of one hospital in Hungary, highlighting possible shortcomings that could be improved through further structural development. To illustrate this, the article follows the journey of two hypothetical patients who present at the hospital's emergency department. The topic matters because opinion on emergency care remains divided. Emergency services of this kind already existed in the United States; only later did the single-entry service system begin to develop in Hungary. In some places this system has worked well for decades, but at the University of Szeged, for instance, owing to uncertainty about the system's merits, construction is only now being finalized, just as studies are being published that question the reason for the existence of emergency departments, at least in their current form.

    Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input

    Articulatory information has been shown to be effective in improving the performance of HMM-based and DNN-based text-to-speech synthesis. Speech synthesis research traditionally focuses on text-to-speech conversion, where the input is text or an estimated linguistic representation and the target is synthesized speech. However, a research field that has risen in the last decade is articulation-to-speech synthesis (with a target application of a Silent Speech Interface, SSI), where the goal is to synthesize speech from some representation of the movement of the articulatory organs. In this paper, we extend traditional (vocoder-based) DNN-TTS with articulatory input, estimated from ultrasound tongue images. We compare text-only, ultrasound-only, and combined inputs. Using data from eight speakers, we show that the combined text and articulatory input can have advantages in limited-data scenarios, namely, it may increase the naturalness of synthesized speech compared to text-only input. In addition, we analyze the ultrasound tongue recordings of several speakers and show that misalignments in the ultrasound transducer positioning can have a negative effect on the final synthesis performance. Comment: accepted at SSW11 (11th Speech Synthesis Workshop)

    Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces

    Articulatory-to-acoustic mapping seeks to reconstruct speech from a recording of the articulatory movements, for example, an ultrasound video. Just like speech signals, these recordings not only represent the linguistic content but are also highly specific to the actual speaker. Hence, due to the lack of multi-speaker data sets, researchers have so far concentrated on speaker-dependent modeling. Here, we present multi-speaker experiments using the recently published TaL80 corpus. To model speaker characteristics, we adapted the x-vector framework popular in speech processing to operate on ultrasound tongue videos. Next, we performed speaker recognition experiments using 50 speakers from the corpus. Then, we created speaker embedding vectors and evaluated them on the remaining speakers. Finally, we examined how the embedding vector influences the accuracy of our ultrasound-to-speech conversion network in a multi-speaker scenario. In the experiments we attained speaker recognition error rates below 3%, and we also found that the embedding vectors generalize well to unseen speakers. Our first attempt to apply them in a multi-speaker silent speech framework brought about a marginal reduction in the error rate of the spectral estimation step. Comment: 5 pages, 3 figures, 3 tables