24 research outputs found

    Retainer-Free Optopalatographic Device Design and Evaluation as a Feedback Tool in Post-Stroke Speech and Swallowing Therapy

    Stroke is one of the leading causes of long-term motor disability, including oro-facial impairments that affect speech and swallowing. Over the past decades, rehabilitation programs have evolved from relying mainly on compensatory measures to focusing on recovering lost function. In the continuing effort to improve recovery, the concept of biofeedback has increasingly been leveraged to enhance self-efficacy, motivation and engagement during training. Although speech and swallowing disturbances resulting from oro-facial impairments are frequent sequelae of stroke, efforts to develop sensing technologies that provide comprehensive and quantitative feedback on articulator kinematics and kinetics, especially those of the tongue, specifically for post-stroke speech and swallowing therapy, have been sparse. Such a sensing device needs to accurately capture intraoral tongue motion and tongue contact with the hard palate, which can then be translated into an appropriate form of feedback, without affecting tongue motion itself, while still being lightweight and portable. This dissertation proposes the use of an intraoral sensing principle known as optopalatography to provide such feedback, while also exploring the design of optopalatographic devices themselves for use in dysphagia and dysarthria therapy. Additionally, it presents an alternative means of holding the device in place inside the oral cavity with a newly developed palatal adhesive instead of relying on dental retainers, which previously limited device usage to a single person. For the application in dysphagia therapy, the evaluation was performed on the task of automatically classifying functional tongue exercises; for the application in dysarthria therapy, a phoneme recognition task was conducted.
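As a rough intuition for why an optical intraoral sensor can measure tongue distance at all, consider the textbook idealization of a coaxial emitter/photodiode pair facing a diffuse reflector: the illuminating irradiance falls off as 1/d² and the Lambertian back-reflection contributes another 1/d², so the return signal drops roughly as 1/d⁴. This is a generic proximity-sensing approximation, not the dissertation's actual analytical model, and the constant `k` below is a placeholder:

```python
def relative_return_signal(d_mm: float, k: float = 1.0) -> float:
    """Idealized return signal of a coaxial LED/photodiode pair facing a
    diffuse (Lambertian) reflector at distance d_mm.

    Irradiance at the target scales as 1/d^2, and the diffuse reflection
    seen back at the detector contributes another 1/d^2, giving an overall
    ~1/d^4 fall-off. k lumps emitter power, target reflectance and detector
    geometry into one constant. Illustrative only.
    """
    return k / d_mm ** 4
```

The steep fall-off is what makes such sensors sensitive at short range but quickly signal-starved at larger tongue-to-palate distances, which motivates careful optode and amplifier selection.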
Results on the palatal adhesive suggest that it is indeed a valid alternative to dental retainers when device residence time inside the oral cavity is limited to several tens of minutes per session, as is the case in dysphagia and dysarthria therapy. Functional tongue exercises were classified with approximately 61 % accuracy across subjects; in the phoneme recognition task, tense vowels had the highest recognition rate, followed by lax vowels and consonants. In summary, retainer-free optopalatography has the potential to become a viable method for providing real-time feedback on tongue movements inside the oral cavity, but it still requires further improvements, as outlined in the remarks on future development.

Table of contents:
1 Introduction
  1.1 Motivation
  1.2 Problem statement
  1.3 Goals and contributions
  1.4 Scope and limitations
2 Basics of post-stroke speech and swallowing therapy
  2.1 Dysarthria
  2.2 Dysphagia
  2.3 Treatment rationale and potential of biofeedback
  2.4 Summary and conclusion
3 Tongue motion sensing
  3.1 Contact-based methods (3.1.1 Electropalatography; 3.1.2 Manometry; 3.1.3 Capacitive)
  3.2 Non-contact based methods (3.2.1 Electromagnetic articulography; 3.2.2 Permanent magnetic articulography; 3.2.3 Optopalatography (related work))
  3.3 Electro-optical stomatography
  3.4 Extraoral sensing techniques
  3.5 Summary, comparison and conclusion
4 Fundamentals of optopalatography
  4.1 Important radiometric quantities (4.1.1 Solid angle; 4.1.2 Radiant flux and radiant intensity; 4.1.3 Irradiance; 4.1.4 Radiance)
  4.2 Sensing principle (4.2.1 Analytical models; 4.2.2 Monte Carlo ray tracing methods; 4.2.3 Data-driven models; 4.2.4 Model comparison)
  4.3 A priori device design considerations (4.3.1 Optoelectronic components; 4.3.2 Additional electrical components and requirements; 4.3.3 Intraoral sensor layout)
5 Intraoral device anchorage
  5.1 Introduction (5.1.1 Mucoadhesion; 5.1.2 Considerations for the palatal adhesive)
  5.2 Methods (5.2.1 Polymer selection; 5.2.2 Fabrication method; 5.2.3 Formulations; 5.2.4 PEO tablets; 5.2.5 Connection to the intraoral sensor's encapsulation; 5.2.6 Formulation evaluation)
  5.3 Results (5.3.1 Initial formulation evaluation; 5.3.2 Final OPG adhesive formulation)
  5.4 Discussion
6 Initial device design with application in dysphagia therapy
  6.1 Introduction
  6.2 Optode and optical sensor selection (6.2.1 Optode and optical sensor evaluation procedure; 6.2.2 Selected optical sensor characterization; 6.2.3 Mapping from counts to millimeters; 6.2.4 Results and discussion)
  6.3 Device design and hardware implementation (6.3.1 Block diagram; 6.3.2 Optode placement and circuit board dimensions; 6.3.3 Firmware description and measurement cycle; 6.3.4 Encapsulation; 6.3.5 Fully assembled OPG device)
  6.4 Evaluation on the gesture recognition task (6.4.1 Exercise selection, setup and recording; 6.4.2 Data corpus; 6.4.3 Sequence pre-processing; 6.4.4 Choice of classifier; 6.4.5 Training and evaluation; 6.4.6 Results)
  6.5 Discussion
7 Improved device design with application in dysarthria therapy
  7.1 Device design (7.1.1 Design considerations; 7.1.2 General system overview; 7.1.3 Intraoral sensor; 7.1.4 Receiver and controller; 7.1.5 Multiplexer)
  7.2 Hardware implementation (7.2.1 Optode placement and circuit board layout; 7.2.2 Encapsulation)
  7.3 Device characterization (7.3.1 Photodiode transient response; 7.3.2 Current source and rise time; 7.3.3 Multiplexer switching speed; 7.3.4 Measurement cycle and firmware implementation; 7.3.5 In vitro measurement accuracy; 7.3.6 Optode measurement stability)
  7.4 Evaluation on the phoneme recognition task (7.4.1 Corpus selection and recording setup; 7.4.2 Annotation and sensor data post-processing; 7.4.3 Mapping from counts to millimeters; 7.4.4 Classifier and feature selection; 7.4.5 Evaluation paradigms)
  7.5 Results (7.5.1 Tongue distance curve prediction; 7.5.2 Tongue contact patterns and contours; 7.5.3 Phoneme recognition)
  7.6 Discussion
8 Conclusion and future work
9 Appendix
  9.1 Analytical light transport models
  9.2 Meshed Monte Carlo method
  9.3 Laser safety
  9.4 Current source modulation voltage
  9.5 Transimpedance amplifier's frequency responses
  9.6 Initial OPG device's PCB layout and circuit diagrams
  9.7 Improved OPG device's PCB layout and circuit diagrams
  9.8 Test station layout drawing
Bibliography
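To make the gesture-recognition evaluation concrete, a classifier for functional tongue exercises must map a variable-length, multi-channel optode time series to one of a few exercise labels. The sketch below is a deliberately simplified stand-in (per-channel summary statistics plus a nearest-centroid rule); the dissertation's actual pre-processing and classifier choice are described in its chapter 6 and are not reproduced here:

```python
import numpy as np

def sequence_features(seq: np.ndarray) -> np.ndarray:
    """Summarize a (time x channels) optode sequence with simple
    per-channel statistics: mean, standard deviation and range.
    Purely illustrative feature set, not the dissertation's."""
    return np.concatenate([
        seq.mean(axis=0),
        seq.std(axis=0),
        seq.max(axis=0) - seq.min(axis=0),
    ])

class NearestCentroid:
    """Minimal nearest-centroid classifier over feature vectors."""

    def fit(self, X: np.ndarray, y: np.ndarray) -> "NearestCentroid":
        self.labels_ = np.unique(y)
        # One centroid per class: the mean feature vector of its examples.
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        # Euclidean distance of every sample to every class centroid.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[d.argmin(axis=1)]
```

With distinct enough exercises (e.g. a tongue sweep versus a sustained palate press), even such crude features separate the classes; the reported ~61 % cross-subject accuracy reflects the much harder realistic setting of inter-speaker variability.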

    A Silent-Speech Interface using Electro-Optical Stomatography

    Sprachtechnologie ist eine große und wachsende Industrie, die das Leben von technologieinteressierten Nutzern auf zahlreichen Wegen bereichert. Viele potenzielle Nutzer werden jedoch ausgeschlossen: Nämlich alle Sprecher, die nur schwer oder sogar gar nicht Sprache produzieren können. Silent-Speech Interfaces bieten einen Weg, mit Maschinen durch ein bequemes sprachgesteuertes Interface zu kommunizieren ohne dafür akustische Sprache zu benötigen. Sie können außerdem prinzipiell eine Ersatzstimme stellen, indem sie die intendierten Äußerungen, die der Nutzer nur still artikuliert, künstlich synthetisieren. Diese Dissertation stellt ein neues Silent-Speech Interface vor, das auf einem neu entwickelten Messsystem namens Elektro-Optischer Stomatografie und einem neuartigen parametrischen Vokaltraktmodell basiert, das die Echtzeitsynthese von Sprache basierend auf den gemessenen Daten ermöglicht. Mit der Hardware wurden Studien zur Einzelworterkennung durchgeführt, die den Stand der Technik in der intra- und inter-individuellen Genauigkeit erreichten und übertrafen. Darüber hinaus wurde eine Studie abgeschlossen, in der die Hardware zur Steuerung des Vokaltraktmodells in einer direkten Artikulation-zu-Sprache-Synthese verwendet wurde. Während die Verständlichkeit der Synthese von Vokalen sehr hoch eingeschätzt wurde, ist die Verständlichkeit von Konsonanten und kontinuierlicher Sprache sehr schlecht. Vielversprechende Möglichkeiten zur Verbesserung des Systems werden im Ausblick diskutiert.:Statement of authorship iii Abstract v List of Figures vii List of Tables xi Acronyms xiii 1. Introduction 1 1.1. The concept of a Silent-Speech Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2. Structure of this work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Fundamentals of phonetics 7 2.1. Components of the human speech production system . . . . . . . . . . . . . . . . . . . 7 2.2. Vowel sounds . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.3. Consonantal sounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.4. Acoustic properties of speech sounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.5. Coarticulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.6. Phonotactics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.7. Summary and implications for the design of a Silent-Speech Interface (SSI) . . . . . . . 21 3. Articulatory data acquisition techniques in Silent-Speech Interfaces 25 3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.2. Scope of the literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.3. Video Recordings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.4. Ultrasonography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 3.5. Electromyography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 3.6. Permanent-Magnetic Articulography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.7. Electromagnetic Articulography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 3.8. Radio waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.9. Palatography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 3.10.Conclusion and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4. Electro-Optical Stomatography 55 4.1. Contact sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 4.2. Optical distance sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.3. Lip sensor . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.4. Sensor Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.5. Control Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 4.6. Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 5. Articulation-to-Text 99 5.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 5.2. Command word recognition pilot study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 5.3. Command word recognition small-scale study . . . . . . . . . . . . . . . . . . . . . . . . 102 6. Articulation-to-Speech 109 6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 6.2. Articulatory synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 6.3. The six point vocal tract model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 6.4. Objective evaluation of the vocal tract model . . . . . . . . . . . . . . . . . . . . . . . . 116 6.5. Perceptual evaluation of the vocal tract model . . . . . . . . . . . . . . . . . . . . . . . . 120 6.6. Direct synthesis using EOS to control the vocal tract model . . . . . . . . . . . . . . . . 125 6.7. Pitch and voicing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 7. Summary and outlook 145 7.1. Summary of the contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 7.2. Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 A. Overview of the International Phonetic Alphabet 151 B. Mathematical proofs and derivations 153 B.1. Combinatoric calculations illustrating the reduction of possible syllables using phonotactics . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . 153 B.2. Signal Averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 B.3. Effect of the contact sensor area on the conductance . . . . . . . . . . . . . . . . . . . . 155 B.4. Calculation of the forward current for the OP280V diode . . . . . . . . . . . . . . . . . . 155 C. Schematics and layouts 157 C.1. Schematics of the control unit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 C.2. Layout of the control unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 C.3. Bill of materials of the control unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 C.4. Schematics of the sensor unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 C.5. Layout of the sensor unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 C.6. Bill of materials of the sensor unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 D. Sensor unit assembly 169 E. Firmware flow and data protocol 177 F. Palate file format 181 G. Supplemental material regarding the vocal tract model 183 H. Articulation-to-Speech: Optimal hyperparameters 189 Bibliography 191Speech technology is a major and growing industry that enriches the lives of technologically-minded people in a number of ways. Many potential users are, however, excluded: Namely, all speakers who cannot easily or even at all produce speech. Silent-Speech Interfaces offer a way to communicate with a machine by a convenient speech recognition interface without the need for acoustic speech. They also can potentially provide a full replacement voice by synthesizing the intended utterances that are only silently articulated by the user. To that end, the speech movements need to be captured and mapped to either text or acoustic speech. 
This dissertation proposes a new Silent-Speech Interface based on a newly developed measurement technology called Electro-Optical Stomatography and a novel parametric vocal tract model to facilitate real-time speech synthesis based on the measured data. The hardware was used to conduct command word recognition studies reaching state-of-the-art intra- and inter-individual performance. Furthermore, a study on using the hardware to control the vocal tract model in a direct articulation-to-speech synthesis loop was also completed. While the intelligibility of synthesized vowels was high, the intelligibility of consonants and connected speech was quite poor. Promising ways to improve the system are discussed in the outlook.

Contents:
Statement of authorship; Abstract; List of Figures; List of Tables; Acronyms
1. Introduction: 1.1. The concept of a Silent-Speech Interface; 1.2. Structure of this work
2. Fundamentals of phonetics: 2.1. Components of the human speech production system; 2.2. Vowel sounds; 2.3. Consonantal sounds; 2.4. Acoustic properties of speech sounds; 2.5. Coarticulation; 2.6. Phonotactics; 2.7. Summary and implications for the design of a Silent-Speech Interface (SSI)
3. Articulatory data acquisition techniques in Silent-Speech Interfaces: 3.1. Introduction; 3.2. Scope of the literature review; 3.3. Video Recordings; 3.4. Ultrasonography; 3.5. Electromyography; 3.6. Permanent-Magnetic Articulography; 3.7. Electromagnetic Articulography; 3.8. Radio waves; 3.9. Palatography; 3.10. Conclusion and Discussion
4. Electro-Optical Stomatography: 4.1. Contact sensors; 4.2. Optical distance sensors; 4.3. Lip sensor; 4.4. Sensor Unit; 4.5. Control Unit; 4.6. Software
5. Articulation-to-Text: 5.1. Introduction; 5.2. Command word recognition pilot study; 5.3. Command word recognition small-scale study
6. Articulation-to-Speech: 6.1. Introduction; 6.2. Articulatory synthesis; 6.3. The six point vocal tract model; 6.4. Objective evaluation of the vocal tract model; 6.5. Perceptual evaluation of the vocal tract model; 6.6. Direct synthesis using EOS to control the vocal tract model; 6.7. Pitch and voicing
7. Summary and outlook: 7.1. Summary of the contributions; 7.2. Outlook
Appendices: A. Overview of the International Phonetic Alphabet; B. Mathematical proofs and derivations (B.1. Combinatoric calculations illustrating the reduction of possible syllables using phonotactics; B.2. Signal Averaging; B.3. Effect of the contact sensor area on the conductance; B.4. Calculation of the forward current for the OP280V diode); C. Schematics and layouts (C.1. Schematics of the control unit; C.2. Layout of the control unit; C.3. Bill of materials of the control unit; C.4. Schematics of the sensor unit; C.5. Layout of the sensor unit; C.6. Bill of materials of the sensor unit); D. Sensor unit assembly; E. Firmware flow and data protocol; F. Palate file format; G. Supplemental material regarding the vocal tract model; H. Articulation-to-Speech: Optimal hyperparameters
Bibliography
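The command word recognition task mentioned in the abstract can be illustrated with a minimal sketch: a nearest-neighbour classifier that compares an incoming sequence of sensor frames against per-word templates using dynamic time warping. This is a generic illustration, not the dissertation's actual recognizer; the frame format, distance measure, and template scheme are assumptions.

```python
# Illustrative sketch only: template matching over sensor frame sequences
# with dynamic time warping (DTW). Frames are tuples of sensor readings.

def frame_distance(a, b):
    """Euclidean distance between two sensor frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def dtw(seq_a, seq_b):
    """Cumulative DTW alignment cost between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def classify(utterance, templates):
    """Return the label of the template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw(utterance, templates[label]))
```

DTW absorbs the speaking-rate variation between repetitions of the same command word, which is why it is a common baseline before moving to statistical or neural sequence models.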

    Silent Speech Interfaces for Speech Restoration: A Review

    Get PDF
    This work was supported in part by the Agencia Estatal de Investigacion (AEI) under Grant PID2019-108040RB-C22/AEI/10.13039/501100011033. The work of Jose A. Gonzalez-Lopez was supported in part by the Spanish Ministry of Science, Innovation and Universities under a Juan de la Cierva-Incorporation Fellowship (IJCI-2017-32926). This review summarises the status of silent speech interface (SSI) research. SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable. In this review, we focus on the first case and present the latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders. SSIs can employ a variety of biosignals to enable silent communication, such as electrophysiological recordings of neural activity, electromyographic (EMG) recordings of vocal tract movements, or the direct tracking of articulator movements using imaging techniques. Depending on the disorder, some sensing techniques may be better suited than others to capture speech-related information. For instance, EMG and imaging techniques are well suited for laryngectomised patients, whose vocal tract remains largely intact even though they are unable to speak after the removal of the vocal folds, but these techniques fail for severely paralysed individuals. From the biosignals, SSIs decode the intended message using automatic speech recognition or speech synthesis algorithms. Despite considerable advances in recent years, most present-day SSIs have only been validated in laboratory settings with healthy users. Thus, as discussed in this paper, a number of challenges remain to be addressed in future research before SSIs can be promoted to real-world applications.
If these issues can be addressed successfully, future SSIs will improve the lives of persons with severe speech impairments by restoring their communication capabilities.

    Multimodal silent speech interfaces for European Portuguese based on articulation (Interfaces de fala silenciosa multimodais para português europeu com base na articulação)

    Get PDF
    Joint MAPi Doctorate in Informatics. The concept of silent speech, when applied to Human-Computer Interaction (HCI), describes a system which allows for speech communication in the absence of an acoustic signal. By analyzing data gathered during different parts of the human speech production process, Silent Speech Interfaces (SSI) allow users with speech impairments to communicate with a system. SSI can also be used in the presence of environmental noise, and in situations in which privacy, confidentiality, or non-disturbance are important. Nonetheless, despite recent advances, the performance and usability of silent speech systems still have much room for improvement. Better performance of such systems would enable their application in relevant areas, such as Ambient Assisted Living. It is therefore necessary to extend our understanding of the capabilities and limitations of silent speech modalities and to enhance their joint exploration. Thus, in this thesis, we have established several goals: (1) SSI language expansion to support European Portuguese (EP); (2) overcoming identified limitations of current SSI techniques in detecting EP nasality; (3) developing a multimodal HCI approach for SSI based on non-invasive modalities; and (4) exploring more direct measures in the multimodal SSI for EP, acquired from more invasive/obtrusive modalities, to be used as ground truth for articulation processes, enhancing our comprehension of other modalities. In order to achieve these goals and to support our research in this area, we have created a multimodal SSI framework that fosters leveraging modalities and combining information, supporting research in multimodal SSI. The proposed framework goes beyond the data acquisition process itself, including methods for online and offline synchronization, multimodal data processing, feature extraction, feature selection, analysis, classification and prototyping. Examples of applicability are provided for each stage of the framework.
These include articulatory studies for HCI, the development of a multimodal SSI based on less invasive modalities, and the use of ground truth information coming from more invasive/obtrusive modalities to overcome the limitations of other modalities. In the work presented here, we also apply existing methods in the area of SSI to EP for the first time, finding that nasal sounds may degrade performance in some modalities. In this context, we propose a non-invasive solution for the detection of nasality based on a single surface electromyography sensor, capable of being included in a multimodal SSI.

    Measuring Pre-Speech Articulation

    Get PDF
    Abstract: What do speakers do when they start to talk? This thesis concentrates on the articulatory aspects of this problem, and offers methodological and experimental results on tongue movement, captured using Ultrasound Tongue Imaging (UTI). Speech initiation occurs at the start of every utterance. An understanding of the timing relationship between articulatory initiation (which occurs first) and acoustic initiation (that is, the start of audible speech) has implications for speech production theories, the methodological design and interpretation of speech production experiments, and clinical studies of speech production. Two novel automated techniques for detecting articulatory onsets in UTI data were developed based on Euclidean distance. The methods are verified against manually annotated data. The latter technique is based on a novel way of identifying the region of the tongue that is first to initiate movement. Data from three speech production experiments are analysed in this thesis. The first experiment is picture naming recorded with UTI and is used to explore behavioural variation at the beginning of an utterance, and to test and develop analysis tools for articulatory data. The second experiment also uses UTI recordings, but it is specifically designed to exclude any pre-speech movements of the articulators which are not directly related to the linguistic content of the utterance itself (that is, which are not expected to be present in every full repetition of the utterance), in order to study undisturbed speech initiation. The materials systematically varied the phonetic onsets of the monosyllabic target words, and the vowel nucleus. They also provided an acoustic measure of the duration of the syllable rhyme. Statistical models analysed the timing relationships of articulatory onset, and acoustic durations of the sound segments, and the acoustic duration of the rhyme. 
Finally, to test a discrepancy between the results of the second UTI experiment and findings in the literature based on data recorded with Electromagnetic Articulography (EMA), a third experiment measured a single speaker using both methods and matched materials. Using the global Pixel Difference and Scanline-based Pixel Difference analysis methods developed and verified in the first half of the thesis, the main experimental findings were as follows. First, pre-utterance silent articulation is timed in inverse correlation with the acoustic duration of the onset consonant and in positive correlation with the acoustic rhyme of the first word. Because of the latter correlation, it should be considered part of the first word. Second, the comparison of UTI and EMA failed to replicate the discrepancy. Instead, EMA was found to produce longer reaction times independent of utterance type. Keywords: Speech initiation, pre-speech articulation, delayed naming, ultrasound tongue imaging, electromagnetic articulography, automated methods
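The pixel-difference idea behind the onset detectors can be sketched in a few lines: the Euclidean distance between consecutive ultrasound frames stays near a noise floor while the tongue is still, then rises when movement begins. The baseline window and threshold rule below are illustrative assumptions, not the thesis's verified procedure.

```python
# Illustrative sketch only: detect articulatory onset as the first frame-to-frame
# Euclidean distance that exceeds a baseline-derived threshold.

def pixel_difference(frame_a, frame_b):
    """Euclidean distance between two flattened grey-scale frames."""
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) ** 0.5

def detect_onset(frames, baseline_frames=5, k=3.0):
    """Index of the first frame whose difference from its predecessor exceeds
    the baseline mean by k baseline standard deviations, or None."""
    diffs = [pixel_difference(frames[i - 1], frames[i]) for i in range(1, len(frames))]
    base = diffs[:baseline_frames]                     # assumed-still frames
    mean = sum(base) / len(base)
    var = sum((d - mean) ** 2 for d in base) / len(base)
    threshold = mean + k * var ** 0.5
    for i, d in enumerate(diffs[baseline_frames:], start=baseline_frames + 1):
        if d > threshold:
            return i  # first frame index showing movement
    return None
```

A scanline-based variant would apply the same rule per image region to localize which part of the tongue initiates movement first, as the thesis does for its second technique.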

    Observations on the dynamic control of an articulatory synthesizer using speech production data

    Get PDF
    This dissertation explores the automatic generation of gestural-score-based control structures for a three-dimensional articulatory speech synthesizer. The gestural scores are optimized in an articulatory resynthesis paradigm using a dynamic programming algorithm and a cost function which measures the deviation from a gold standard in the form of natural speech production data. These data had been recorded using electromagnetic articulography, from the same speaker to whom the synthesizer's vocal tract model had previously been adapted. Future work to create an English voice for the synthesizer and integrate it into a text-to-speech platform is outlined.
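The resynthesis paradigm described above can be reduced to a toy example: choose one articulatory target per time step from a small candidate set so that the summed deviation from a gold-standard trajectory, plus a smoothness penalty between successive targets, is minimal. The one-dimensional targets, absolute-deviation cost, and smoothness weight below are illustrative assumptions, not the synthesizer's actual gestural control space.

```python
# Illustrative sketch only: Viterbi-style dynamic programming that fits a
# sequence of discrete targets to a gold-standard trajectory.

def optimize_targets(gold, candidates, smooth_weight=1.0):
    """Pick one candidate per step minimizing deviation + smoothness cost."""
    n = len(gold)
    # best[t][c]: minimal cost of any target sequence up to step t ending in c
    best = [[0.0] * len(candidates) for _ in range(n)]
    back = [[0] * len(candidates) for _ in range(n)]
    for c, v in enumerate(candidates):
        best[0][c] = abs(v - gold[0])
    for t in range(1, n):
        for c, v in enumerate(candidates):
            # cheapest predecessor, counting the penalty for changing targets
            p = min(range(len(candidates)),
                    key=lambda q: best[t - 1][q] + smooth_weight * abs(v - candidates[q]))
            best[t][c] = (best[t - 1][p] + smooth_weight * abs(v - candidates[p])
                          + abs(v - gold[t]))
            back[t][c] = p
    # backtrack from the cheapest final state
    c = min(range(len(candidates)), key=lambda q: best[n - 1][q])
    path = [c]
    for t in range(n - 1, 0, -1):
        c = back[t][c]
        path.append(c)
    path.reverse()
    return [candidates[i] for i in path]
```

The real system optimizes multi-dimensional gestural scores against EMA trajectories, but the trade-off is the same: track the gold standard closely without producing implausibly abrupt articulator movements.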

    Temporal integration of loudness as a function of level

    Get PDF