11 research outputs found

Diphthong Synthesis using the Three-Dimensional Dynamic Digital Waveguide Mesh

Author: Gully Amelia J
Publication venue: University of York
Publication date: 01/09/2017

The human voice is a complex and nuanced instrument, and despite many years of research, no system is yet capable of producing natural-sounding synthetic speech. This affects intelligibility for some groups of listeners, in applications such as automated announcements and screen readers. Furthermore, those who require a computer to speak - due to surgery or a degenerative disease - are limited to unnatural-sounding voices that lack expressive control and may not match the user's gender, age or accent. It is evident that natural, personalised and controllable synthetic speech systems are required. A three-dimensional digital waveguide model of the vocal tract, based on magnetic resonance imaging data, is proposed here in order to address these issues. The model uses a heterogeneous digital waveguide mesh method to represent the vocal tract airway and surrounding tissues, facilitating dynamic movement and hence speech output. The accuracy of the method is validated by comparison with audio recordings of natural speech, and perceptual tests are performed which confirm that the proposed model sounds significantly more natural than simpler digital waveguide mesh vocal tract models. Control of such a model is also considered, and a proof-of-concept study is presented using a deep neural network to control the parameters of a two-dimensional vocal tract model, resulting in intelligible speech output and paving the way for extension of the control system to the proposed three-dimensional vocal tract model. Future improvements to the system are also discussed in detail. This project considers both the naturalness and control issues associated with synthetic speech and therefore represents a significant step towards improved synthetic speech for use across society

White Rose E-theses Online

Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh

Author: Daffern Helena
Gully Amelia Jane
Murphy Damian Thomas
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/02/2018

Articulatory speech synthesis has the potential to offer more natural sounding synthetic speech than established concatenative or parametric synthesis methods. Time-domain acoustic models are particularly suited to the dynamic nature of the speech signal, and recent work has demonstrated the potential of dynamic vocal tract models that accurately reproduce the vocal tract geometry. This paper presents a dynamic 3D digital waveguide mesh (DWM) vocal tract model, capable of movement to produce diphthongs. The technique is compared to existing dynamic 2D and static 3D DWM models, for both monophthongs and diphthongs. The results indicate that the proposed model provides improved formant accuracy over existing DWM vocal tract models. Furthermore, the computational requirements of the proposed method are significantly lower than those of comparable dynamic simulation techniques. This paper represents another step toward a fully functional articulatory vocal tract model which will lead to more natural speech synthesis systems for use across society

Crossref

White Rose Research Online

Silent speech: restoring the power of speech to people whose larynx has been removed

Author: Gilbert James M.
González López José Andrés
Green Phil D.
Gully Amelia
Murphy Damian
Publication date: 29/10/2018

Every year, some 17,500 people in Europe and North America lose the power of speech after undergoing a laryngectomy, normally as a treatment for throat cancer. Several research groups have recently demonstrated that it is possible to restore speech to these people by using machine learning to learn the transformation from articulator movement to sound. In our project articulator movement is captured by a technique developed by our collaborators at Hull University called Permanent Magnet Articulography (PMA), which senses the changes of magnetic field caused by movements of small magnets attached to the lips and tongue. This solution, however, requires synchronous PMA-and-audio recordings for learning the transformation and, hence, it cannot be applied to people who have already lost their voice. Here we propose to investigate a variant of this technique in which the PMA data are used to drive an articulatory synthesiser, which generates speech acoustics by simulating the airflow through a computational model of the vocal tract. The project goals, participants, current status, and achievements of the project are discussed below. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

Crossref

Repositorio Institucional Universidad de Málaga

Modeling Voiced Stop Consonants Using the 3D Dynamic Digital Waveguide Mesh Vocal Tract Model

Author: Gully Amelia Jane
Tucker Benjamin
Publication venue: Australasian Speech Science and Technology Association Inc.
Publication date: 06/08/2019

White Rose Research Online

Multidimensional signals and analytic flexibility: Estimating degrees of freedom in human speech analyses

Author: Coretta Stefano
Gully Amelia
Hughes Vincent
Kettig Thomas

White Rose Research Online

Effects of formant settings and channel mismatch on semi-automatic systems in forensic voice comparison

Author: Foulkes Paul
French John Peter
Gully Amelia Jane
Harrison Philip Thomas
Hughes Vincent
Publication date: 01/01/2020

White Rose Research Online

Articulatory Text-to-Speech Synthesis Using the Digital Waveguide Mesh Driven by a Deep Neural Network

Author: Gully Amelia Jane
Hashimoto Kei
Murphy Damian Thomas
Nankaku Yoshihiko
Tokuda Keiichi
Yoshimura Takenori
Publication venue: International Speech Communication Association
Publication date: 01/01/2017

Following recent advances in direct modeling of the speech waveform using a deep neural network, we propose a novel method that directly estimates a physical model of the vocal tract from the speech waveform, rather than magnetic resonance imaging data. This provides a clear relationship between the model and the size and shape of the vocal tract, offering considerable flexibility in terms of speech characteristics such as age and gender. Initial tests indicate that despite a highly simplified physical model, intelligible synthesized speech is obtained. This illustrates the potential of the combined technique for the control of physical models in general, and hence the generation of more natural-sounding synthetic speech

Crossref

White Rose Research Online

Forensic voice comparison using long-term acoustic measures of voice quality

Author: Cardoso Amanda
Foulkes Paul
French John Peter
Gully Amelia Jane
Harrison Philip Thomas
Hughes Vincent

White Rose Research Online

Remdesivir Inhibits SARS-CoV-2 in Human Lung Cells and Chimeric SARS-CoV Expressing the SARS-CoV-2 RNA Polymerase in Mice

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the novel viral disease COVID-19. With no approved therapies, this pandemic illustrates the urgent need for broad-spectrum antiviral countermeasures against SARS-CoV-2 and future emerging CoVs. We report that remdesivir (RDV) potently inhibits SARS-CoV-2 replication in human lung cells and primary human airway epithelial cultures (EC50 = 0.01 μM). Weaker activity is observed in Vero E6 cells (EC50 = 1.65 μM) because of their low capacity to metabolize RDV. To rapidly evaluate in vivo efficacy, we engineered a chimeric SARS-CoV encoding the viral target of RDV, the RNA-dependent RNA polymerase of SARS-CoV-2. In mice infected with the chimeric virus, therapeutic RDV administration diminishes lung viral load and improves pulmonary function compared with vehicle-treated animals. These data demonstrate that RDV is potently active against SARS-CoV-2 in vitro and in vivo, supporting its further clinical testing for treatment of COVID-19

Carolina Digital Repository

Multidimensional signals and analytic flexibility: Estimating degrees of freedom in human speech analyses

Author: Ahn Byron
Al-Hoorie Ali H.
Al-Tamimi Jalal
Alotaibi Najd E.
AlShakhori Mohammed K.
Altmiller Ruth M.
Arantes Pablo
Athanasopoulou Angeliki
Baese-Berk Melissa M.
Bailey George
Baira A Sangma Cheman
Beier Eleonora J.
Benavides Gabriela M.
Benker Nicole
BensonMeyer Emelia P.
Benway Nina R.
Berry Grant M.
Bing Liwen
Bjorndahl Christina
Bolyanatz Mariska
Braver Aaron
Brown Alicia M.
Brown Violet A.
Brugos Alejna
Buchanan Erin M.
Butlin Tanna
Buxo-Lugo Andres
Caillol Coline
Cangemi Francesco
Carignan Christopher
Carraturo Sita
Casillas Joseph V
Caudrelier Tiphaine
Chodroff Eleanor
Cohn Michelle
Coretta Stefano
Cronenberg Johanna
Crouzet Olivier
Dagar Erica L.
Dawson Charlotte
Diantoro Carissa A.
Dokovova Marie
Drake Shiloh
Du Fengting
Dubuis Margaux
Dueme Florent
Durward Matthew
Egurtzegi Ander
Elsherif Mahmoud M.
Esser Janina
Ferragne Emmanuel
Ferreira Fernanda
Fink Lauren K.
Finley Sara
Foster Kurtis
Foulkes Paul
Franke Michael
Franzke Rosa
Frazer-McKee Gabriel
Fromont Robert
Garcia Christina
Geller Jason
Grasso Camille L.
Greca Pia
Grice Martine
Grose-Hodge Magdalena S.
Gully Amelia J.
Halfacre Caitlin
Hauser Ivy
Hay Jen
Haywood Robert
Hellmuth Sam
Hilger Allison I.
Holliday Nicole
Hoogland Damar
Huang Yaqian
Hughes Vincent
Icardo Isasa Ane
Ilchovska Zlatomira G.
Jeon Hae-Sung
Jones Jacq
Junges Magat N.
Kaefer Stephanie
Kaland Constantijn
Kelley Matthew C.
Kelly Niamh E.
Kettig Thomas
Khattab Ghada
Koolen Ruud
Krahmer Emiel
Krajewska Dorota
Krug Andreas
Kumar Abhilasha A.
Lander Anna
Lentz Tomas O.
Li Wanyin
Li Yanyu
Lialiou Maria
Lima Jr. Ronaldo M.
Lo Justin J. H.
Lopez Otero Julio Cesar
Mackay Bradley
MacLeod Bethany
Mallard Mel
McConnellogue Carol-Ann Mary
Moroz George
Murali Mridhula
Nalborczyk Ladislas
Nenadic Filip
Nieder Jessica
Nikolic Dusan
Nogueira Francisco G. S.
Offerman Heather M.
Passoni Elisa
Pelissier Maud
Perry Scott J.
Pfiffner Alexandra M.
Proctor Michael
Rhodes Ryan
Rodriguez Nicole
Roepke Elizabeth
Roer Jan P.
Roessig Simon
Roettger Timo B.
Sbacco Lucia
Scarborough Rebecca
Schaeffler Felix
Schleef Erik
Schmitz Dominic
Shiryaev Alexander
Soskuthy Marton
Spaniol Malin
Stanley Joseph
Strickler Alyssa
Tavano Alessandro
Tomaschek Fabian
Tucker Benjamin V.
Turnbull Rory
Ugwuanyi Kingsley O.
Urrestarazu-Porta Inigo
van de Vijver Ruben
Van Engen Kristin J.
van Miltenburg Emiel
Wang Bruce
Warner Natasha
Wehrle Simon
Westerbeek Hans
Wiener Seth
Winters Stephen
Wong Sidney G.-J.
Wood Anna
Wottawa Jane
Xu Chenzi
Zarate-Sanchez German
Zellou Georgia
Zhang Cong
Zhu Jian
Publication venue: Center for Open Science
Publication date: 01/01/2023

Recent empirical studies have highlighted the large degree of analytic flexibility in data analysis which can lead to substantially different conclusions based on the same data set. Thus, researchers have expressed their concerns that these researcher degrees of freedom might facilitate bias and can lead to claims that do not stand the test of time. Even greater flexibility is to be expected in fields in which the primary data lend themselves to a variety of possible operationalizations. The multidimensional, temporally extended nature of speech constitutes an ideal testing ground for assessing the variability in analytic approaches, which derives not only from aspects of statistical modeling, but also from decisions regarding the quantification of the measured behavior. In the present study, we gave the same speech production data set to 46 teams of researchers and asked them to answer the same research question, resulting in substantial variability in reported effect sizes and their interpretation. Using Bayesian meta-analytic tools, we further find little to no evidence that the observed variability can be explained by analysts’ prior beliefs, expertise or the perceived quality of their analyses. In light of this idiosyncratic variability, we recommend that researchers more transparently share details of their analysis, strengthen the link between theoretical construct and quantitative system and calibrate their (un)certainty in their conclusions

CLoK

University of Birmingham Research Portal

Archivo Digital para la Docencia y la Investigación

Edinburgh Research Explorer

Queen Margaret University eResearch

White Rose Research Online

Utrecht University Repository

Oskar Bordeaux

Tilburg University Repository

Leicester Research Archive