Towards Personalized Synthesized Voices for Individuals with Vocal Disabilities: Voice Banking and Reconstruction

Author: King Simon
Veaux Christophe
Yamagishi Junichi
Publication date: 01/01/2013

When individuals lose the ability to produce their own speech, due to degenerative diseases such as motor neurone disease (MND) or Parkinson’s, they lose not only a functional means of communication but also a display of their individual and group identity. In order to build personalized synthetic voices, attempts have been made to capture the voice before it is lost, using a process known as voice banking. For some patients, however, speech deterioration frequently coincides with or quickly follows diagnosis. Using HMM-based speech synthesis, it is now possible to build personalized synthetic voices from minimal recordings and even from disordered speech. The power of this approach is that the patient’s recordings can be used to adapt existing voice models pre-trained on many speakers. When the speech has begun to deteriorate, the adapted voice model can be further modified to compensate for the disordered characteristics found in the patient’s speech. The University of Edinburgh has initiated a project for voice banking and reconstruction based on this speech synthesis technology. At the current stage of the project, more than fifteen patients with MND have already been recorded and five of them have received a reconstructed voice. In this paper, we present an overview of the project as well as subjective assessments of the reconstructed voices and feedback from patients and their families.

CiteSeerX

Edinburgh Research Explorer

Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection

Author: Declan AE Costello
Irene M Moroz
Max A Little
Patrick E McSharry
Stephen J Roberts
Publication date: 01/01/2007

Background: Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness.

Methods: This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.

Results: On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8% plus or minus 2.0%. The true positive classification performance is 95.4% plus or minus 3.2%, and the true negative performance is 91.5% plus or minus 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.

Conclusions: Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.
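As an illustration of the classification step named in this abstract, the following is a minimal sketch of a two-class quadratic discriminant rule on two features. It is not the authors' implementation: the function names are hypothetical, the feature values are made-up toy numbers standing in for the recurrence and fractal scaling measures, and the bootstrap resampling used to obtain the reported confidence intervals is omitted.

```python
import math

def fit_gaussian(points):
    """Return the mean and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def log_density(x, mean, cov):
    """Log of a 2-D Gaussian density, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2 * math.pi)

def qda_predict(x, normal, disordered):
    """Assign x to the class with the larger log posterior (up to a constant)."""
    total = len(normal) + len(disordered)
    score_n = log_density(x, *fit_gaussian(normal)) + math.log(len(normal) / total)
    score_d = log_density(x, *fit_gaussian(disordered)) + math.log(len(disordered) / total)
    return "disordered" if score_d > score_n else "normal"

# Toy feature pairs (assumed values, for illustration only).
normal = [(0.0, 0.0), (1.0, 0.2), (0.2, 1.0), (1.0, 1.0)]
disordered = [(5.0, 5.0), (6.0, 5.2), (5.2, 6.0), (6.0, 6.0)]

print(qda_predict((0.5, 0.5), normal, disordered))
print(qda_predict((5.5, 5.5), normal, disordered))
```

Each class gets its own covariance matrix, which is what makes the decision boundary quadratic rather than linear.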

arXiv.org e-Print Archive

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

Nature Precedings

Simulating dysarthric speech for training data augmentation in clinical speech applications

Author: Berisha Visar
Jiao Yishan
Liss Julie
Tu Ming
Publication date: 26/04/2018

Training machine learning algorithms for speech applications requires large, labeled training data sets. This is problematic for clinical applications where obtaining such data is prohibitively expensive because of privacy concerns or lack of access. As a result, clinical speech applications are typically developed using small data sets with only tens of speakers. In this paper, we propose a method for simulating training data for clinical applications by transforming healthy speech to dysarthric speech using adversarial training. We evaluate the efficacy of our approach using both objective and subjective criteria. We present the transformed samples to five experienced speech-language pathologists (SLPs) and ask them to identify the samples as healthy or dysarthric. The results reveal that the SLPs identify the transformed speech as dysarthric 65% of the time. In a pilot classification experiment, we show that by using the simulated speech samples to balance an existing data set, the classification accuracy improves by about 10% after data augmentation. Comment: Will appear in Proc. of ICASSP 201

arXiv.org e-Print Archive

Crossref

Exploring auditory-motor interactions in normal and disordered speech

Author: Cai Shanqing
Guenther Frank H.
Tourville Jason A.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2013

Auditory feedback plays an important role in speech motor learning and in the online correction of speech movements. Speakers can detect and correct auditory feedback errors at the segmental and suprasegmental levels during ongoing speech. The frontal brain regions that contribute to these corrective movements have also been shown to be more active during speech in persons who stutter (PWS) compared to fluent speakers. Further, various types of altered auditory feedback can temporarily improve the fluency of PWS, suggesting that atypical auditory-motor interactions during speech may contribute to stuttering disfluencies. To investigate this possibility, we have developed and improved Audapter, a software package that enables configurable dynamic perturbation of the spatial and temporal content of the speech auditory signal in real time. Using Audapter, we have measured the compensatory responses of PWS to static and dynamic perturbations of the formant content of auditory feedback and compared these responses with those from matched fluent controls. Our findings indicate deficient utilization of auditory feedback by PWS for short-latency online control of the spatial and temporal parameters of articulation during vowel production and during running speech. These findings provide further evidence that stuttering is associated with aberrant auditory-motor integration during speech.

Crossref

Boston University Institutional Repository (OpenBU)

Brittany Bernal - Sensorimotor Adaptation of Vowel Production in Stop Consonant Contexts

Author: Bernal Brittany A.
Publication venue: e-Publications@Marquette
Publication date: 01/07/2013

The purpose of this research is to measure the compensatory and adaptive articulatory response to shifted formants in auditory feedback, and to compare the amount of sensorimotor learning that takes place when speakers say the words /pep/ and /tet/. These words were chosen in order to analyze the coarticulatory effects of the voiceless consonants /p/ and /t/ on sensorimotor adaptation of the vowel /e/. The formant perturbations were done using the Audapt software, which takes an input speech sample and plays it back to the speaker in real time via headphones. Formants are high-energy acoustic resonance patterns, measured in hertz, that reflect the positions of the articulators during the production of speech sounds. The two lowest frequency formants (F1 and F2) can uniquely distinguish among the vowels of American English. For this experiment, Audapt shifted F1 down and F2 up, and speakers who adapt were expected to shift their productions in the direction opposite to the perturbation. The formant patterns and vowel boundaries were analyzed using TF32 and S+ software, which led to conclusions about the adaptive responses. Manipulating auditory feedback by shifting formant values is hypothesized to elicit sensorimotor adaptation, a form of short-term motor learning. The amount of adaptation is expected to be greater for the word /pep/ than for /tet/ because there is less competition for articulatory placement of the tongue during production of bilabial consonants. This methodology could be further developed to help those with motor speech disorders remedy their speech errors with much less conscious effort than traditional therapy techniques.

epublications@Marquette

Analysis of Vocal Disorders in a Feature Space

Author: Hegger Rainer
Kantz Holger
Manfredi Claudia
Matassini Lorenzo
Publication date: 01/01/2000

This paper provides a way to classify vocal disorders for clinical applications. This goal is achieved by means of geometric signal separation in a feature space. Typical quantities from chaos theory (such as entropy, correlation dimension and first Lyapunov exponent) and some conventional ones (such as autocorrelation and spectral factor) are analysed and evaluated in order to provide entries for the feature vectors. A way of quantifying the amount of disorder is proposed by means of a healthy index that measures the distance of a voice sample from the centres of mass of both the healthy and the sick clusters in the feature space. A successful application of the geometrical signal separation is reported, concerning the distinction between normal and disordered phonation. Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering & Physic
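The abstract does not give the formula for the healthy index, so the following is only a sketch of one plausible distance-based index, under stated assumptions. The function names, the Euclidean metric, and the normalisation to the range [-1, 1] are all hypothetical choices made for illustration, not the paper's definition.

```python
import math

def centroid(points):
    """Centre of mass of a list of equal-length feature vectors."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dims)]

def healthy_index(sample, healthy, sick):
    """Assumed index: +1 at the healthy centroid, -1 at the sick centroid."""
    d_healthy = math.dist(sample, centroid(healthy))
    d_sick = math.dist(sample, centroid(sick))
    if d_healthy + d_sick == 0:
        # Both centroids coincide with the sample; the index is undefined,
        # so return the neutral value.
        return 0.0
    return (d_sick - d_healthy) / (d_sick + d_healthy)

# Toy 2-D feature vectors (assumed values, for illustration only).
healthy = [(0.0, 0.0), (2.0, 0.0)]
sick = [(9.0, 0.0), (11.0, 0.0)]

print(healthy_index((1.0, 0.0), healthy, sick))
print(healthy_index((5.5, 0.0), healthy, sick))
```

A sample equidistant from both centroids scores zero under this assumed definition, so the sign of the index indicates which cluster the sample lies closer to.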

arXiv.org e-Print Archive

MPG.PuRe

Historical Analyses of Disordered Handwriting

Author: Deborah Thorpe
Markus Schiegg
Publication venue: 'Modern Language Association'
Publication date: 01/01/2017

Handwritten texts carry significant information, extending beyond the meaning of their words. Modern neurology, for example, benefits from the interpretation of the graphic features of writing and drawing for the diagnosis and monitoring of diseases and disorders. This article examines how handwriting analysis can be used, and has been used historically, as a methodological tool for the assessment of medical conditions and how this enhances our understanding of historical contexts of writing. We analyze handwritten material, writing tests and letters, from patients in an early 20th-century psychiatric hospital in southern Germany (Irsee/Kaufbeuren). In this institution, early psychiatrists assessed handwriting features, providing us with novel insights into the earliest practices of psychiatric handwriting analysis, which can be connected to Berkenkotter’s research on medical admission records. We finally consider the degree to which historical handwriting bears semiotic potential to explain the psychological state and personality of a writer, and how future research in written communication should approach these sources.

Humanities Commons

Health-related quality of life in people with aphasia: Implications for fluency disorders quality of life research

Author: Madeline Cruice
Linda Worrall
Louise Hickson
Publication venue: 'Elsevier BV'
Publication date: 01/09/2010

It is increasingly important that clinicians address the health-related quality of life (HRQOL) of adults with communication disorders in clinical practice. The overall aim of this paper is to draw conclusions about the suitability of the Short Form 36 Health Survey (SF-36) for the communication disorders of aphasia and stuttering. This study reports on the impact of post-stroke aphasia on 30 Australian older adults’ HRQOL. It also comments on the capacity of the SF-36 to measure HRQOL in this population, specifically whether it is sensitive to the three known determinants of post-stroke HRQOL: emotional, physical and social functioning. Comparisons with other data are made to assist interpretation of the SF-36 subscale scores: with 75 older adults with no history of neurological conditions; and with data from the 1995 National Health Survey. The main findings are: (1) adults with post-stroke aphasia have similar HRQOL to their peers on six subscales, but significantly lower Role emotional and Mental health HRQOL; (2) a substantial number of aphasic adults reported depressive mood; and (3) aphasic adults with depressive mood have significantly worse HRQOL on six subscales than aphasic adults without depressive mood, but similar Role emotional and Body pain HRQOL. In conclusion, stroke and aphasia have minimal impact on older adults’ HRQOL as measured by the SF-36, which conflicts with an established evidence base of the negative consequences of aphasia on life. Thus, the SF-36 is not advisable for use with aphasic adults. Implications of these findings for aphasia and stuttering are discussed.
Educational objectives: The reader will be able to: (a) describe the impact of aphasia and depressive mood on quality of life; (b) compare the impact of aphasia on the quality of life of adults to adults who do not have aphasia; (c) describe the similarities and differences between quality of life of adults with aphasia and adults who stutter; and (d) describe the strengths and limitations of the SF-36 as a measure of quality of life in adults who stutter versus adults with aphasia.

City Research Online

Crossref

University of Queensland eSpace