Towards Personalized Synthesized Voices for Individuals with Vocal Disabilities: Voice Banking and Reconstruction
When individuals lose the ability to produce their own speech, due to degenerative diseases such as motor neurone disease (MND) or Parkinson’s, they lose not only a functional means of communication but also a display of their individual and group identity. In order to build personalized synthetic voices, attempts have been made to capture the voice before it is lost, using a process known as voice banking. But, for some patients, the speech deterioration frequently coincides with or quickly follows diagnosis. Using HMM-based speech synthesis, it is now possible to build personalized synthetic voices with minimal data recordings and even disordered speech. The power of this approach is that it is possible to use the patient’s recordings to adapt existing voice models pre-trained on many speakers. When the speech has begun to deteriorate, the adapted voice model can be further modified in order to compensate for the disordered characteristics found in the patient’s speech. The University of Edinburgh has initiated a project for voice banking and reconstruction based on this speech synthesis technology. At the current stage of the project, more than fifteen patients with MND have already been recorded and five of them have been delivered a reconstructed voice. In this paper, we present an overview of the project as well as subjective assessments of the reconstructed voices and feedback from patients and their families.
Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection
Background: Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness.

Methods: This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.

Results: On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8% plus or minus 2.0%. The true positive classification performance is 95.4% plus or minus 3.2%, and the true negative performance is 91.5% plus or minus 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.

Conclusions: Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.
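
The performance figures quoted in the Results (overall, true positive and true negative rates, each with a 95% confidence range) can be illustrated with a minimal sketch. It assumes label 1 means "disordered" and label 0 means "normal", and it uses a plain percentile bootstrap over a fixed set of predictions. The function names `rates` and `bootstrap_accuracy_interval` and the toy data are hypothetical, and this is not the authors' bootstrapped classifier or their evaluation procedure.

```python
import random


def rates(labels, preds):
    """Return (accuracy, true positive rate, true negative rate).

    Assumption: label 1 = disordered (positive), label 0 = normal (negative).
    """
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    acc = (tp + tn) / len(labels)
    tpr = tp / pos if pos else float("nan")
    tnr = tn / neg if neg else float("nan")
    return acc, tpr, tnr


def bootstrap_accuracy_interval(labels, preds, n_resamples=1000, seed=0):
    """Percentile bootstrap: resample cases with replacement and
    return the 2.5th and 97.5th percentiles of the accuracy."""
    rng = random.Random(seed)
    n = len(labels)
    accs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        accs.append(sum(labels[i] == preds[i] for i in idx) / n)
    accs.sort()
    low = accs[int(0.025 * n_resamples)]
    high = accs[int(0.975 * n_resamples) - 1]
    return low, high


# Toy data, invented for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
acc, tpr, tnr = rates(labels, preds)
low, high = bootstrap_accuracy_interval(labels, preds)
print(f"accuracy={acc:.3f} TPR={tpr:.3f} TNR={tnr:.3f} CI=({low:.2f}, {high:.2f})")
```

With only ten toy cases the interval is very wide; the narrow ranges reported in the abstract reflect a much larger database.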

Simulating dysarthric speech for training data augmentation in clinical speech applications
Training machine learning algorithms for speech applications requires large,
labeled training data sets. This is problematic for clinical applications where
obtaining such data is prohibitively expensive because of privacy concerns or
lack of access. As a result, clinical speech applications are typically
developed using small data sets with only tens of speakers. In this paper, we
propose a method for simulating training data for clinical applications by
transforming healthy speech to dysarthric speech using adversarial training. We
evaluate the efficacy of our approach using both objective and subjective
criteria. We present the transformed samples to five experienced
speech-language pathologists (SLPs) and ask them to identify the samples as
healthy or dysarthric. The results reveal that the SLPs identify the
transformed speech as dysarthric 65% of the time. In a pilot classification
experiment, we show that by using the simulated speech samples to balance an
existing dataset, the classification accuracy improves by about 10% after data
augmentation.
Comment: Will appear in Proc. of ICASSP 201
Exploring auditory-motor interactions in normal and disordered speech
Auditory feedback plays an important role in speech motor learning and in the online correction of speech movements. Speakers can detect and correct auditory feedback errors at the segmental and suprasegmental levels during ongoing speech. The frontal brain regions that contribute to these corrective movements have also been shown to be more active during speech in persons who stutter (PWS) compared to fluent speakers. Further, various types of altered auditory feedback can temporarily improve the fluency of PWS, suggesting that atypical auditory-motor interactions during speech may contribute to stuttering disfluencies. To investigate this possibility, we have developed and improved Audapter, a software tool that enables configurable dynamic perturbation of the spatial and temporal content of the speech auditory signal in real time. Using Audapter, we have measured the compensatory responses of PWS to static and dynamic perturbations of the formant content of auditory feedback and compared these responses with those from matched fluent controls. Our findings indicate deficient utilization of auditory feedback by PWS for short-latency online control of the spatial and temporal parameters of articulation during vowel production and during running speech. These findings provide further evidence that stuttering is associated with aberrant auditory-motor integration during speech.
Brittany Bernal - Sensorimotor Adaptation of Vowel Production in Stop Consonant Contexts
The purpose of this research is to measure the compensatory and adaptive articulatory response to shifted formants in auditory feedback, and to compare the resulting amount of sensorimotor learning that takes place when speakers say the words /pep/ and /tet/. These words were chosen in order to analyze the coarticulatory effects of the voiceless consonants /p/ and /t/ on sensorimotor adaptation of the vowel /e/. The formant perturbations were done using the Audapt software, which takes an input speech sample and plays it back to the speaker in real time via headphones. Formants are high-energy acoustic resonance patterns measured in hertz that reflect positions of articulators during the production of speech sounds. The two lowest frequency formants (F1 and F2) can uniquely distinguish among the vowels of American English. For this experiment, Audapt shifted F1 down and F2 up, and those who adapt were expected to shift in the opposite direction of the perturbation. The formant patterns and vowel boundaries were analyzed using TF32 and S+ software, which led to conclusions about the adaptive responses. Manipulating auditory feedback by shifting formant values is hypothesized to elicit sensorimotor adaptation, a form of short-term motor learning. The amount of adaptation is expected to be greater for the word /pep/ than for /tet/ because there is less competition for articulatory placement of the tongue during production of bilabial consonants. This methodology could be further developed to help those with motor speech disorders remedy their speech errors with much less conscious effort than traditional therapy techniques.
Analysis of Vocal Disorders in a Feature Space
This paper provides a way to classify vocal disorders for clinical
applications. This goal is achieved by means of geometric signal separation in
a feature space. Typical quantities from chaos theory (like entropy,
correlation dimension and first Lyapunov exponent) and some conventional ones
(like autocorrelation and spectral factor) are analysed and evaluated, in order
to provide entries for the feature vectors. A way of quantifying the amount of
disorder is proposed by means of a healthy index that measures the distance of
a voice sample from the centre of mass of both healthy and sick clusters in the
feature space. A successful application of the geometrical signal separation is
reported, concerning the distinction between normal and disordered phonation.
Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering
& Physic
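
The idea of an index based on distances to the centres of mass of the healthy and sick clusters can be sketched as follows. The abstract does not give the formula, so the normalised ratio used here is purely an assumption for illustration, as are the function names `centroid` and `healthy_index` and the two-dimensional toy feature vectors.

```python
import math


def centroid(points):
    """Centre of mass of a list of equal-length feature vectors."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[k] for p in points) / n for k in range(dim)]


def healthy_index(sample, healthy, sick):
    """Hypothetical index in [0, 1]: 1.0 at the healthy centroid,
    0.0 at the sick centroid, 0.5 when equidistant from both.

    Assumption: Euclidean distance and the ratio d_sick / (d_healthy + d_sick);
    the paper's actual definition may differ.
    """
    d_healthy = math.dist(sample, centroid(healthy))
    d_sick = math.dist(sample, centroid(sick))
    total = d_healthy + d_sick
    if total == 0:
        return 0.5  # the two centroids coincide with the sample
    return d_sick / total


# Toy feature vectors, invented for illustration only.
healthy = [[0.0, 0.0], [2.0, 0.0]]   # centroid (1, 0)
sick = [[5.0, 0.0], [7.0, 0.0]]      # centroid (6, 0)
print(healthy_index([3.5, 0.0], healthy, sick))
```

In practice the features would be quantities such as those named in the abstract (entropy, correlation dimension, autocorrelation), and they would likely need scaling before distances are compared.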
Historical Analyses of Disordered Handwriting
Handwritten texts carry significant information, extending beyond the meaning of their words. Modern neurology, for example, benefits from the interpretation of the graphic features of writing and drawing for the diagnosis and monitoring of diseases and disorders. This article examines how handwriting analysis can be used, and has been used historically, as a methodological tool for the assessment of medical conditions and how this enhances our understanding of historical contexts of writing. We analyze handwritten material, writing tests and letters, from patients in an early 20th-century psychiatric hospital in southern Germany (Irsee/Kaufbeuren). In this institution, early psychiatrists assessed handwriting features, providing us with novel insights into the earliest practices of psychiatric handwriting analysis, which can be connected to Berkenkotter’s research on medical admission records. We finally consider the degree to which historical handwriting bears semiotic potential to explain the psychological state and personality of a writer, and how future research in written communication should approach these sources.
Health-related quality of life in people with aphasia: Implications for fluency disorders quality of life research
Abstract
It is increasingly important that clinicians address the health-related quality of life (HRQOL) of adults with communication disorders in clinical practice. The overall aim of this paper is to draw conclusions about the suitability of the Short Form 36 Health Survey (SF-36) for the communication disorders of aphasia and stuttering. This study reports on the impact of post-stroke aphasia on 30 Australian older adults’ HRQOL. It also comments on the capacity of the SF-36 to measure HRQOL in this population, specifically whether it is sensitive to the three known determinants of post-stroke HRQOL – emotional, physical and social functioning. Comparisons with other data are made to assist interpretation of the SF-36 subscale scores: with 75 older adults with no history of neurological conditions; and with data from the 1995 National Health Survey. The main findings are: (1) adults with post-stroke aphasia have similar HRQOL to their peers on six subscales, but significantly lower Role emotional and Mental health HRQOL; (2) a substantial number of aphasic adults reported depressive mood; and (3) aphasic adults with depressive mood have significantly worse HRQOL on six subscales than aphasic adults without depressive mood, but similar Role emotional and Body pain HRQOL. In conclusion, stroke and aphasia have minimal impact on older adults’ HRQOL as measured by the SF-36, which conflicts with an established evidence base of the negative consequences of aphasia on life. Thus, the SF-36 is not advisable for use with aphasic adults. Implications of these findings for aphasia and stuttering are discussed.
Educational objectives: The reader will be able to: (a) describe the impact of aphasia and depressive mood on quality of life; (b) compare the quality of life of adults with aphasia to that of adults who do not have aphasia; (c) describe the similarities and differences between quality of life of adults with aphasia and adults who stutter; and (d) describe the strengths and limitations of the SF-36 as a measure of quality of life in adults who stutter versus adults with aphasia.