305 research outputs found

    Development of speech prostheses: current status and recent advances

    This is an Accepted Manuscript of an article published by Taylor & Francis in Expert Review of Medical Devices in September 2010, available online: http://www.tandfonline.com/10.1586/erd.10.34. Brain–computer interfaces (BCIs) have been developed over the past decade to restore communication to persons with severe paralysis. In the most severe cases of paralysis, known as locked-in syndrome, patients retain cognition and sensation but are capable of only slight voluntary eye movements. For these patients, no standard communication method is available, although some can use BCIs to communicate by selecting letters or words on a computer. Recent research has sought to improve on existing techniques by using BCIs to directly predict speech utterances rather than simply to control a spelling device. Such methods are the first steps towards speech prostheses, as they are intended to entirely replace the vocal apparatus of paralyzed users. This article outlines well-known methods for restoring communication by BCI and illustrates the difference between spelling devices and direct speech prediction, or speech prosthesis.
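    As a purely illustrative contrast between the two approaches the review distinguishes, the short Python sketch below pairs a "spelling" step (pick the flashed letter whose response score is strongest) with a "direct prediction" step (map a neural feature vector straight to a phoneme label). All data, feature dimensions and the use of scikit-learn's LogisticRegression are assumptions of the sketch, not the methods surveyed in the article.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Spelling device: choose the flashed letter whose neural response scores highest ---
letters = list("ABCDEF")
evoked_scores = rng.normal(size=len(letters))   # stand-in for per-flash classifier scores
selected_letter = letters[int(np.argmax(evoked_scores))]

# --- Direct speech prediction: map neural features straight to phoneme labels ---
phonemes = ["AA", "IY", "UW", "M", "S"]
X_train = rng.normal(size=(200, 16))            # 200 trials x 16 neural features (synthetic)
y_train = rng.integers(0, len(phonemes), 200)   # phoneme label per trial (synthetic)
decoder = LogisticRegression(max_iter=1000).fit(X_train, y_train)

new_trial = rng.normal(size=(1, 16))
predicted_phoneme = phonemes[int(decoder.predict(new_trial)[0])]
print("spelled:", selected_letter, "| decoded phoneme:", predicted_phoneme)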

    On the design of visual feedback for the rehabilitation of hearing-impaired speech

    An experimental DSP-based tactile hearing aid : a feasibility study

    Vocal tract acoustic measurements and their application to articulatory modelling

    In the field of speech research it is agreed that more real data is required to improve the articulatory modelling of the vocal tract. Acoustic techniques may be used to acquire vocal tract data, and advances in digital signal processing have allowed the development of new experimental techniques that permit fast and efficient measurements. DSP-based measurement systems were set up, and acoustic impedance and transfer function measurements were performed on a wide variety of subjects in DCU's semi-anechoic chamber. The measurement systems are compact and reproducible. The variation of the wall vibration load was investigated in a wide range of human subjects. The investigation was prompted by the question: is the wall vibration load important in the study and implementation of vocal tract and articulatory models? The results point to a possible need, in acoustic-to-articulatory inversion, to adapt the reference model to specific subjects by separately estimating the wall impedance load.
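    As a rough illustration of where such a wall load enters an articulatory model (and not a description of the measurement systems above), the Python sketch below computes the volume-velocity transfer of a concatenated-tube vocal tract with a shunt "wall vibration" admittance in each section. The section areas and the wall mass, resistance and stiffness values are placeholder assumptions.

import numpy as np

RHO, C = 1.2, 350.0          # air density (kg/m^3) and speed of sound (m/s), warm moist air

def tract_response(freqs_hz, areas_m2, section_len=0.01,
                   wall_mass=21.0, wall_res=8000.0, wall_stiff=8.45e5):
    """Return |U_lips/U_glottis| for a chain-matrix tube model with an open lip end."""
    response = []
    for f in freqs_hz:
        w = 2.0 * np.pi * f
        k = w / C
        M = np.eye(2, dtype=complex)            # chain matrix, glottis -> lips
        for A in areas_m2:
            Z0 = RHO * C / A                    # characteristic impedance of the section
            T = np.array([[np.cos(k * section_len), 1j * Z0 * np.sin(k * section_len)],
                          [1j * np.sin(k * section_len) / Z0, np.cos(k * section_len)]])
            # Wall load: per-unit-area impedance, lumped over the section's wall surface.
            Zw_area = wall_res + 1j * w * wall_mass + wall_stiff / (1j * w)
            S_wall = 2.0 * np.sqrt(np.pi * A) * section_len   # circumference * length
            W = np.array([[1.0, 0.0], [S_wall / Zw_area, 1.0]], dtype=complex)
            M = M @ T @ W
        # With an idealised open end (lip pressure = 0): U_lips/U_glottis = 1 / D
        response.append(abs(1.0 / M[1, 1]))
    return np.array(response)

freqs = np.linspace(100, 4000, 400)
uniform_tract = [3e-4] * 17                     # ~17 cm uniform tube, 3 cm^2 area
peak_hz = freqs[int(np.argmax(tract_response(freqs, uniform_tract)))]
print(f"strongest peak of the computed response near {peak_hz:.0f} Hz")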

    The "Tiepstem" : an experimental Dutch keyboard-to-speech system for the speech impaired

    An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in pseudo-phonetic notation. Intonation contours, built from a declination line and various rises and falls, are generated from an input consisting of punctuation and accent marks. The hardware design has resulted in a small, portable and battery-powered device. A short evaluation with users has been carried out; it has shown the potential of such a device but has also indicated some problems with the current pseudo-phonetic input.
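    As an illustration of the kind of intonation generation described (a declination line with rises and falls triggered by accent and punctuation marks), the Python sketch below builds a per-syllable F0 contour. The mark syntax, declination slope and pitch excursions are invented for the sketch and are not the Tiepstem's actual rules.

def intonation_contour(marked_syllables, f0_start=130.0, declination=-2.0,
                       rise=30.0, fall=-40.0):
    """marked_syllables: list of (syllable, mark) with mark in {None, '+', '.'}.
    '+' marks an accented syllable (rise); '.' marks an utterance-final fall."""
    contour = []
    for i, (syl, mark) in enumerate(marked_syllables):
        target = f0_start + declination * i     # baseline declination in Hz per syllable
        if mark == '+':
            target += rise                      # accent-lending rise
        elif mark == '.':
            target += fall                      # utterance-final fall
        contour.append((syl, round(target, 1)))
    return contour

print(intonation_contour([("dag", None), ("me", '+'), ("neer", None), ("Smit", '.')]))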

    Analysis on Using Synthesized Singing Techniques in Assistive Interfaces for Visually Impaired to Study Music

    Tactile and auditory senses are the primary means by which visually impaired people sense the world, and their interaction with assistive technologies likewise focuses mainly on tactile and auditory interfaces. This paper discusses the validity of using the most appropriate singing-synthesis techniques as a mediator in assistive technologies built specifically to address their music-learning needs around music scores and lyrics. Music scores, with their notation and lyrics, are the main mediator in the musical communication channel between a composer and a performer. Visually impaired music lovers have little opportunity to access this mediator, since most scores exist only in visual formats. In a music score, the vocal performer's melody is married to all the pleasant sounds producible in the form of singing. Singing is better suited to a temporal-domain format than to a tactile format in the spatial domain. Conversion of the existing visual format to a singing output is therefore an appropriate non-lossy transition, as shown by the initial research on an adaptive music score trainer for the visually impaired [1]. To extend that initial research, this study surveys existing singing-synthesis techniques and research on auditory interfaces.
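    To make the score-to-singing conversion concrete, the Python sketch below turns (note, syllable, beats) triples from a score into pitch/duration/text events that a singing synthesizer could consume. The note-name parsing and the event format are assumptions of the sketch; the paper itself surveys existing synthesis techniques rather than prescribing one.

import re

A4 = 440.0
NOTE_INDEX = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_hz(name):
    """'C#4' -> frequency in Hz (equal temperament, A4 = 440 Hz)."""
    letter, sharp, octave = re.fullmatch(r"([A-G])(#?)(\d)", name).groups()
    semitones = NOTE_INDEX[letter] + (1 if sharp else 0) + (int(octave) - 4) * 12 - 9
    return A4 * 2 ** (semitones / 12)

def score_to_events(score, tempo_bpm=100):
    """Map (note, lyric syllable, beats) triples to synthesizer-ready events."""
    beat_s = 60.0 / tempo_bpm
    return [{"syllable": syl, "f0_hz": round(note_to_hz(note), 1),
             "duration_s": round(beats * beat_s, 2)}
            for note, syl, beats in score]

print(score_to_events([("C4", "do", 1), ("D4", "re", 1), ("E4", "mi", 2)]))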

    Synthetic voice design and implementation.

    The limitations of speech output technology emphasise the need for exploratory psychological research to maximise the effectiveness of speech as a display medium in human-computer interaction. Stage 1 of this study reviewed speech implementation research, focusing on general issues for tasks, users and environments. An analysis of design issues was conducted, related to the differing methodologies for synthesised and digitised message production, and a selection of ergonomic guidelines was developed to enhance effective speech interface design. Stage 2 addressed the negative reactions of users to synthetic speech in spite of elegant dialogue structure and appropriate functional assignment. Synthetic speech interfaces have been consistently rejected by their users in a wide variety of application domains because of their poor quality; indeed, the literature repeatedly emphasises quality as the most important contributor to implementation acceptance. To investigate this, a converging-operations approach was adopted, consisting of a series of five experiments (and associated pilot studies) which homed in on the specific characteristics of synthetic speech that determine listeners' varying perceptions of its qualities, and how these might be manipulated to improve its aesthetics. A flexible and reliable ratings interface was designed to display DECtalk speech variations and record listeners' perceptions. In experiment one, 40 participants used this interface to evaluate synthetic speech variations on a wide range of perceptual scales. Factor analysis revealed two main factors: "listenability", accounting for 44.7% of the variance and correlating with the DECtalk "smoothness" parameter at 0.57 (p<0.005) and with "richness" at 0.53 (p<0.005); and "assurance", accounting for 12.6% of the variance and correlating with "average pitch" at 0.42 (p<0.005) and "head size" at 0.42 (p<0.005). Complementary experiments were then required to address appropriate voice design for enhanced listenability and assurance perceptions. With a standard male voice set, 20 participants rated enhanced smoothness and attenuated richness as contributing significantly to speech listenability (p<0.001). Experiment three, using a female voice set, yielded comparable results, suggesting that further refinements of the technique were necessary to develop an effective methodology for speech quality optimisation. At this stage it became essential to focus directly on the parameter modifications associated with the aesthetically pleasing characteristics of synthetic speech. If a reliable technique could be developed to enhance perceived speech quality, then synthesis systems based on the commonly used DECtalk model might assume some of their considerable yet unfulfilled potential. In experiment four, 20 subjects rated a wide range of voices modified across the two main parameters associated with perceived listenability: smoothness and richness. The results revealed a clear linear relationship between enhanced smoothness and attenuated richness and significant improvements in perceived listenability (p<0.001 in both cases). Planned comparisons were conducted between the different levels of the parameters and revealed significant listenability enhancements as smoothness was increased, and a similar pattern as richness decreased. Statistical analysis also revealed a significant interaction between the two parameters (p<0.001), and a more comprehensive picture was constructed.
    In order to expand the focus and enhance the generality of the research, it was then necessary to assess the effects of synthetic speech modifications while subjects undertook a more realistic task. Passively rating the voices, independent of processing for meaning, is arguably an artificial task which rarely, if ever, occurs in real-world settings. To investigate perceived quality in a more realistic task scenario, experiment five introduced two levels of information-processing load. The purpose of this experiment was firstly to see whether a comprehension load modified the pattern of listenability enhancements, and secondly to see whether that pattern differed between high and low load. Techniques for introducing cognitive load were investigated, and comprehension load was selected as the most appropriate method in this case. A pilot study distinguished two levels of comprehension load from a set of 150 true/false sentences, and these were recorded across the full range of parameter modifications. Twenty subjects then rated the voices using the established listenability scales as before, while also performing the additional task of processing each spoken stimulus for meaning and determining the authenticity of the statements. Results indicated that listenability enhancements did indeed occur at both levels of processing, although at the higher level variations in the pattern occurred. A significant difference was revealed between optimal parameter modifications for conditions of high and low cognitive load (p<0.05): subjects perceived the synthetic voices in the high cognitive load condition to be significantly less listenable than the same voices in the low cognitive load condition. The analysis also revealed that this effect was independent of the number of errors made. This result may be of general value because the conclusions drawn from these findings are independent of any particular parameter modifications that may be exclusive to DECtalk users. Overall, the study presents a detailed analysis of the research domain combined with a systematic experimental programme of synthetic speech quality assessment. The experiments reported establish a reliable and replicable procedure for optimising the aesthetically pleasing characteristics of DECtalk speech, but the implications of the research extend beyond the boundaries of a particular synthesiser. The results lead to a number of conclusions, the most salient being that the synthetic speech designer not only has to overcome the general rejection of synthetic voices, based on their poor quality, through sophisticated customisation of voice parameters, but also needs to take into account the cognitive load of the task being undertaken. The interaction between cognitive load and optimal synthesis settings requires direct consideration if synthetic speech systems are to realise and maximise their potential in human-computer interaction.
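    As a loose illustration of the analysis pipeline described above (perceptual ratings reduced by factor analysis and then related to synthesis parameters), the Python sketch below runs the same kind of computation on synthetic data. The scale names, the two-factor structure and the use of scikit-learn's FactorAnalysis are assumptions of the sketch, not the study's actual materials or results.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n_voices = 40
smoothness = rng.uniform(0, 100, n_voices)        # stand-in DECtalk-style parameter
listenability = 0.6 * smoothness + rng.normal(0, 10, n_voices)
assurance = rng.normal(50, 10, n_voices)

# Six rating scales driven by the two latent qualities, plus noise.
ratings = np.column_stack([
    listenability + rng.normal(0, 5, n_voices),   # "pleasant"
    listenability + rng.normal(0, 5, n_voices),   # "smooth"
    listenability + rng.normal(0, 5, n_voices),   # "clear"
    assurance + rng.normal(0, 5, n_voices),       # "confident"
    assurance + rng.normal(0, 5, n_voices),       # "authoritative"
    assurance + rng.normal(0, 5, n_voices),       # "deep"
])

fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(ratings)                # per-voice factor scores
for i in range(2):
    r = np.corrcoef(scores[:, i], smoothness)[0, 1]
    print(f"factor {i + 1} vs smoothness: r = {r:+.2f}")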

    The Development of Synthetic Speech Aids for Patients With Acquired Disability

    Patients suffering from a variety of speech disorders can benefit from synthetic speech. This study concentrates on dysarthric patients with acquired speech loss, as these patients have intact intellect and are most likely to benefit from synthetic speech. The physical skills of these patients vary enormously, and their needs and situations differ. The main part of this work is concerned with the design, development and evaluation of a range of speech aids to meet these varying needs and skills. Three methods of speech synthesis were used, and their performance was investigated using a Diagnostic Rhyme Test to measure the intelligibility of individual words. The results of this trial showed Adaptive Differential Pulse Code Modulation (ADPCM) to be more intelligible than Linear Predictive Coding (LPC), with both of these methods being more intelligible than constructive synthesis. A further trial was conducted to measure the speech quality of phrases produced by the synthesisers; it showed that listeners preferred phrases constructed of LPC words to phrases generated using phoneme-based synthesisers, and preferred phrases mixing LPC and constructed words to phrases of constructed words only. The devices that were developed use different methods of synthesis, and the choice of method was guided by these trials. The Pocket Speech Aid is a rapid-access, limited-vocabulary communication aid which uses ADPCM synthesis; direct selection gives users access to eight phrases. The Pocket Speech Aid has been very successful in practice: when used as a telephone aid, eight out of ten patients increased their communication ability, and when used as a conversation prompter, ten out of fourteen patients were able to steer the direction of real-time conversations. The device has generated a great deal of interest from other centres, and the demand for it, now that it is being manufactured, confirms that it has a role to play in assisting those with communication difficulties. The Macleod Unit was named after a remarkable patient with Motor Neurone Disease who realised his speech would soon be lost and had the foresight to select a vocabulary and record the words on a cassette recorder. His 625-word vocabulary was transferred to the speech aid, which uses an encoding method of word selection. Clinical feedback showed the device to be of benefit for this highly motivated individual, but it was less successful for other patients in this group, who found the cognitive effort of selecting codes too great. An unlimited-vocabulary device based on the commercially available VOTRAX, which uses constructive synthesis, was developed, but this device was rejected because of its robotic-sounding voice. A further unlimited-vocabulary prototype, the Uvocom, was designed to improve speech quality and to investigate whether there is a need for an unlimited vocabulary. The Uvocom uses a core vocabulary of 1000 LPC words with phoneme back-up for words not stored in the core vocabulary. Trials with the Uvocom have indicated that quality speech in an unlimited-vocabulary device is likely to benefit a small number of patients who have the physical skills to operate such a device. Finally, some indication is given of the directions in which future work could progress, based on the proven success of the Pocket Speech Aid.
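    To make the coding scheme concrete, the Python sketch below implements a much simplified ADPCM encoder/decoder pair of the kind that could store the Pocket Speech Aid's phrases; the 2-bit quantiser and adaptation factors are assumptions chosen only to illustrate adaptive differential coding, not the device's actual codec.

import numpy as np

ADAPT = {0: 0.9, 1: 1.6}          # shrink the step after small codes, grow it after large ones

def adpcm_encode(samples):
    codes, pred, step = [], 0.0, 4.0
    for x in samples:
        diff = x - pred
        sign = 1 if diff < 0 else 0
        mag = 1 if abs(diff) > step else 0         # 1 bit of magnitude
        codes.append((sign, mag))
        delta = (step * 1.5 if mag else step * 0.5) * (-1 if sign else 1)
        pred += delta                              # keep the encoder's prediction decoder-matched
        step = max(1.0, step * ADAPT[mag])
    return codes

def adpcm_decode(codes):
    out, pred, step = [], 0.0, 4.0
    for sign, mag in codes:
        delta = (step * 1.5 if mag else step * 0.5) * (-1 if sign else 1)
        pred += delta
        step = max(1.0, step * ADAPT[mag])
        out.append(pred)
    return np.array(out)

t = np.arange(200)
signal = 100 * np.sin(2 * np.pi * t / 40)          # toy "speech" waveform
decoded = adpcm_decode(adpcm_encode(signal))
print("mean abs reconstruction error:", round(float(np.mean(np.abs(signal - decoded))), 1))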