301 research outputs found

    Surface electromyographic control of a novel phonemic interface for speech synthesis

    Full text link
    Many individuals with minimal movement capabilities use augmentative and alternative communication (AAC) systems to communicate. These individuals require both an interface with which to construct a message (e.g., a grid of letters) and an input modality with which to select targets. This study evaluated the interaction of two such systems: (a) an input modality using surface electromyography (sEMG) of spared facial musculature, and (b) an onscreen interface from which users select phonemic targets. These systems were evaluated in two experiments: (a) participants without motor impairments used the systems during a series of eight training sessions, and (b) one individual who uses AAC used the systems for two sessions. Both the phonemic interface and the electromyographic cursor show promise for future AAC applications.
    Funding: F31 DC014872 - NIDCD NIH HHS; R01 DC002852 - NIDCD NIH HHS; R01 DC007683 - NIDCD NIH HHS; T90 DA032484 - NIDA NIH HHS
    https://www.ncbi.nlm.nih.gov/pubmed/?term=Surface+electromyographic+control+of+a+novel+phonemic+interface+for+speech+synthesis
    Published version
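    An sEMG input modality of the kind described maps muscle activity to discrete selection events. The sketch below is purely illustrative, not the study's pipeline: the signal, window size, and threshold are all invented, and it shows only the generic idea of thresholding a moving-RMS envelope to trigger a selection.

```python
import numpy as np

def rms_envelope(semg, window=50):
    """Moving RMS of a raw sEMG signal (window length in samples)."""
    squared = np.square(semg)
    kernel = np.ones(window) / window
    return np.sqrt(np.convolve(squared, kernel, mode="same"))

def detect_selections(envelope, threshold=0.5):
    """Return sample indices where the envelope first crosses above threshold."""
    above = envelope > threshold
    rising = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    return rising

# Synthetic example: quiet baseline with one burst of muscle activity.
rng = np.random.default_rng(0)
signal = 0.05 * rng.standard_normal(1000)
signal[400:500] += 1.0 * rng.standard_normal(100)  # contraction burst
events = detect_selections(rms_envelope(signal))
print(len(events) >= 1)  # True: the burst produces at least one crossing
```

    A real system would add onset debouncing and per-user calibration of the threshold; this sketch only demonstrates the envelope-plus-threshold principle.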

    Inspiratory oscillatory flow with a portable ventilator: a bench study

    Get PDF
    INTRODUCTION: We observed an oscillatory flow while ventilating critically ill patients with the Dräger Oxylog 3000™ transport ventilator during interhospital transfer. The phenomenon occurred in paediatric patients or in adult patients with severe airway obstruction ventilated in the pressure-regulated or pressure-controlled mode. As this had not been described previously, we conducted a bench study to investigate the phenomenon. METHODS: An Oxylog 3000™ transport ventilator and a Dräger Medical Evita-4 NeoFlow™ intensive care unit ventilator were connected to a Dräger Medical LS800™ lung simulator. Data were registered by a Datex-S5™ monitor with a D-fend™ flow and pressure sensor and were analysed on a laptop using S5-Collect™ software. Clinical conditions were simulated using various ventilatory modes, ventilator settings, filters, and endotracheal tubes, and by changing resistance and compliance. Data were recorded for 258 combinations of patient factors and respirator settings to detect thresholds for the occurrence of the phenomenon and methods to overcome it. RESULTS: Under high-resistance conditions in pressure-regulated ventilation with the Oxylog 3000™, an oscillatory flow during inspiration produced rapid changes of the airway pressure. The phenomenon resulted in a jerky inspiration with high peak airway pressures, higher than those set on the ventilator. Reducing the inspiratory flow velocity terminated the phenomenon but resulted in reduced tidal volumes. CONCLUSION: Oscillatory flow with potentially harmful effects may occur during ventilation with the Dräger Oxylog 3000™, especially under high-resistance conditions such as small airways in children (endotracheal tube internal diameter <6 mm) or severe obstructive lung or airway disease in adult patients.

    Intraspeaker Comparisons of Acoustic and Articulatory Variability in American English /r/ Productions

    Full text link
    The purpose of this report is to test the hypothesis that speakers utilize an acoustic, rather than articulatory, planning space for speech production. It has been well-documented that many speakers of American English use different tongue configurations to produce /r/ in different phonetic contexts. The acoustic planning hypothesis suggests that although the /r/ configuration varies widely in different contexts, the primary acoustic cue for /r/, a dip in the F3 trajectory, will be less variable due to tradeoffs in articulatory variability, or trading relations, that help maintain a relatively constant F3 trajectory across phonetic contexts. Acoustic data and EMMA articulatory data from seven speakers producing /r/ in different phonetic contexts were analyzed. Visual inspection of the EMMA data at the point of F3 minimum revealed that each speaker appeared to use at least two of three trading relation strategies that would be expected to reduce F3 variability. Articulatory covariance measures confirmed that all seven speakers utilized a trading relation between tongue back height and tongue back horizontal position, six speakers utilized a trading relation between tongue tip height and tongue back height, and the speaker who did not use this latter strategy instead utilized a trading relation between tongue tip height and tongue back horizontal position. Estimates of F3 variability with and without the articulatory covariances indicated that F3 variability would be much higher for all speakers if the articulatory covariances were not utilized. These conclusions were further supported by a comparison of measured F3 variability to F3 variabilities estimated from the pellet data with and without articulatory covariances. In all subjects, the actual F3 variance was significantly lower than the F3 variance estimated without articulatory covariances, further supporting the conclusion that the articulatory trading relations were being used to reduce F3 variability. Together, these results strongly suggest that the neural control mechanisms underlying speech production make elegant use of trading relations between articulators to maintain a relatively invariant acoustic trace for /r/ across phonetic contexts
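    The comparison of F3 variance with and without articulatory covariances can be illustrated with first-order error propagation, Var(F3) ≈ J Σ Jᵀ, where J holds the sensitivities of F3 to each articulator coordinate and Σ is the articulatory covariance matrix. The numbers below are invented for illustration, not the study's measurements; negative off-diagonal terms play the role of trading relations:

```python
import numpy as np

# Hypothetical sensitivities of F3 (Hz) to three articulator coordinates
# (tongue tip height, tongue back height, tongue back horizontal position).
jacobian = np.array([120.0, 90.0, -80.0])  # Hz per unit displacement

# Hypothetical articulatory covariance matrix; off-diagonal terms encode
# trading relations between articulators.
cov = np.array([
    [1.00, -0.60, 0.50],
    [-0.60, 1.00, -0.55],
    [0.50, -0.55, 1.00],
])

# First-order propagation of articulatory variance into F3 variance.
var_with_trading = jacobian @ cov @ jacobian
# Zeroing the off-diagonal covariances removes the trading relations.
var_without_trading = jacobian @ np.diag(np.diag(cov)) @ jacobian

print(var_with_trading < var_without_trading)  # True: trading reduces F3 variance
```

    With these invented values the covariances roughly halve the propagated F3 variance, mirroring the qualitative finding that trading relations suppress acoustic variability.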

    Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech-motor cortex

    Get PDF
    This is the published version, also available here: http://dx.doi.org/10.3389/fnins.2011.00065.
    We conducted a neurophysiological study of attempted speech production in a paralyzed human volunteer using chronic microelectrode recordings. The volunteer suffers from locked-in syndrome, leaving him in a state of near-total paralysis, though he maintains good cognition and sensation. In this study, we investigated the feasibility of supervised classification techniques for predicting intended phoneme production in the absence of any overt movement, including speech. Such classification or decoding ability has the potential to greatly improve the quality of life of many people who are otherwise unable to speak by providing a direct communicative link to the general community. We examined the performance of three classifiers on a multi-class discrimination problem in which the items were 38 American English phonemes, including monophthong and diphthong vowels and consonants. The three classifiers differed in performance but averaged between 16 and 21% overall accuracy (chance level is 1/38, or 2.6%). Further, the distribution of phonemes classified statistically above chance was non-uniform, though 20 of 38 phonemes were classified with statistical significance by all three classifiers. These preliminary results suggest that supervised classification techniques are capable of large-scale multi-class discrimination for attempted speech production and may provide the basis for future communication prostheses.
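    The chance level and the notion of "classified statistically above chance" can be made concrete with a one-sided binomial test. The trial count and accuracy below are hypothetical, chosen only to illustrate the calculation; the study's actual trial counts are not given here:

```python
from math import comb

n_classes = 38
chance = 1 / n_classes
print(round(chance * 100, 1))  # 2.6 (percent), matching the reported chance level

def binom_p_above_chance(successes, trials, p=chance):
    """One-sided binomial p-value: probability of >= successes under chance."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))

# Hypothetical: 18% accuracy over 400 trials (72 correct) vs 2.6% chance.
p_value = binom_p_above_chance(72, 400)
print(p_value < 0.001)  # True: far above chance
```

    With an exact test like this, even modest accuracies (16 to 21%) are overwhelmingly significant against a 2.6% chance level once a few hundred trials are collected.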

    Articulatory Tradeoffs Reduce Acoustic Variability During American English /r/ Production

    Full text link
    Acoustic and articulatory recordings reveal that speakers utilize systematic articulatory tradeoffs to maintain acoustic stability when producing the phoneme /r/. Distinct articulator configurations used to produce /r/ in various phonetic contexts show systematic tradeoffs between the cross-sectional areas of different vocal tract sections. Analysis of acoustic and articulatory variabilities reveals that these tradeoffs act to reduce acoustic variability, thus allowing large contextual variations in vocal tract shape; these contextual variations in turn apparently reduce the amount of articulatory movement required. These findings contrast with the widely held view that speaking involves a canonical vocal tract shape target for each phoneme.
    National Institute on Deafness and Other Communication Disorders (1R29-DC02852-02, 5R01-DC01925-04, 1R03-C2576-01); National Science Foundation (IRI-9310518)

    LaDIVA: A neurocomputational model providing laryngeal motor control for speech acquisition and production

    Get PDF
    Many voice disorders are the result of intricate neural and/or biomechanical impairments that are poorly understood. The limited knowledge of their etiological and pathophysiological mechanisms hampers effective clinical management. Behavioral studies have been used concurrently with computational models to better understand typical and pathological laryngeal motor control. Thus far, however, a unified computational framework that quantitatively integrates physiologically relevant models of phonation with the neural control of speech has not been developed. Here, we introduce LaDIVA, a novel neurocomputational model with physiologically based laryngeal motor control. We combined the DIVA model (an established neural network model of speech motor control) with the extended body-cover model (a physics-based vocal fold model). The resulting integrated model, LaDIVA, was validated by comparing its simulations with behavioral responses to perturbations of auditory vocal fundamental frequency (fo) feedback in adults with typical speech. LaDIVA demonstrated the capability to simulate different modes of laryngeal motor control, ranging from short-term (i.e., reflexive) and long-term (i.e., adaptive) auditory feedback paradigms to generating prosodic contours in speech. Simulations showed that LaDIVA’s laryngeal motor control displays properties of motor equivalence, i.e., LaDIVA could robustly generate compensatory responses to reflexive vocal fo perturbations with varying initial laryngeal muscle activation levels leading to the same output. The model can also generate prosodic contours for studying laryngeal motor control in running speech. LaDIVA can expand the understanding of the physiology of human phonation and enables, for the first time, the investigation of causal effects of neural motor control on the fine structure of the vocal signal.
    Weerathunge, Hasini R. (Boston University, United States); Alzamendi, Gabriel Alejandro (Universidad Nacional de Entre Ríos / CONICET, Argentina); Cler, Gabriel J. (University of Washington, United States); Guenther, Frank H. (Boston University, United States); Stepp, Cara E. (Boston University, United States); Zañartu, Matías (Universidad Técnica Federico Santa María, Chile)
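    The reflexive compensation to an fo perturbation described above can be sketched as a simple error-correcting feedback loop: the controller hears a shifted fo, computes an auditory error against the target, and nudges the motor command in the opposite direction. This is far simpler than LaDIVA's physiologically based controller; the gain, timing, and plant model here are illustrative assumptions only:

```python
# Minimal sketch of reflexive auditory feedback control of vocal fo.
target_fo = 120.0    # Hz, intended fundamental frequency
motor_fo = 120.0     # Hz, current motor command
gain = 0.3           # fraction of the auditory error corrected per step
perturbation = 10.0  # Hz upward shift applied to auditory feedback

history = []
for step in range(50):
    heard_fo = motor_fo + (perturbation if step >= 10 else 0.0)
    error = target_fo - heard_fo   # auditory error signal
    motor_fo += gain * error       # corrective motor update
    history.append(motor_fo)

# After perturbation onset the motor command drifts opposite the shift,
# so the heard fo returns toward the target (compensation).
print(round(history[-1], 1))  # 110.0
```

    The loop converges to a motor command 10 Hz below baseline, cancelling the +10 Hz feedback shift; real speakers typically compensate only partially, which a gain below the loop's stability limit can also reproduce.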

    Modified Extended BDF Time-Integration Methods, Applied to Circuit Equations

    Full text link

    A Wireless Brain-Machine Interface for Real-Time Speech Synthesis

    Get PDF
    This is the published version, also available here: http://dx.doi.org/10.1371/journal.pone.0008218.
    Background: Brain-machine interfaces (BMIs) involving electrodes implanted into the human cerebral cortex have recently been developed in an attempt to restore function to profoundly paralyzed individuals. Current BMIs for restoring communication can provide important capabilities via a typing process, but unfortunately they are only capable of slow communication rates. In the current study we use a novel approach to speech restoration in which we decode continuous auditory parameters for a real-time speech synthesizer from neuronal activity in motor cortex during attempted speech.
    Methodology/Principal Findings: Neural signals recorded by a Neurotrophic Electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome, characterized by near-total paralysis with spared cognition, were transmitted wirelessly across the scalp and used to drive a speech synthesizer. A Kalman filter-based decoder translated the neural signals generated during attempted speech into continuous parameters for controlling a synthesizer that provided immediate (within 50 ms) auditory feedback of the decoded sound. Accuracy of the volunteer's vowel productions with the synthesizer improved quickly with practice, with a 25% improvement in average hit rate (from 45% to 70%) and a 46% decrease in average endpoint error from the first to the last block of a three-vowel task.
    Conclusions/Significance: Our results support the feasibility of neural prostheses that may have the potential to provide near-conversational synthetic speech output for individuals with severely impaired speech motor control. They also provide an initial glimpse into the functional properties of neurons in speech motor cortical areas.
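    A Kalman filter decoder of the kind described maintains a running estimate of the synthesizer parameters from noisy neural observations, alternating a predict step with a measurement update. The scalar sketch below is a generic textbook filter, not the study's fitted decoder: the linear tuning model, noise variances, and formant trajectory are all invented for illustration.

```python
import numpy as np

# State: one formant value (Hz); observation: one noisy neural rate
# assumed linearly tuned to the formant (illustrative model).
A, H = 1.0, 0.02   # state transition, observation gain
Q, R = 4.0, 1.0    # process and observation noise variances

def kalman_step(x, P, z):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict the next state and its uncertainty.
    x_pred = A * x
    P_pred = A * P * A + Q
    # Update with the new observation z.
    K = P_pred * H / (H * P_pred * H + R)   # Kalman gain
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1 - K * H) * P_pred
    return x_new, P_new

# Track a slowly gliding formant from simulated noisy neural rates.
rng = np.random.default_rng(1)
true_formant = np.linspace(500.0, 700.0, 200)  # /a/-like F1 glide, Hz
x, P = 500.0, 1.0
estimates = []
for f in true_formant:
    z = H * f + rng.normal(0.0, np.sqrt(R))    # noisy neural observation
    x, P = kalman_step(x, P, z)
    estimates.append(x)
# The estimate follows the glide with some lag and smoothing.
```

    The real decoder worked with multiple formant-like parameters and multichannel firing rates, so A, H, Q, and R become matrices, but the predict/update structure is the same.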