301 research outputs found
Surface electromyographic control of a novel phonemic interface for speech synthesis
Many individuals with minimal movement capabilities use AAC to communicate. These individuals require both an interface with which to construct a message (e.g., a grid of letters) and an input modality with which to select targets. This study evaluated the interaction of two such systems: (a) an input modality using surface electromyography (sEMG) of spared facial musculature, and (b) an onscreen interface from which users select phonemic targets. These systems were evaluated in two experiments: (a) participants without motor impairments used the systems during a series of eight training sessions, and (b) one individual who uses AAC used the systems for two sessions. Both the phonemic interface and the electromyographic cursor show promise for future AAC applications. F31 DC014872 - NIDCD NIH HHS; R01 DC002852 - NIDCD NIH HHS; R01 DC007683 - NIDCD NIH HHS; T90 DA032484 - NIDA NIH HHS. https://www.ncbi.nlm.nih.gov/pubmed/?term=Surface+electromyographic+control+of+a+novel+phonemic+interface+for+speech+synthesis Published versio
Inspiratory oscillatory flow with a portable ventilator: a bench study
INTRODUCTION: We observed an oscillatory flow while ventilating critically ill patients with the Dräger Oxylog 3000™ transport ventilator during interhospital transfer. The phenomenon occurred in paediatric patients or in adult patients with severe airway obstruction ventilated in the pressure-regulated or pressure-controlled mode. As this had not been described previously, we conducted a bench study to investigate the phenomenon. METHODS: An Oxylog 3000™ transport ventilator and a Dräger Medical Evita-4 NeoFlow™ intensive care unit ventilator were connected to a Dräger Medical LS800™ lung simulator. Data were registered by a Datex-S5™ Monitor with a D-fend™ flow and pressure sensor, and were analysed with a laptop using S5-Collect™ software. Clinical conditions were simulated using various ventilatory modes, various ventilator settings, and different filters and endotracheal tubes, and by changing the resistance and compliance. Data were recorded for 258 combinations of patient factors and respirator settings to detect thresholds for the occurrence of the phenomenon and methods to overcome it. RESULTS: Under conditions with high resistance in pressure-regulated ventilation with the Oxylog 3000™, an oscillatory flow during inspiration produced rapid changes of the airway pressure. The phenomenon resulted in a jerky inspiration with peak airway pressures higher than those set on the ventilator. Reducing the inspiratory flow velocity terminated the phenomenon, but resulted in reduced tidal volumes. CONCLUSION: Oscillatory flow with potentially harmful effects may occur during ventilation with the Dräger Oxylog 3000™, especially in conditions with high resistance such as small airways in children (endotracheal tube internal diameter <6 mm) or severe obstructive lung diseases or airway diseases in adult patients.
Intraspeaker Comparisons of Acoustic and Articulatory Variability in American English /r/ Productions
The purpose of this report is to test the hypothesis that speakers utilize an acoustic, rather than articulatory, planning space for speech production. It has been well-documented that many speakers of American English use different tongue configurations to produce /r/ in different phonetic contexts. The acoustic planning hypothesis suggests that although the /r/ configuration varies widely in different contexts, the primary acoustic cue for /r/, a dip in the F3 trajectory, will be less variable due to tradeoffs in articulatory variability, or trading relations, that help maintain a relatively constant F3 trajectory across phonetic contexts. Acoustic data and EMMA articulatory data from seven speakers producing /r/ in different phonetic contexts were analyzed. Visual inspection of the EMMA data at the point of F3 minimum revealed that each speaker appeared to use at least two of three trading relation strategies that would be expected to reduce F3 variability. Articulatory covariance measures confirmed that all seven speakers utilized a trading relation between tongue back height and tongue back horizontal position, six speakers utilized a trading relation between tongue tip height and tongue back height, and the speaker who did not use this latter strategy instead utilized a trading relation between tongue tip height and tongue back horizontal position. Estimates of F3 variability with and without the articulatory covariances indicated that F3 variability would be much higher for all speakers if the articulatory covariances were not utilized. These conclusions were further supported by a comparison of measured F3 variability to F3 variabilities estimated from the pellet data with and without articulatory covariances. In all subjects, the actual F3 variance was significantly lower than the F3 variance estimated without articulatory covariances, further supporting the conclusion that the articulatory trading relations were being used to reduce F3 variability.
Together, these results strongly suggest that the neural control mechanisms underlying speech production make elegant use of trading relations between articulators to maintain a relatively invariant acoustic trace for /r/ across phonetic contexts.
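The comparison of F3 variance "with and without articulatory covariances" can be illustrated with a small sketch. It assumes, purely for illustration, that F3 is a linear function of the articulatory coordinates; the weights and the covariance matrix below are hypothetical numbers, not values from the study, and `f3_variance` is not the authors' estimator.

```python
def f3_variance(weights, cov, use_covariances=True):
    """Variance of a linear estimate sum_i w_i * x_i given the covariance of x.

    With use_covariances=False the off-diagonal terms are dropped, which
    corresponds to treating the articulators as if they varied independently.
    """
    total = 0.0
    for i, wi in enumerate(weights):
        for j, wj in enumerate(weights):
            if i != j and not use_covariances:
                continue  # ignore cross-articulator covariance terms
            total += wi * wj * cov[i][j]
    return total


# Hypothetical example: two articulatory coordinates with negative covariance
# (a trading relation), each contributing equally to F3.
weights = [1.0, 1.0]
cov = [[4.0, -3.0],
       [-3.0, 4.0]]

with_cov = f3_variance(weights, cov, use_covariances=True)
without_cov = f3_variance(weights, cov, use_covariances=False)
print(with_cov, without_cov)
```

In this toy case the negative covariance lowers the estimated variance relative to the independent case, which is the qualitative pattern the abstract reports.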
Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech-motor cortex
This is the published version, also available here: http://dx.doi.org/10.3389/fnins.2011.00065. We conducted a neurophysiological study of attempted speech production in a paralyzed human volunteer using chronic microelectrode recordings. The volunteer suffers from locked-in syndrome, leaving him in a state of near-total paralysis, though he maintains good cognition and sensation. In this study, we investigated the feasibility of supervised classification techniques for prediction of intended phoneme production in the absence of any overt movements including speech. Such classification or decoding ability has the potential to greatly improve the quality-of-life of many people who are otherwise unable to speak by providing a direct communicative link to the general community. We examined the performance of three classifiers on a multi-class discrimination problem in which the items were 38 American English phonemes including monophthong and diphthong vowels and consonants. The three classifiers differed in performance, but averaged between 16 and 21% overall accuracy (chance-level is 1/38 or 2.6%). Further, the distribution of phonemes classified statistically above chance was non-uniform, though 20 of 38 phonemes were classified with statistical significance for all three classifiers. These preliminary results suggest supervised classification techniques are capable of performing large scale multi-class discrimination for attempted speech production and may provide the basis for future communication prostheses.
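The chance level quoted above (1/38, about 2.6%) and the idea of a phoneme being classified "statistically above chance" can be sketched as follows. This is a minimal illustration using an exact one-sided binomial tail; the abstract does not say which test the authors used, and the trial counts in the example are hypothetical.

```python
from math import comb


def chance_level(n_classes):
    """Probability of a correct guess when picking one of n_classes uniformly."""
    return 1.0 / n_classes


def binomial_tail(n_trials, n_correct, p):
    """P(X >= n_correct) for X ~ Binomial(n_trials, p): a one-sided p-value."""
    return sum(
        comb(n_trials, k) * p**k * (1.0 - p) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


p_chance = chance_level(38)
print(round(p_chance, 3))

# Hypothetical counts: 8 correct out of 38 trials for one phoneme.
p_value = binomial_tail(38, 8, p_chance)
print(p_value < 0.001)
```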
Articulatory Tradeoffs Reduce Acoustic Variability During American English /r/ Production
Acoustic and articulatory recordings reveal that speakers utilize systematic articulatory tradeoffs to maintain acoustic stability when producing the phoneme /r/. Distinct articulator configurations used to produce /r/ in various phonetic contexts show systematic tradeoffs between the cross-sectional areas of different vocal tract sections. Analysis of acoustic and articulatory variabilities reveals that these tradeoffs act to reduce acoustic variability, thus allowing large contextual variations in vocal tract shape; these contextual variations in turn apparently reduce the amount of articulatory movement required. These findings contrast with the widely held view that speaking involves a canonical vocal tract shape target for each phoneme. National Institute on Deafness and Other Communication Disorders (1R29-DC02852-02, 5R01-DC01925-04, 1R03-C2576-0l); National Science Foundation (IRI-9310518
LaDIVA: A neurocomputational model providing laryngeal motor control for speech acquisition and production
Many voice disorders are the result of intricate neural and/or biomechanical impairments that are poorly understood. The limited knowledge of their etiological and pathophysiological mechanisms hampers effective clinical management. Behavioral studies have been used concurrently with computational models to better understand typical and pathological laryngeal motor control. Thus far, however, a unified computational framework that quantitatively integrates physiologically relevant models of phonation with the neural control of speech has not been developed. Here, we introduce LaDIVA, a novel neurocomputational model with physiologically based laryngeal motor control. We combined the DIVA model (an established neural network model of speech motor control) with the extended body-cover model (a physics-based vocal fold model). The resulting integrated model, LaDIVA, was validated by comparing its model simulations with behavioral responses to perturbations of auditory vocal fundamental frequency (fo) feedback in adults with typical speech. LaDIVA demonstrated capability to simulate different modes of laryngeal motor control, ranging from short-term (i.e., reflexive) and long-term (i.e., adaptive) auditory feedback paradigms, to generating prosodic contours in speech. Simulations showed that LaDIVA’s laryngeal motor control displays properties of motor equivalence, i.e., LaDIVA could robustly generate compensatory responses to reflexive vocal fo perturbations with varying initial laryngeal muscle activation levels leading to the same output. The model can also generate prosodic contours for studying laryngeal motor control in running speech. LaDIVA can expand the understanding of the physiology of human phonation to enable, for the first time, the investigation of causal effects of neural motor control in the fine structure of the vocal signal.
Affiliation: Weerathunge, Hasini R. Boston University; United States. Affiliation: Alzamendi, Gabriel Alejandro. Universidad Nacional de Entre Ríos. Instituto de Investigación y Desarrollo en Bioingeniería y Bioinformática - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación y Desarrollo en Bioingeniería y Bioinformática; Argentina. Affiliation: Cler, Gabriel J. University of Washington; United States. Affiliation: Guenther, Frank H. Boston University; United States. Affiliation: Stepp, Cara E. Boston University; United States. Affiliation: Zañartu, Matías. Universidad Técnica Federico Santa María; Chil
The helicase Ded1p controls use of near-cognate translation initiation codons in 5' UTRs.
The conserved and essential DEAD-box RNA helicase Ded1p from yeast and its mammalian orthologue DDX3 are critical for the initiation of translation [1]. Mutations in DDX3 are linked to tumorigenesis [2-4] and intellectual disability [5], and the enzyme is targeted by a range of viruses [6]. How Ded1p and its orthologues engage RNAs during the initiation of translation is unknown. Here we show, by integrating transcriptome-wide analyses of translation, RNA structure and Ded1p-RNA binding, that the effects of Ded1p on the initiation of translation are connected to near-cognate initiation codons in 5' untranslated regions. Ded1p associates with the translation pre-initiation complex at the mRNA entry channel and repressing the activity of Ded1p leads to the accumulation of RNA structure in 5' untranslated regions, the initiation of translation from near-cognate start codons immediately upstream of these structures and decreased protein synthesis from the corresponding main open reading frames. The data reveal a program for the regulation of translation that links Ded1p, the activation of near-cognate start codons and mRNA structure. This program has a role in meiosis, in which a marked decrease in the levels of Ded1p is accompanied by the activation of the alternative translation initiation sites that are seen when the activity of Ded1p is repressed. Our observations indicate that Ded1p affects translation initiation by controlling the use of near-cognate initiation codons that are proximal to mRNA structure in 5' untranslated regions.
A Wireless Brain-Machine Interface for Real-Time Speech Synthesis
This is the published version, also available here: http://dx.doi.org/10.1371/journal.pone.0008218.
Background
Brain-machine interfaces (BMIs) involving electrodes implanted into the human cerebral cortex have recently been developed in an attempt to restore function to profoundly paralyzed individuals. Current BMIs for restoring communication can provide important capabilities via a typing process, but unfortunately they are only capable of slow communication rates. In the current study we use a novel approach to speech restoration in which we decode continuous auditory parameters for a real-time speech synthesizer from neuronal activity in motor cortex during attempted speech.
Methodology/Principal Findings
Neural signals recorded by a Neurotrophic Electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome, characterized by near-total paralysis with spared cognition, were transmitted wirelessly across the scalp and used to drive a speech synthesizer. A Kalman filter-based decoder translated the neural signals generated during attempted speech into continuous parameters for controlling a synthesizer that provided immediate (within 50 ms) auditory feedback of the decoded sound. Accuracy of the volunteer's vowel productions with the synthesizer improved quickly with practice, with a 25-percentage-point improvement in average hit rate (from 45% to 70%) and a 46% decrease in average endpoint error from the first to the last block of a three-vowel task.
Conclusions/Significance
Our results support the feasibility of neural prostheses that may have the potential to provide near-conversational synthetic speech output for individuals with severely impaired speech motor control. They also provide an initial glimpse into the functional properties of neurons in speech motor cortical areas.
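For readers unfamiliar with the Kalman filter mentioned in the methodology, the following is a minimal one-dimensional sketch of a single predict/update step. It is a generic textbook form, not the decoder used in the study; the function name `kalman_step` and all parameter values (state transition `a`, observation gain `h`, noise variances `q` and `r`) are assumptions chosen for illustration.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=0.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new observation
    a, h : state-transition and observation coefficients (assumed scalars)
    q, r : process and observation noise variances (assumed values)
    Returns the updated (state estimate, variance).
    """
    # Predict: propagate the estimate and its uncertainty forward in time.
    x_pred = a * x
    p_pred = a * a * p + q

    # Update: blend prediction and observation using the Kalman gain.
    gain = p_pred * h / (h * h * p_pred + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


# Hypothetical usage: start at 0 with unit variance, observe z = 2.
x_est, p_est = kalman_step(x=0.0, p=1.0, z=2.0)
print(x_est, p_est)
```

With equal prior and observation variances, the update lands halfway between the prediction and the observation, which shows how the gain weighs the two sources.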