8 research outputs found

    Brain-to-text: Decoding spoken phrases from phone representations in the brain

    It has long been speculated whether communication between humans and machines based on natural-speech-related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones, or one of a few isolated words. However, until now it has remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity during speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying the cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
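
    As a rough illustration of the kind of pipeline the abstract describes (phone models plus ASR-style decoding), the sketch below runs Viterbi decoding over per-frame Gaussian phone likelihoods. It is not the authors' implementation: the phone inventory, feature dimensionality, and all data are hypothetical placeholders.

```python
# Minimal sketch of a phone-based neural decoder in the spirit of Brain-To-Text:
# per-frame Gaussian phone models over ECoG features plus Viterbi decoding.
# Phone inventory, feature dimensions, and data are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)
phones = ["AH", "B", "K", "T", "sil"]           # toy phone inventory (assumption)
n_feat = 16                                     # e.g. high-gamma power per electrode

# Hypothetical per-phone Gaussian models (would be fit to labelled training frames).
means = rng.normal(size=(len(phones), n_feat))
var = 1.0

def frame_log_likelihoods(frames):
    """Log-likelihood of each frame under each phone's Gaussian model."""
    diff = frames[:, None, :] - means[None, :, :]       # (T, P, F)
    return -0.5 * np.sum(diff ** 2, axis=-1) / var      # (T, P)

def viterbi(log_lik, self_loop=0.9):
    """Most likely phone sequence under simple self-loop/uniform-switch transitions."""
    T, P = log_lik.shape
    log_trans = np.full((P, P), np.log((1 - self_loop) / (P - 1)))
    np.fill_diagonal(log_trans, np.log(self_loop))
    delta = np.zeros((T, P))
    back = np.zeros((T, P), dtype=int)
    delta[0] = log_lik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans      # previous state x current state
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_lik[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [phones[p] for p in reversed(path)]

# Fake neural frames for one utterance; a real system would use ECoG high-gamma features.
frames = rng.normal(size=(50, n_feat))
print(viterbi(frame_log_likelihoods(frames)))
```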

    Keyword Spotting Using Human Electrocorticographic Recordings

    Neural keyword spotting could form the basis of a speech brain-computer interface for menu navigation if it can be done with low latency and high specificity comparable to the “wake-word” functionality of modern voice-activated AI assistant technologies. This study investigated neural keyword spotting using motor representations of speech via invasively recorded electrocorticographic signals as a proof of concept. Neural matched filters were created from monosyllabic consonant-vowel utterances: one keyword utterance and 11 similar non-keyword utterances. These filters were used in an analog of the acoustic keyword spotting problem, applied here for the first time to neural data. The filter templates were cross-correlated with the neural signal, capturing temporal dynamics of neural activation across cortical sites. Neural vocal activity detection (VAD) was used to identify utterance times, and a discriminative classifier was used to determine whether these utterances were the keyword or non-keyword speech. Model performance appeared to be highly related to electrode placement and spatial density. Vowel height (/a/ vs /i/) was poorly discriminated in recordings from sensorimotor cortex, but was highly discriminable using neural features from superior temporal gyrus during self-monitoring. The best-performing neural keyword detection (5 keyword detections with two false positives across 60 utterances) and neural VAD (100% sensitivity, ~1 false detection per 10 utterances) came from high-density (2 mm electrode diameter and 5 mm pitch) recordings from ventral sensorimotor cortex, suggesting that the spatial fidelity and extent of high-density ECoG arrays may be sufficient for speech brain-computer interfaces.
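
    A minimal sketch of the matched-filter idea described above: cross-correlate a multichannel neural keyword template with the ongoing signal and flag peaks. The channel count, sampling rate, template length, and detection threshold are assumptions for illustration, not the study's actual parameters.

```python
# Hedged sketch of neural matched-filter keyword spotting: cross-correlate a
# keyword template (e.g. trial-averaged high-gamma activity per channel) with
# the ongoing multichannel signal and flag peaks above a threshold.
# Channel count, sampling rate, and threshold are illustrative assumptions.
import numpy as np
from scipy.signal import correlate

rng = np.random.default_rng(1)
n_ch, fs = 32, 200                                    # channels, feature rate (Hz)
template = rng.normal(size=(n_ch, int(0.5 * fs)))     # 500 ms keyword template
signal = rng.normal(size=(n_ch, 30 * fs))             # 30 s of ongoing activity

# Sum per-channel cross-correlations to capture the spatiotemporal pattern.
score = np.zeros(signal.shape[1] - template.shape[1] + 1)
for ch in range(n_ch):
    score += correlate(signal[ch], template[ch], mode="valid")

# Normalize and threshold; a discriminative classifier could then vet each hit.
z = (score - score.mean()) / score.std()
detections = np.flatnonzero(z > 4.0)
print(f"{detections.size} candidate keyword events")
```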

    Deep learning as a tool for neural data analysis: Speech classification and cross-frequency coupling in human sensorimotor cortex.

    A fundamental challenge in neuroscience is to understand what structure in the world is represented in spatially distributed patterns of neural activity from multiple single-trial measurements. This is often accomplished by learning simple, linear transformations between neural features and features of the sensory stimuli or motor task. While successful in some early sensory processing areas, linear mappings are unlikely to be ideal tools for elucidating the nonlinear, hierarchical representations of higher-order brain areas during complex tasks, such as the production of speech by humans. Here, we apply deep networks to predict produced speech syllables from a dataset of high-gamma cortical surface electric potentials recorded from human sensorimotor cortex. We find that deep networks had higher decoding prediction accuracy compared to baseline models. Having established that deep networks extract more task-relevant information from neural datasets than linear models (i.e., higher predictive accuracy), we next sought to demonstrate their utility as a data analysis tool for neuroscience. We first show that the deep networks' confusions revealed hierarchical latent structure in the neural data, which recapitulated the underlying articulatory nature of speech motor control. We next broadened the frequency features beyond high gamma and identified a novel high-gamma-to-beta coupling during speech production. Finally, we used deep networks to compare task-relevant information in different neural frequency bands, and found that the high-gamma band contains the vast majority of the information relevant for the speech prediction task, with little to no additional contribution from lower-frequency amplitudes. Together, these results demonstrate the utility of deep networks as a data analysis tool for basic and applied neuroscience.
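
    The deep-versus-linear comparison can be sketched with off-the-shelf classifiers; the snippet below contrasts a small multilayer perceptron with a logistic-regression baseline on synthetic "high-gamma" features. Data shapes, labels, and network sizes are illustrative assumptions, not the paper's architecture.

```python
# Illustrative sketch of the deep-vs-linear comparison on synthetic "high-gamma"
# features: a small multilayer perceptron versus a linear (logistic regression)
# baseline for syllable classification. Data shapes and labels are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
n_trials, n_elec, n_time, n_syll = 600, 64, 20, 8
X = rng.normal(size=(n_trials, n_elec * n_time))   # flattened electrode x time features
y = rng.integers(0, n_syll, size=n_trials)         # syllable labels (random here)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

linear = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
deep = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500).fit(X_tr, y_tr)

print("linear accuracy:", linear.score(X_te, y_te))
print("deep accuracy:  ", deep.score(X_te, y_te))
# On real ECoG data the paper reports higher accuracy for the deep model; with
# random labels, as here, both should hover near chance (1/8).
```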

    Neural decoding of spoken vowels from human sensory-motor cortex with high-density electrocorticography

    No full text

    Boosting brain–computer interfaces with functional electrical stimulation: potential applications in people with locked-in syndrome

    Individuals in a locked-in state live with severe whole-body paralysis that limits their ability to communicate with family and loved ones. Recent advances in brain–computer interface (BCI) technology have presented a potential alternative for these people to communicate by detecting neural activity associated with attempted hand or speech movements and translating the decoded intended movements into a control signal for a computer. A technique that could potentially enrich the communication capacity of BCIs is functional electrical stimulation (FES) of paralyzed limbs and face to restore body and facial movements of paralyzed individuals, allowing body language and facial expression to be added to communication-BCI utterances. Here, we review the current state of the art of existing BCI and FES work in people with paralysis of body and face and propose that a combined BCI-FES approach, which has already proved successful in several applications in stroke and spinal cord injury, can provide a novel and promising mode of communication for locked-in individuals.

    Mapping Language Function and Predicting Cortical Stimulation Results with Intracranial Electroencephalography

    To avoid post-operative language impairments after surgery for drug-resistant epilepsy, clinicians rely primarily on electrocortical stimulation mapping (ESM), but ESM can trigger afterdischarges or clinical seizures and can cause uncomfortable sensations. Moreover, ESM can be time-consuming, and its results are usually all-or-none, complicating their interpretation. These practical limitations have long motivated spatial-temporal analysis of passive intracranial electroencephalographic (iEEG) recordings as an alternative or complementary technique that can map cortical function at all sites simultaneously, resulting in significant time savings without adverse side effects. However, passive iEEG has not yet seen widespread clinical adoption for pre-operative language mapping, largely because its potential advantages over ESM and other language-mapping methods have not been fully realized. The overall goals of this dissertation were to improve and validate passive iEEG as a method for mapping human language function prior to surgical resection for epilepsy and other brain disorders. This was accomplished through three separate aims. First, a spatial-temporal functional mapping (STFM) system was developed and tested for online passive iEEG mapping, providing immediate mapping feedback to both clinicians and researchers. The system output was compared to ESM and to canonical regions of interest in the human language network. In the second aim, the STFM system was used to study the fine temporal dynamics by which Broca’s area is activated and interacts with other areas of the language network during a sentence completion task. This study showed that Broca’s area plays a pivotal role in coordinating the language networks responsible for lexical selection. Finally, the third aim sought to reconcile inconsistencies between the results of STFM and ESM. Agreement between these methods has not been as good for language mapping as for motor mapping, which may be due to propagation of ESM effects to cortical areas connected to the site of stimulation. We used cortico-cortical evoked potentials to estimate the effective connectivity of stimulation sites to other sites in the language network. We found that this method improved the accuracy of STFM in predicting ESM results and helped explain similarities and differences between STFM and ESM language maps.
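
    A toy sketch of the passive spatial-temporal functional mapping idea: z-score trial-averaged task-period high-gamma power against a pre-stimulus baseline for every electrode and flag strongly activated sites. Array shapes, window boundaries, and the z-threshold are assumptions; the actual STFM system adds proper statistics and online visualization.

```python
# Minimal sketch of passive spatial-temporal functional mapping (STFM):
# per-electrode z-scores of trial-averaged task high-gamma power against a
# pre-stimulus baseline. Shapes and the z-threshold are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
n_trials, n_elec, n_time = 40, 96, 300            # trials, electrodes, samples
hg = rng.normal(size=(n_trials, n_elec, n_time))  # high-gamma power per trial

baseline = hg[:, :, :100]                         # pre-stimulus window
task = hg[:, :, 100:]                             # post-stimulus window

mu = baseline.mean(axis=(0, 2), keepdims=True)    # per-electrode baseline mean
sd = baseline.std(axis=(0, 2), keepdims=True)
z = (task.mean(axis=0, keepdims=True) - mu) / sd  # trial-averaged z over time

# Electrodes whose averaged response ever exceeds the threshold; with random
# data this list should be empty, real activations would cross it.
active = np.flatnonzero((z[0] > 3).any(axis=-1))
print("candidate task-activated sites:", active)
```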

    Mapping Sensorimotor Function and Controlling Upper Limb Neuroprosthetics with Electrocorticography

    Electrocorticography (ECoG) occupies a unique intermediate niche between microelectrode recordings of single neurons and recordings of whole-brain activity via functional magnetic resonance imaging (fMRI). ECoG’s combination of high temporal resolution and wide-area coverage makes it an ideal modality both for functional brain mapping and for brain-machine interfaces (BMIs) that control prosthetic devices. This thesis demonstrates the utility of ECoG, particularly in high gamma frequencies (70-120 Hz), for passive online mapping of language and motor behaviors, online control of reaching and grasping of an advanced robotic upper limb, and mapping somatosensory digit representations in the postcentral gyrus. The dissertation begins with a brief discussion of the framework for neuroprosthetic control developed by the collaboration between Johns Hopkins and the JHU Applied Physics Laboratory (JHU/APL). Second, the methodology behind an online spatial-temporal functional mapping (STFM) system is described. Trial-averaged spatiotemporal maps of high gamma activity were computed during a visual naming and a word reading task, and the system output is shown and compared to stimulation mapping. Third, simultaneous and independent ECoG-based control of reaching and grasping is demonstrated with the Modular Prosthetic Limb (MPL). The STFM system was used to identify channels whose high gamma power significantly and selectively increases during either reaching or grasping. Using this technique, two patients were able to rapidly achieve naturalistic control over simple movements of the MPL. Next, high-density ECoG (hdECoG) was used to map cortical responses to mechanical vibration of the fingertips. High gamma responses exhibited a strong yet overlapping somatotopy that was not well replicated in other frequency bands. These responses are strong enough to be detected in single trials and used to classify the stimulated finger with over 98% accuracy. Finally, the role of ECoG in functional mapping and BMI applications is discussed. ECoG occupies a unique role among neural recording modalities as a tool for functional mapping, but it must prove its value relative to stimulation mapping. For BMI, ECoG lags microelectrode arrays, but hdECoG may provide a more robust long-term interface with optimal spacing for sampling relevant cortical representations.
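
    The single-trial finger classification described above can be sketched as a trials-by-electrodes feature matrix fed to a linear discriminant; the code below does this on synthetic data. The somatotopic patterns, trial counts, and noise level are made up, so the printed accuracy only illustrates the pipeline shape, not the >98% reported in the thesis.

```python
# Hedged sketch of single-trial finger classification: predict which fingertip
# was vibrated from per-electrode high-gamma response amplitudes using a simple
# linear discriminant. All data here are synthetic placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_trials, n_elec, n_fingers = 250, 128, 5
labels = rng.integers(0, n_fingers, size=n_trials)

# Give each finger a distinct (made-up) somatotopic high-gamma pattern plus noise.
patterns = rng.normal(size=(n_fingers, n_elec))
X = patterns[labels] + 0.5 * rng.normal(size=(n_trials, n_elec))

clf = LinearDiscriminantAnalysis()
acc = cross_val_score(clf, X, labels, cv=5).mean()
print(f"cross-validated single-trial accuracy: {acc:.2f}")
```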

    Neurophysiological mechanisms of sensorimotor recovery from stroke

    Ischemic stroke often results in the devastating loss of nervous tissue in the cerebral cortex, leading to profound motor deficits when motor territory is lost and ultimately resulting in a substantial reduction in quality of life for the stroke survivor. The International Classification of Functioning, Disability and Health (ICF), developed in 2002 by the World Health Organization (WHO), provides a framework for clinically defining impairment after stroke. While the reduction of burdens due to neurological disease is stated as a mission objective of the National Institute of Neurological Disorders and Stroke (NINDS), recent clinical trials have been unsuccessful in translating preclinical research breakthroughs into actionable therapeutic strategies that make meaningful progress toward this goal. This means that research advancing another NINDS mission is now more important than ever: improving fundamental knowledge about the brain and nervous system in order to illuminate the way forward. Past work in the monkey model of ischemic stroke has suggested there may be a relationship between motor improvements after injury and the ability of the animal to reintegrate sensory and motor information during behavior. This relationship may be subserved by sprouting cortical axonal processes that originate in the spared premotor cortex after motor cortical injury in squirrel monkeys. These axons were observed to grow for relatively long distances (millimeters), changing direction in a way that suggests they specifically navigate around the injury site and reorient toward the spared sensory cortex. Critically, it remains unknown whether such processes ever form functional synapses and, if they do, whether those synapses perform meaningful calculations or other functions during behavior. The intent of this dissertation was to study this phenomenon in both intact rats and rats with a focal ischemia in primary motor cortex (M1) contralateral to the preferred forelimb during a pellet retrieval task. As this proved to be a challenging and resource-intensive endeavor, a primary objective of the dissertation became providing the tools needed to make such a project feasible, including software, hardware, and novel training and behavioral paradigms for the rat model. At the same time, analysis of previous experimental data suggested that plasticity in the neural activity of the bilateral motor cortices of rats performing pellet retrievals after focal M1 ischemia may exhibit its most salient changes, with respect to functional changes in behavior, via mechanisms different from those initially hypothesized. Specifically, a major finding of this dissertation is that evidence of plasticity in the unit activity of bilateral motor cortical areas of the reaching rat is much stronger at the level of population features. These features exhibit changes in dynamics that suggest a shift in network fixed points, which may relate to the stability of filtering performed during behavior. It is therefore predicted that, in order to define recovery by comparison to restitution, a specific type of fixed-point dynamics must be present in the cortical population state. A final suggestion is that the stability or presence of these dynamics is related to the reintegration of sensory information into the cortex, which may relate to the positive impact of physical therapy during rehabilitation in the postacute window. Although many more rats will be needed before any of these findings can be stated definitively, this line of inquiry appears productive for identifying targets related to sensorimotor integration that may enhance the efficacy of future therapeutic strategies.
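
    One hedged way to picture the "population features" mentioned above is to project binned unit activity onto its principal components and examine the resulting low-dimensional trajectories across recovery sessions. The sketch below uses synthetic spike counts; the dissertation's actual analyses and fixed-point arguments are considerably richer.

```python
# Rough sketch of extracting low-dimensional population features from binned
# unit activity, as one might when comparing pre- vs post-lesion dynamics.
# All counts and dimensions are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
n_trials, n_units, n_bins = 80, 60, 50
spikes = rng.poisson(2.0, size=(n_trials, n_units, n_bins)).astype(float)

# Trial-average, then find the dominant population components across time.
mean_rates = spikes.mean(axis=0)                 # (units, time bins)
pca = PCA(n_components=3).fit(mean_rates.T)      # samples = time bins
trajectory = pca.transform(mean_rates.T)         # low-D population trajectory

print("variance explained:", np.round(pca.explained_variance_ratio_, 3))
# Comparing such trajectories (and their apparent fixed points) across recovery
# sessions is one way to quantify the population-level changes described above.
```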