31 research outputs found

    Overt speech decoding from cortical activity: a comparison of different linear methods

    Get PDF
    Introduction: Speech BCIs aim at reconstructing speech in real time from ongoing cortical activity. Ideal BCIs would need to reconstruct the speech audio signal frame by frame on a millisecond timescale. Such approaches require fast computation. In this respect, linear decoders are good candidates and have been widely used in motor BCIs. Yet they have very seldom been studied for speech reconstruction, and never for the reconstruction of articulatory movements from intracranial activity. Here, we compared vanilla linear regression, ridge-regularized linear regression, and partial least squares regression for offline decoding of overt speech from cortical activity. Methods: Two decoding paradigms were investigated: (1) direct decoding of acoustic vocoder features of speech, and (2) indirect decoding of vocoder features through an intermediate articulatory representation chained with a real-time-compatible DNN-based articulatory-to-acoustic synthesizer. The participant's articulatory trajectories were estimated from an electromagnetic-articulography dataset using dynamic time warping. The accuracy of the decoders was evaluated by computing correlations between original and reconstructed features. Results: All linear methods achieved similar performance, well above chance level, albeit without reaching intelligibility. Direct and indirect decoding achieved comparable performance, with an advantage for direct decoding. Discussion: Future work will address the development of an improved neural speech decoder compatible with fast, frame-by-frame speech reconstruction from ongoing activity at a millisecond timescale.
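
    As a rough illustration, the sketch below compares the three families of linear decoders mentioned in this abstract on stand-in data and scores them with the same correlation metric. The feature dimensions, the ridge regularization strength, and the number of PLS components are arbitrary assumptions for illustration, not values from the study.

```python
# Hypothetical sketch: comparing linear, ridge, and PLS decoders on stand-in data.
# X stands for time-lagged cortical features, Y for acoustic vocoder features.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 120))                 # e.g. band-power features x lags
W = rng.standard_normal((120, 25))
Y = X @ W + 0.5 * rng.standard_normal((2000, 25))    # e.g. 25 vocoder parameters

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, shuffle=False)

decoders = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=10.0),
    "pls": PLSRegression(n_components=20),
}
for name, model in decoders.items():
    model.fit(X_tr, Y_tr)
    Y_hat = model.predict(X_te)
    # Pearson correlation between original and reconstructed features, per feature
    r = [np.corrcoef(Y_te[:, k], Y_hat[:, k])[0, 1] for k in range(Y.shape[1])]
    print(f"{name}: mean r = {np.mean(r):.3f}")
```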

    Neurolinguistics Research Advancing Development of a Direct-Speech Brain-Computer Interface

    Get PDF
    A direct-speech brain-computer interface (DS-BCI) acquires neural signals corresponding to imagined speech, then processes and decodes these signals to produce a linguistic output in the form of phonemes, words, or sentences. Recent research has shown the potential of neurolinguistics to enhance decoding approaches to imagined speech with the inclusion of semantics and phonology in experimental procedures. As neurolinguistics research findings are beginning to be incorporated within the scope of DS-BCI research, it is our view that a thorough understanding of imagined speech, and its relationship with overt speech, must be considered an integral feature of research in this field. With a focus on imagined speech, we provide a review of the most important neurolinguistics research informing the field of DS-BCI and suggest how this research may be utilized to improve current experimental protocols and decoding techniques. Our review of the literature supports a cross-disciplinary approach to DS-BCI research, in which neurolinguistics concepts and methods are utilized to aid development of a naturalistic mode of communication. Subject Areas: Cognitive Neuroscience, Computer Science, Hardware Interface.

    Toward a brain-computer interface for speech restoration (Vers une interface cerveau-machine pour la restauration de la parole)

    No full text
    Restoring natural speech in paralyzed and aphasic people could be achieved using a brain-computer interface controlling a speech synthesizer in real time. The aim of this thesis was thus to develop three main steps toward such a proof of concept. First, a prerequisite was to develop a speech synthesizer producing intelligible speech in real time with a reasonable number of control parameters. Here we chose to synthesize speech from movements of the speech articulators, since recent studies suggested that neural activity from the speech motor cortex contains relevant information to decode speech, and especially articulatory features of speech. We thus developed a speech synthesizer that produced intelligible speech from articulatory data. This was achieved by first recording a large dataset of synchronous articulatory and acoustic data in a single speaker. Then, we used machine learning techniques, especially deep neural networks, to build a model able to convert articulatory data into speech. This synthesizer was built to run in real time. Finally, as a first step toward future brain control of this synthesizer, we verified that it could be controlled in real time by several speakers to produce intelligible speech from articulatory movements in a closed-loop paradigm. Second, we investigated the feasibility of decoding speech and articulatory features from neural activity recorded essentially in the speech motor cortex. We built a tool to localize active cortical speech areas online during awake brain surgery at the Grenoble Hospital and tested this system in two patients with brain cancer. Results show that the motor cortex exhibits specific activity during speech production in the beta and gamma bands, activity that is also present during speech imagination. The recorded data could be successfully analyzed to decode speech intention, voicing activity, and the trajectories of the main articulators of the vocal tract above chance level. Finally, we addressed ethical issues that arise with the development and use of brain-computer interfaces. We considered three levels of ethical questioning, dealing respectively with the animal, the human being, and the human species.
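
    As a rough illustration of the kind of band-specific features such decoding relies on, the sketch below computes windowed beta- and gamma-band power from a single simulated cortical channel. The sampling rate, band edges, and window length are assumptions for illustration, not the recording parameters used in the thesis.

```python
# Hypothetical sketch: beta/gamma band-power features from one cortical channel.
# Band edges, sampling rate, and window length are illustrative assumptions.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1000.0                                   # assumed sampling rate (Hz)
bands = {"beta": (13, 30), "gamma": (70, 150)}

def band_power(signal, low, high, fs, win=0.1):
    """Band-pass filter, then average power in non-overlapping windows."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, signal)
    n = int(win * fs)
    n_win = len(filtered) // n
    return (filtered[: n_win * n] ** 2).reshape(n_win, n).mean(axis=1)

ecog = np.random.randn(10 * int(fs))          # stand-in for a recorded channel
features = np.column_stack(
    [band_power(ecog, lo, hi, fs) for lo, hi in bands.values()]
)
print(features.shape)   # (windows, 2): one beta and one gamma feature per window
```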

    Introduction

    No full text

    Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces

    No full text
    Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the positions of sensors glued to different speech articulators into acoustic parameters, which are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained, as assessed by a perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed real-time control of the synthesizer by both the reference speaker and new speakers in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results pave the way for future speech BCI applications using such an articulatory-based speech synthesizer.
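
    A minimal sketch of an articulatory-to-acoustic mapping of this kind is given below: a small feed-forward network maps one frame of EMA sensor coordinates to a frame of vocoder parameters. The layer sizes and the numbers of articulatory (12) and acoustic (25) parameters are illustrative assumptions, not the architecture reported in the paper.

```python
# Hypothetical sketch of an articulatory-to-acoustic mapping: a small feed-forward
# network from EMA sensor coordinates to vocoder parameters, applied frame by frame.
# Dimensions and layer sizes are assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn

class ArticulatoryToAcoustic(nn.Module):
    def __init__(self, n_ema=12, n_acoustic=25, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_ema, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_acoustic),
        )

    def forward(self, x):
        return self.net(x)

model = ArticulatoryToAcoustic()
ema_frame = torch.randn(1, 12)        # one frame of sensor coordinates
acoustic = model(ema_frame)           # vocoder parameters for that frame
print(acoustic.shape)                 # torch.Size([1, 25])
```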

    Tongue Tracking in Ultrasound Images using EigenTongue Decomposition and Artificial Neural Networks

    No full text
    This paper describes a machine learning approach for automatically extracting the tongue contour in ultrasound images. The method is developed in the context of visual articulatory biofeedback for speech therapy. The goal is to provide a speaker with an intuitive visualization of his or her tongue movement, in real time, and with minimum human intervention. Contrary to the most widely used techniques, which are based on active contours, the proposed method aims at exploiting the information of all image pixels to infer the tongue contour. For that purpose, a compact representation of each image is extracted using a PCA-based decomposition technique (named EigenTongue). Artificial neural networks are then used to convert the extracted visual features into control parameters of a PCA-based tongue contour model. The proposed method is evaluated on 9 speakers, using data recorded with the ultrasound probe held manually (as in the targeted application). Speaker-dependent experiments demonstrated the effectiveness of the proposed method (with an average error of ~1.3 mm when training from 80 manually annotated images), even when the tongue contour is poorly imaged. Performance was significantly lower in speaker-independent experiments (i.e., when estimating contours for an unknown speaker), likely due to anatomical differences across speakers.
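
    The sketch below illustrates an EigenTongue-style pipeline under stated assumptions: PCA compresses flattened ultrasound frames into a compact basis, and a small neural network maps those coefficients to the coefficients of a PCA-based contour model. The image resolution, component counts, and use of scikit-learn estimators are assumptions for illustration only.

```python
# Hypothetical sketch of an EigenTongue-style pipeline: PCA features of ultrasound
# frames -> neural network -> coefficients of a PCA-based tongue-contour model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
images = rng.random((80, 64 * 64))      # 80 annotated frames, flattened 64x64 pixels
contours = rng.random((80, 4))          # 4 coefficients of the contour model

image_pca = PCA(n_components=30).fit(images)        # "EigenTongue" basis
features = image_pca.transform(images)

mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
mlp.fit(features, contours)

new_frame = rng.random((1, 64 * 64))
contour_coeffs = mlp.predict(image_pca.transform(new_frame))
print(contour_coeffs.shape)             # (1, 4): predicted contour-model coefficients
```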

    Key considerations in designing a speech brain-computer interface

    No full text
    Restoring communication in case of aphasia is a key challenge for neurotechnologies. To this end, brain-computer strategies can be envisioned to allow artificial speech synthesis from the continuous decoding of neural signals underlying speech imagination. Such speech brain-computer interfaces do not exist yet, and their design involves three key choices: the choice of appropriate brain regions to record neural activity from, the choice of an appropriate recording technique, and the choice of a neural decoding scheme in association with an appropriate speech synthesis method. These key considerations are discussed here in light of (1) the current understanding of the functional neuroanatomy of cortical areas underlying overt and covert speech production, (2) the available literature making use of a variety of brain recording techniques to better characterize and address the challenge of decoding cortical speech signals, and (3) the different speech synthesis approaches that can be considered depending on the level of speech representation (phonetic, acoustic, or articulatory) envisioned to be decoded at the core of a speech BCI paradigm.

    Robust Articulatory Speech Synthesis using Deep Neural Networks for BCI Applications

    No full text
    Brain-Computer Interfaces (BCIs) usually rely on typing strategies to restore communication for paralyzed and aphasic people. A more natural way would be a speech BCI directly controlling a speech synthesizer. Toward this goal, a prerequisite is the development of a synthesizer that should (i) produce intelligible speech, (ii) run in real time, (iii) depend on as few parameters as possible, and (iv) be robust to error fluctuations in the control parameters. In this context, we describe here an articulatory-to-acoustic mapping approach based on a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded synchronously with the produced speech sounds. On this corpus, the DNN-based model provided a speech synthesis quality (as assessed by automatic speech recognition and behavioral testing) comparable to a state-of-the-art Gaussian mixture model (GMM), yet showed higher robustness when noise was added to the EMA coordinates. Moreover, to envision BCI applications, this robustness was also assessed when the space covered by the 12 original articulatory parameters was reduced to 7 parameters using deep auto-encoders (DAE). Given that this method can be implemented in real time, DNN-based articulatory speech synthesis seems a good candidate for speech BCI applications.
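
    As an illustration of the dimensionality-reduction step, the sketch below defines a deep auto-encoder compressing 12 articulatory parameters into a 7-dimensional bottleneck. The layer sizes and activations are assumptions, since the abstract does not specify the DAE architecture.

```python
# Hypothetical sketch of the dimensionality-reduction step: a deep auto-encoder
# compressing 12 EMA articulatory parameters into a 7-dimensional bottleneck.
# Layer sizes and activations are illustrative assumptions.
import torch
import torch.nn as nn

class ArticulatoryAutoencoder(nn.Module):
    def __init__(self, n_in=12, n_bottleneck=7, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_in, hidden), nn.Tanh(),
            nn.Linear(hidden, n_bottleneck),
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_bottleneck, hidden), nn.Tanh(),
            nn.Linear(hidden, n_in),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

ae = ArticulatoryAutoencoder()
ema = torch.randn(8, 12)                             # a batch of EMA frames
reconstruction = ae(ema)
loss = nn.functional.mse_loss(reconstruction, ema)   # training criterion
print(ae.encoder(ema).shape)                         # torch.Size([8, 7]) compact control space
```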