Advancing Speech Recognition With No Speech Or With Noisy Speech
In this paper we demonstrate end-to-end continuous speech recognition (CSR)
using electroencephalography (EEG) signals with no speech signal as input. An
attention-based automatic speech recognition (ASR) system and a connectionist
temporal classification (CTC) based ASR system were implemented for performing
recognition. We further demonstrate CSR for noisy speech by fusing acoustic
features with EEG features.

Comment: Extended version of our accepted IEEE EUSIPCO 2019 paper with
additional results for CTC-model-based recognition. arXiv admin note:
substantial text overlap with arXiv:1906.08045, arXiv:1906.0804
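As a concrete illustration of the fusion idea above, the following is a minimal PyTorch sketch (not the authors' code): per-frame EEG and acoustic features are concatenated and fed to a recurrent encoder trained with CTC loss. All dimensions, layer sizes, and the 29-class character vocabulary are illustrative assumptions.

import torch
import torch.nn as nn

class FusionCTC(nn.Module):
    def __init__(self, eeg_dim=30, acoustic_dim=13, hidden=128, n_chars=29):
        super().__init__()
        self.encoder = nn.GRU(eeg_dim + acoustic_dim, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, n_chars)  # characters + CTC blank

    def forward(self, eeg, acoustic):
        # eeg: (batch, time, eeg_dim), acoustic: (batch, time, acoustic_dim)
        fused = torch.cat([eeg, acoustic], dim=-1)  # per-frame concatenation
        out, _ = self.encoder(fused)
        return self.proj(out).log_softmax(-1)       # (batch, time, n_chars)

model = FusionCTC()
ctc = nn.CTCLoss(blank=0)
eeg = torch.randn(2, 100, 30)
acoustic = torch.randn(2, 100, 13)
targets = torch.randint(1, 29, (2, 20))            # label indices, 0 is blank
log_probs = model(eeg, acoustic).transpose(0, 1)   # CTCLoss wants (time, batch, classes)
loss = ctc(log_probs, targets,
           torch.full((2,), 100, dtype=torch.long),
           torch.full((2,), 20, dtype=torch.long))
loss.backward()

The same concatenation applies when the second stream is video rather than audio, as in the EEG-and-video paper listed below.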
Predicting Video features from EEG and Vice versa
In this paper we explore predicting facial or lip video features from
electroencephalography (EEG) features and predicting EEG features from recorded
facial or lip video frames using deep learning models. The subjects were asked
to read out loud English sentences shown to them on a computer screen while
their EEG signals and facial video frames were recorded simultaneously. Our
model was able to generate very broad characteristics of the facial or lip
video frame from input EEG features. Our results demonstrate a first step
towards synthesizing high-quality facial or lip video from recorded EEG
features. We demonstrate results for a data set consisting of seven subjects.

Comment: under review
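A minimal sketch of the two prediction directions described above, treated as paired sequence-to-sequence regression; the dimensions and the LSTM choice are illustrative assumptions, since the abstract does not specify the architecture.

import torch
import torch.nn as nn

class SeqRegressor(nn.Module):
    """Maps one per-frame feature sequence to another of the same length."""
    def __init__(self, in_dim, out_dim, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):  # x: (batch, time, in_dim)
        h, _ = self.rnn(x)
        return self.head(h)

eeg_to_video = SeqRegressor(in_dim=30, out_dim=50)  # EEG -> lip/facial features
video_to_eeg = SeqRegressor(in_dim=50, out_dim=30)  # the "vice versa" direction
eeg = torch.randn(2, 120, 30)
video = torch.randn(2, 120, 50)
loss = (nn.MSELoss()(eeg_to_video(eeg), video)
        + nn.MSELoss()(video_to_eeg(video), eeg))
loss.backward()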
Continuous Silent Speech Recognition using EEG
In this paper we explore continuous silent speech recognition using
electroencephalography (EEG) signals. We implemented a connectionist temporal
classification (CTC) automatic speech recognition (ASR) model to translate to
text EEG signals recorded while subjects silently read English sentences in
their mind without producing any voice. Our results demonstrate the
feasibility of using EEG signals for performing continuous silent speech
recognition. We demonstrate our results for a limited English vocabulary
consisting of 30 unique sentences.
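At inference, a CTC-trained model such as the one described here is typically decoded greedily: take the best class per frame, collapse repeated labels, then drop blanks. A minimal sketch with a toy four-class alphabet:

import torch

def ctc_greedy_decode(log_probs, blank=0):
    # log_probs: (time, n_classes) for a single utterance
    best = log_probs.argmax(dim=-1).tolist()
    # Collapse consecutive repeats, then remove the blank symbol.
    collapsed = [p for i, p in enumerate(best) if i == 0 or p != best[i - 1]]
    return [p for p in collapsed if p != blank]

log_probs = torch.randn(50, 4).log_softmax(-1)  # 50 frames, blank + 3 labels
print(ctc_greedy_decode(log_probs))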
EEG based Continuous Speech Recognition using Transformers
In this paper we investigate continuous speech recognition using
electroencephalography (EEG) features with a recently introduced end-to-end
transformer-based automatic speech recognition (ASR) model. Our results
demonstrate that the transformer-based model trains faster than recurrent
neural network (RNN) based sequence-to-sequence EEG models and performs better
at inference time for a smaller test-set vocabulary, but as the vocabulary
size increases, the RNN-based models outperform the transformer-based model on
a limited English vocabulary.
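A minimal sketch of a transformer-encoder model over EEG features, of the kind compared above; the depth, width, head count, and the omission of positional encodings are simplifying assumptions, not the paper's hyperparameters.

import torch
import torch.nn as nn

class TransformerEEGASR(nn.Module):
    def __init__(self, eeg_dim=30, d_model=128, n_chars=29):
        super().__init__()
        self.embed = nn.Linear(eeg_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.proj = nn.Linear(d_model, n_chars)

    def forward(self, eeg):  # eeg: (batch, time, eeg_dim)
        return self.proj(self.encoder(self.embed(eeg))).log_softmax(-1)

out = TransformerEEGASR()(torch.randn(2, 100, 30))
print(out.shape)  # (2, 100, 29), per-frame character posteriors for CTC

Unlike the recurrent encoder, self-attention processes all frames in parallel, which is the usual explanation for the faster training the abstract reports.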
Speech Recognition using EEG signals recorded using dry electrodes
In this paper, we demonstrate speech recognition using electroencephalography
(EEG) signals obtained using dry electrodes on a limited English vocabulary
consisting of three vowels and one word, using a deep learning model. We
demonstrate a test accuracy of 79.07 percent on a subset vocabulary consisting
of two English vowels. Our results demonstrate the feasibility of using EEG
signals recorded with dry electrodes for performing the task of speech
recognition.
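For an isolated-unit vocabulary like this (vowels plus a single word), recognition reduces to classifying a fixed-length EEG window. A minimal sketch under assumed dimensions; the abstract does not specify the deep learning model used.

import torch
import torch.nn as nn

class EEGClassifier(nn.Module):
    def __init__(self, eeg_dim=30, hidden=64, n_classes=4):
        super().__init__()  # n_classes=4: e.g. three vowels + one word
        self.rnn = nn.GRU(eeg_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, eeg):  # eeg: (batch, time, eeg_dim)
        _, h = self.rnn(eeg)      # final hidden state summarizes the window
        return self.head(h[-1])   # class logits

logits = EEGClassifier()(torch.randn(8, 150, 30))
pred = logits.argmax(dim=-1)      # one label per EEG window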
Continuous Speech Recognition using EEG and Video
In this paper we investigate whether electroencephalography (EEG) features
can be used to improve the performance of continuous visual speech recognition
systems. We implemented a connectionist temporal classification (CTC) based
end-to-end automatic speech recognition (ASR) model for performing recognition.
Our results demonstrate that EEG features are helpful in enhancing the
performance of continuous visual speech recognition systems.

Comment: In preparation for submission to EUSIPCO 2020. arXiv admin note: text
overlap with arXiv:1911.11610, arXiv:1911.0426
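One practical detail such fusion requires (an assumption here, not a step the abstract spells out) is aligning the two streams, since EEG features and video frames arrive at different rates. A minimal sketch using linear interpolation, with illustrative rates:

import torch
import torch.nn.functional as F

eeg = torch.randn(1, 30, 2500)   # (batch, eeg_dim, samples), e.g. ~250 Hz
video = torch.randn(1, 50, 300)  # (batch, video_dim, frames), e.g. ~30 fps

# Resample EEG features to the video frame rate, then fuse per frame.
eeg_aligned = F.interpolate(eeg, size=video.shape[-1], mode="linear",
                            align_corners=False)
fused = torch.cat([eeg_aligned, video], dim=1)  # (1, 80, 300)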
Speech Synthesis using EEG
In this paper we demonstrate speech synthesis using different
electroencephalography (EEG) feature sets recently introduced in [1]. We make
use of a recurrent neural network (RNN) regression model to predict acoustic
features directly from EEG features. We demonstrate our results using EEG
features recorded in parallel with spoken speech as well as EEG recorded while
subjects listened to utterances. We provide EEG-based speech synthesis results
for four subjects in this paper, and our results demonstrate the feasibility
of synthesizing speech directly from EEG features.

Comment: Accepted for publication at IEEE ICASSP 2020
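A minimal sketch of the regression setup described above: a recurrent model maps per-frame EEG features to acoustic features, which a separate vocoder would then convert to a waveform. The dimensions and the GRU choice are illustrative assumptions.

import torch
import torch.nn as nn

class EEGToAcoustic(nn.Module):
    def __init__(self, eeg_dim=30, acoustic_dim=13, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(eeg_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, acoustic_dim)

    def forward(self, eeg):  # eeg: (batch, time, eeg_dim)
        h, _ = self.rnn(eeg)
        return self.head(h)  # per-frame acoustic feature estimates

model = EEGToAcoustic()
pred = model(torch.randn(4, 200, 30))
loss = nn.MSELoss()(pred, torch.randn(4, 200, 13))
loss.backward()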
Voice Activity Detection in presence of background noise using EEG
In this paper we demonstrate that the performance of a voice activity
detection (VAD) system operating in the presence of background noise can be
improved by concatenating acoustic input features with electroencephalography
(EEG) features. We also demonstrate that VAD using only EEG features shows
better performance than VAD using only acoustic features in the presence of
background noise. We implemented a recurrent neural network (RNN) based VAD
system and demonstrate our results for two different data sets recorded under
different noise conditions. We finally demonstrate the ability to predict from
EEG features whether a person wishes to continue speaking a sentence or not.

Comment: In preparation for submission to EUSIPCO 2020. arXiv admin note: text
overlap with arXiv:1906.08871, arXiv:1909.0913
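A minimal sketch of the concatenated-feature VAD described above: per-frame acoustic and EEG features are stacked, and an RNN emits a speech/non-speech decision for every frame. The dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class EEGVad(nn.Module):
    def __init__(self, acoustic_dim=13, eeg_dim=30, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(acoustic_dim + eeg_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, acoustic, eeg):
        h, _ = self.rnn(torch.cat([acoustic, eeg], dim=-1))
        return self.head(h).squeeze(-1)  # per-frame speech/non-speech logits

vad = EEGVad()
logits = vad(torch.randn(2, 300, 13), torch.randn(2, 300, 30))
labels = torch.randint(0, 2, (2, 300)).float()
loss = nn.BCEWithLogitsLoss()(logits, labels)
loss.backward()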
Spoken Speech Enhancement using EEG
In this paper we demonstrate spoken speech enhancement from
electroencephalography (EEG) signals using a generative adversarial network
(GAN) based model, a gated recurrent unit (GRU) regression based model, a
temporal convolutional network (TCN) regression model, and finally a mixed
TCN-GRU regression model.
We compare our EEG-based speech enhancement results with the traditional log
minimum mean-square error (log-MMSE) speech enhancement algorithm, and our
proposed methods demonstrate significant improvement in speech enhancement
quality compared to the traditional method. Our overall results demonstrate
that EEG features can be used to clean speech recorded in the presence of
background noise. To the best of our knowledge, this is the first time spoken
speech enhancement has been demonstrated using EEG features recorded in
parallel with spoken speech.
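A minimal sketch of the adversarial variant, assuming a frame-level setup: a generator maps noisy acoustic features concatenated with EEG features to clean-speech features, while a discriminator scores real against generated frames. The MLP layers, dimensions, and the L1 reconstruction term are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(13 + 30, 128), nn.ReLU(), nn.Linear(128, 13))
D = nn.Sequential(nn.Linear(13, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

noisy = torch.randn(32, 13)  # per-frame noisy acoustic features
eeg = torch.randn(32, 30)    # time-aligned EEG features
clean = torch.randn(32, 13)  # per-frame clean targets

# Discriminator step: real clean frames vs. generated frames.
fake = G(torch.cat([noisy, eeg], dim=-1))
loss_d = (bce(D(clean), torch.ones(32, 1))
          + bce(D(fake.detach()), torch.zeros(32, 1)))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator and stay close to the clean target.
loss_g = bce(D(fake), torch.ones(32, 1)) + nn.functional.l1_loss(fake, clean)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()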
Generating EEG features from Acoustic features
In this paper we demonstrate predicting electroencephalography (EEG) features
from acoustic features using a recurrent neural network (RNN) based regression
model and a generative adversarial network (GAN). We predict various types of
EEG features from acoustic features. We compare our results with the
previously studied problem of speech synthesis using EEG, and our results
demonstrate that EEG features can be generated from acoustic features with
lower root mean square error (RMSE) and normalized RMSE values than generating
acoustic features from EEG features (i.e., speech synthesis using EEG) when
tested on the same data sets.
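The comparison metric named above can be stated concretely. Here is a minimal sketch of RMSE and a range-normalized RMSE; normalizing by the target's range is one common convention, assumed here since the abstract does not define it.

import torch

def rmse(pred, target):
    return torch.sqrt(torch.mean((pred - target) ** 2))

def normalized_rmse(pred, target):
    # Normalize by the range of the reference features.
    return rmse(pred, target) / (target.max() - target.min())

pred, target = torch.randn(200, 30), torch.randn(200, 30)
print(rmse(pred, target).item(), normalized_rmse(pred, target).item())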