Speech Recognition with no speech or with noisy speech
The performance of automatic speech recognition (ASR) systems degrades in the
presence of noisy speech. This paper demonstrates that using
electroencephalography (EEG) features can help ASR systems overcome this
performance loss. The paper also shows that distillation training of ASR
systems using EEG features improves their performance. Finally, we demonstrate
the ability to recognize words from EEG with no speech signal, with high
accuracy, on a limited English vocabulary.
Comment: Accepted for ICASSP 201
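One way to combine EEG features with acoustic features, as in the noisy-speech experiments above, is frame-level concatenation after resampling the EEG stream to the acoustic frame rate. The sketch below is illustrative only; the feature dimensions, frame counts, and function name are invented assumptions, not the authors' implementation:

```python
import numpy as np

def fuse_features(acoustic: np.ndarray, eeg: np.ndarray) -> np.ndarray:
    """Concatenate acoustic and EEG feature frames along the feature axis.

    Both inputs are (time, dim) arrays; the EEG stream is linearly
    interpolated to the acoustic frame rate before concatenation.
    """
    t_acoustic = acoustic.shape[0]
    # Resample each EEG dimension onto the acoustic time axis.
    src = np.linspace(0.0, 1.0, eeg.shape[0])
    dst = np.linspace(0.0, 1.0, t_acoustic)
    eeg_resampled = np.stack(
        [np.interp(dst, src, eeg[:, d]) for d in range(eeg.shape[1])], axis=1
    )
    return np.concatenate([acoustic, eeg_resampled], axis=1)

# Illustrative shapes: 100 MFCC frames of dim 13, 250 EEG frames of dim 30.
mfcc = np.random.randn(100, 13)
eeg = np.random.randn(250, 30)
fused = fuse_features(mfcc, eeg)
print(fused.shape)  # (100, 43)
```

The fused frames can then be fed to any sequence model in place of acoustic features alone.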
Advancing Speech Recognition With No Speech Or With Noisy Speech
In this paper we demonstrate end-to-end continuous speech recognition (CSR)
using electroencephalography (EEG) signals with no speech signal as input. An
attention-based automatic speech recognition (ASR) system and a connectionist
temporal classification (CTC) based ASR system were implemented for performing
recognition. We further demonstrate CSR for noisy speech by fusing the
acoustic features with EEG features.
Comment: Extended version of our accepted IEEE EUSIPCO 2019 paper with
additional results for CTC model based recognition. arXiv admin note:
substantial text overlap with arXiv:1906.08045, arXiv:1906.0804
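A CTC-based recognizer like the one described above ultimately turns per-frame label scores into a text sequence. A minimal sketch of best-path (greedy) CTC decoding follows; the three-symbol label set and score matrix are toy values invented for illustration:

```python
import numpy as np

BLANK = 0  # index of the CTC blank symbol

def ctc_greedy_decode(scores: np.ndarray) -> list:
    """Best-path CTC decoding: take the argmax label per frame,
    collapse consecutive repeats, then drop blanks."""
    path = scores.argmax(axis=1)
    decoded = []
    prev = None
    for label in path:
        if label != prev and label != BLANK:
            decoded.append(int(label))
        prev = label
    return decoded

# Toy per-frame scores over {blank=0, 'a'=1, 'b'=2} for 6 frames.
scores = np.array([
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.8, 0.1],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a (new emission after the blank)
    [0.1, 0.1, 0.8],    # b
    [0.8, 0.1, 0.1],    # blank
])
print(ctc_greedy_decode(scores))  # [1, 1, 2], i.e. "aab"
```

The blank symbol is what lets CTC emit the same label twice in a row, which is why the third frame separates the two 'a' emissions.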
Speech Recognition With No Speech Or With Noisy Speech Beyond English
In this paper we demonstrate continuous noisy speech recognition on a limited
Chinese vocabulary using a connectionist temporal classification (CTC) model
with electroencephalography (EEG) features and no speech signal as input. We
further demonstrate continuous noisy speech recognition with a single CTC
model on a limited joint English and Chinese vocabulary, again using EEG
features with no speech signal as input. We present results for the various
EEG feature sets recently introduced in [1], and we also propose a new deep
learning architecture that can perform continuous speech recognition from raw
EEG signals on the limited joint English and Chinese vocabulary.
Comment: arXiv admin note: text overlap with arXiv:1906.0887
Predicting Video features from EEG and Vice versa
In this paper we explore predicting facial or lip video features from
electroencephalography (EEG) features, and predicting EEG features from
recorded facial or lip video frames, using deep learning models. The subjects
were asked to read out loud English sentences shown to them on a computer
screen while their EEG signals and facial video frames were recorded
simultaneously. Our model was able to generate very broad characteristics of
the facial or lip video frame from input EEG features. Our results demonstrate
a first step towards synthesizing high-quality facial or lip video from
recorded EEG features. We demonstrate results for a data set consisting of
seven subjects.
Comment: under review
Continuous Silent Speech Recognition using EEG
In this paper we explore continuous silent speech recognition using
electroencephalography (EEG) signals. We implemented a connectionist temporal
classification (CTC) automatic speech recognition (ASR) model to translate to
text EEG signals recorded while subjects read English sentences in their mind
without producing any voice. Our results demonstrate the feasibility of using
EEG signals for performing continuous silent speech recognition. We
demonstrate our results on a limited English vocabulary consisting of 30
unique sentences.
Spoken Speech Enhancement using EEG
In this paper we demonstrate spoken speech enhancement using
electroencephalography (EEG) signals with a generative adversarial network
(GAN) based model, a gated recurrent unit (GRU) regression based model, a
temporal convolutional network (TCN) regression model, and finally a mixed
TCN-GRU regression model.
We compare our EEG-based speech enhancement results with the traditional log
minimum mean-square error (log-MMSE) speech enhancement algorithm, and our
proposed methods demonstrate a significant improvement in speech enhancement
quality over the traditional method. Our overall results demonstrate that EEG
features can be used to clean speech recorded in the presence of background
noise. To the best of our knowledge, this is the first time spoken speech
enhancement has been demonstrated using EEG features recorded in parallel with
spoken speech.
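A GRU regression model of the kind listed above maps each input frame to an estimate of the corresponding clean-speech frame. Below is a minimal NumPy sketch of the forward pass only, with randomly initialized (untrained) weights and invented dimensions; the actual models in the paper would be trained in a deep learning framework:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRURegressor:
    """Single-layer GRU plus a linear output layer: maps a sequence of
    noisy input frames to a sequence of predicted clean-speech frames."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        def w(rows, cols):
            return 0.1 * rng.standard_normal((rows, cols))
        self.Wz, self.Uz = w(hidden_dim, in_dim), w(hidden_dim, hidden_dim)
        self.Wr, self.Ur = w(hidden_dim, in_dim), w(hidden_dim, hidden_dim)
        self.Wh, self.Uh = w(hidden_dim, in_dim), w(hidden_dim, hidden_dim)
        self.Wo = w(out_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, frames):
        h = np.zeros(self.hidden_dim)
        outputs = []
        for x in frames:
            z = sigmoid(self.Wz @ x + self.Uz @ h)       # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)       # reset gate
            h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))
            h = (1.0 - z) * h + z * h_tilde              # new hidden state
            outputs.append(self.Wo @ h)                  # clean-frame estimate
        return np.stack(outputs)

# Illustrative dims: 43-dim fused input, 64 hidden units, 13-dim output.
model = GRURegressor(in_dim=43, hidden_dim=64, out_dim=13)
noisy = np.random.default_rng(1).standard_normal((50, 43))
clean_est = model.forward(noisy)
print(clean_est.shape)  # (50, 13)
```

Training would minimize a frame-wise mean-squared error between `clean_est` and features extracted from the clean speech.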
Speech Recognition using EEG signals recorded using dry electrodes
In this paper, we demonstrate speech recognition using electroencephalography
(EEG) signals obtained using dry electrodes on a limited English vocabulary
consisting of three vowels and one word, using a deep learning model. We
demonstrate a test accuracy of 79.07 percent on a subset vocabulary consisting
of two English vowels. Our results demonstrate the feasibility of using EEG
signals recorded using dry electrodes for performing the task of speech
recognition.
EEG based Continuous Speech Recognition using Transformers
In this paper we investigate continuous speech recognition from
electroencephalography (EEG) features using a recently introduced end-to-end
transformer based automatic speech recognition (ASR) model. Our results
demonstrate that the transformer based model trains faster than recurrent
neural network (RNN) based sequence-to-sequence EEG models and performs better
at inference time for a smaller test-set vocabulary, but as the vocabulary
size increases, the RNN based models outperform the transformer based model on
a limited English vocabulary.
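The operation that lets a transformer train faster than an RNN is self-attention, which relates all time steps to each other in parallel rather than stepping through the sequence. A minimal sketch of single-head scaled dot-product attention over a toy EEG feature sequence (all shapes are illustrative assumptions):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query position attends over all key positions at once,
    so the whole sequence is processed in parallel."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (t_q, t_k)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d = 8, 16                       # 8 EEG frames, 16-dim embeddings
x = rng.standard_normal((T, d))
out, attn = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)  # (8, 16)
```

Each row of `attn` is a probability distribution over the input frames, which is also why attention weights are often inspected to see which time steps the model relies on.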
Continuous Speech Recognition using EEG and Video
In this paper we investigate whether electroencephalography (EEG) features
can be used to improve the performance of continuous visual speech recognition
systems. We implemented a connectionist temporal classification (CTC) based
end-to-end automatic speech recognition (ASR) model for performing recognition.
Our results demonstrate that EEG features are helpful in enhancing the
performance of continuous visual speech recognition systems.
Comment: In preparation for submission to EUSIPCO 2020. arXiv admin note: text
overlap with arXiv:1911.11610, arXiv:1911.0426
Speech Synthesis using EEG
In this paper we demonstrate speech synthesis using the different
electroencephalography (EEG) feature sets recently introduced in [1]. We make
use of a recurrent neural network (RNN) regression model to predict acoustic
features directly from EEG features. We demonstrate our results using EEG
features recorded in parallel with spoken speech as well as EEG recorded in
parallel while subjects listened to utterances. We provide EEG-based speech
synthesis results for four subjects in this paper, and our results demonstrate
the feasibility of synthesizing speech directly from EEG features.
Comment: Accepted for publication at IEEE ICASSP 202
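Regression from EEG features to acoustic features, as described above, amounts to minimizing a mean-squared error between predicted and target frames. The toy sketch below substitutes a linear model and synthetic data for the RNN and real recordings, purely to illustrate the training objective (all data and dimensions are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 EEG frames (30-dim) and time-aligned
# acoustic target frames (13-dim, e.g. MFCCs). Real data would come
# from parallel EEG/speech recordings.
eeg = rng.standard_normal((200, 30))
true_W = rng.standard_normal((30, 13))
acoustic = eeg @ true_W + 0.01 * rng.standard_normal((200, 13))

# Linear regression trained by gradient descent on mean-squared error,
# the same objective an RNN regression model would minimize.
W = np.zeros((30, 13))
lr = 0.05
for _ in range(2000):
    pred = eeg @ W
    grad = eeg.T @ (pred - acoustic) / len(eeg)
    W -= lr * grad

mse = np.mean((eeg @ W - acoustic) ** 2)
print(round(mse, 4))  # small residual error near the noise floor
```

The predicted acoustic features would then be passed to a vocoder to produce a waveform.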