1,025 research outputs found

Integrating user-centred design in the development of a silent speech interface based on permanent magnetic articulography

Author: Bai Jie
Cheah Lam A.
Ell Stephen R.
Fagan Michael J.
Gilbert James M.
Gonzalez Jose A.
Green Phil D.
Moore Roger K.
Rychenko Sergey I.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Abstract: A new wearable silent speech interface (SSI) based on Permanent Magnetic Articulography (PMA) was developed with the involvement of end users in the design process. Hence, desirable features such as appearance, portability, ease of use and light weight were integrated into the prototype. The aim of this paper is to address the challenges faced and the design considerations addressed during the development. Evaluations of both hardware and speech recognition performance are presented here. The new prototype shows comparable performance to its predecessor in terms of speech recognition accuracy (i.e. ~95% word accuracy and ~75% sequence accuracy), but significantly improved appearance, portability and hardware features in terms of miniaturization and cost.
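
The gap between the word accuracy and sequence accuracy figures quoted in the abstract above is easier to see with a small worked example. The sketch below is only an illustration under common assumptions: word accuracy is taken as one minus the word-level edit distance divided by the number of reference words, and sequence accuracy as the fraction of utterances recognised without any error. The paper's exact scoring procedure is not stated in this listing, and the function names and sample data are hypothetical.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Assumed definition: 1 - (total edit errors / total reference words)."""
    total_words = sum(len(r) for r in refs)
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 1.0 - total_errors / total_words


def sequence_accuracy(refs: List[List[str]], hyps: List[List[str]]) -> float:
    """Assumed definition: fraction of sequences recognised with no error."""
    correct = sum(1 for r, h in zip(refs, hyps) if r == h)
    return correct / len(refs)


# Hypothetical data: two four-word sequences, one with a single wrong word.
refs = [["one", "two", "three", "four"], ["five", "six", "seven", "eight"]]
hyps = [["one", "two", "three", "four"], ["five", "six", "nine", "eight"]]

print(word_accuracy(refs, hyps))      # 7 of 8 words correct
print(sequence_accuracy(refs, hyps))  # 1 of 2 sequences fully correct
```

A single wrong word makes the whole sequence count as wrong, so sequence accuracy falls faster than word accuracy as sequences get longer.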

Repository@Hull - Worktribe

Crossref

Silent Speech Interfaces for Speech Restoration: A Review

Author: González López José Andrés
Gómez Alanís Alejandro
Gómez Ángel M.
Martín Doñas Juan M.
Pérez Córdoba José Luis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/09/2020

This work was supported in part by the Agencia Estatal de Investigacion (AEI) under Grant PID2019-108040RB-C22/AEI/10.13039/501100011033. The work of Jose A. Gonzalez-Lopez was supported in part by the Spanish Ministry of Science, Innovation and Universities under Juan de la Cierva-Incorporation Fellowship (IJCI-2017-32926).
This review summarises the status of silent speech interface (SSI) research. SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable. In this review, we focus on the first case and present the latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders. SSIs can employ a variety of biosignals to enable silent communication, such as electrophysiological recordings of neural activity, electromyographic (EMG) recordings of vocal tract movements or the direct tracking of articulator movements using imaging techniques. Depending on the disorder, some sensing techniques may be better suited than others to capture speech-related information. For instance, EMG and imaging techniques are well suited for laryngectomised patients, whose vocal tract remains almost intact but who are unable to speak after the removal of the vocal folds, but fail for severely paralysed individuals. From the biosignals, SSIs decode the intended message, using automatic speech recognition or speech synthesis algorithms. Despite considerable advances in recent years, most present-day SSIs have only been validated in laboratory settings for healthy users. Thus, as discussed in this paper, a number of challenges remain to be addressed in future research before SSIs can be promoted to real-world applications. If these issues can be addressed successfully, future SSIs will improve the lives of persons with severe speech impairments by restoring their communication capabilities.

arXiv.org e-Print Archive

Repositorio Institucional Universidad de Granada

Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces

Author: Csapó Tamás Gábor
Gosztolya Gábor
Markó Alexandra
Shandiz Amin Honarmandi
Tóth László
Publication date: 01/01/2021

Articulatory-to-acoustic mapping seeks to reconstruct speech from a recording of the articulatory movements, for example, an ultrasound video. Just like speech signals, these recordings represent not only the linguistic content, but are also highly specific to the actual speaker. Hence, due to the lack of multi-speaker data sets, researchers have so far concentrated on speaker-dependent modeling. Here, we present multi-speaker experiments using the recently published TaL80 corpus. To model speaker characteristics, we adjusted the x-vector framework popular in speech processing to operate with ultrasound tongue videos. Next, we performed speaker recognition experiments using 50 speakers from the corpus. Then, we created speaker embedding vectors and evaluated them on the remaining speakers. Finally, we examined how the embedding vector influences the accuracy of our ultrasound-to-speech conversion network in a multi-speaker scenario. In the experiments we attained speaker recognition error rates below 3%, and we also found that the embedding vectors generalize nicely to unseen speakers. Our first attempt to apply them in a multi-speaker silent speech framework brought about a marginal reduction in the error rate of the spectral estimation step.
Comment: 5 pages, 3 figures, 3 tables

arXiv.org e-Print Archive

Repository of the Academy's Library

A high-performance speech neuroprosthesis

Author: Avansino Donald T
Choi Eun Young
Druckmann Shaul
Fan Chaofei
Glasser Matthew F
Henderson Jaimie M
Hochberg Leigh R
Kamdar Foram
Kunz Erin M
Shenoy Krishna V
Willett Francis R
Wilson Guy H
Publication venue: Digital Commons@Becker
Publication date: 01/08/2023

Speech brain-computer interfaces (BCIs) have the potential to restore rapid communication to people with paralysis by decoding neural activity evoked by attempted speech into text.

Digital Commons@Becker

Neural Speaker Embeddings for Ultrasound-Based Silent Speech Interfaces

Author: Csapó Tamás Gábor
Gosztolya Gábor
Honarmandi Shandiz Amin
Markó Alexandra
Tóth László
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2021

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Automatic Speech Recognition based on Electromyographic Biosignals

Author: Jou Szu-Chen Stan
Schultz Tanja
Publication venue: Springer Verlag
Publication date: 01/01/2008

Crossref

KITopen

Direct Speech Reconstruction From Articulatory Sensor Data by Machine Learning

Author: Cheah L.A.
Ell S.R.
Gilbert J.M.
Gomez A.M.
Gonzalez J.A.
Green P.D.
Holdsworth E.
Moore R.K.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/11/2017

This paper describes a technique that generates speech acoustics from articulator movements. Our motivation is to help people who can no longer speak following laryngectomy, a procedure that is carried out tens of thousands of times per year in the Western world. Our method for sensing articulator movement, permanent magnetic articulography, relies on small, unobtrusive magnets attached to the lips and tongue. Changes in magnetic field caused by magnet movements are sensed and form the input to a process that is trained to estimate speech acoustics. In the experiments reported here this “Direct Synthesis” technique is developed for normal speakers, with glued-on magnets, allowing us to train with parallel sensor and acoustic data. We describe three machine learning techniques for this task, based on Gaussian mixture models, deep neural networks, and recurrent neural networks (RNNs). We evaluate our techniques with objective acoustic distortion measures and subjective listening tests over spoken sentences read from novels (the CMU Arctic corpus). Our results show that the best performing technique is a bidirectional RNN (BiRNN), which employs both past and future contexts to predict the acoustics from the sensor data. BiRNNs are not suitable for synthesis in real time but fixed-lag RNNs give similar results and, because they only look a little way into the future, overcome this problem. Listening tests show that the speech produced by this method has a natural quality that preserves the identity of the speaker. Furthermore, we obtain up to 92% intelligibility on the challenging CMU Arctic material. To our knowledge, these are the best results obtained for a silent-speech system without a restricted vocabulary and with an unobtrusive device that delivers audio in close to real time. This work promises to lead to a technology that truly will give people whose larynx has been removed their voices back.

Repository@Hull - Worktribe

Crossref

White Rose Research Online

Improvements of Silent Speech Interface Algorithms

Author: Honarmandi Shandiz Amin

SZTE Doktori Értekezések Repozitórium (SZTE Repository of Dissertations)

Towards an Intraoral-Based Silent Speech Restoration System for Post-laryngectomy Voice Replacement

Author: B Denby
DSA Braz
H Doi
H Liu
H Park
H Tang
JL Martin
JM Gilbert
JS Brumberg
LR Rabiner
M Wand
MJ Fagan
R Hofe
S Young
T Hueber
T Toda
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/03/2017

© Springer International Publishing AG 2017. Silent Speech Interfaces (SSIs) are alternative assistive speech technologies that are capable of restoring speech communication for those individuals who have lost their voice due to laryngectomy or diseases affecting the vocal cords. However, many of these SSIs are still deemed impractical due to a high degree of intrusiveness and discomfort, which limits their transition outside the laboratory environment. We aim to address the hardware challenges faced in developing a practical SSI for post-laryngectomy speech rehabilitation. A new Permanent Magnet Articulography (PMA) system is presented which fits within the palatal cavity of the user’s mouth, giving an unobtrusive appearance and high portability. The prototype comprises a miniaturized circuit constructed using commercial off-the-shelf (COTS) components and is implemented in the form of a dental retainer, which is mounted under the roof of the user’s mouth and firmly clasps onto the upper teeth. Preliminary evaluation via speech recognition experiments demonstrates that the intraoral prototype achieves reasonable word recognition accuracy and is comparable to the external PMA version. Moreover, the intraoral design is expected to improve stability and robustness, with a much improved appearance since it can be completely hidden inside the user’s mouth.

Repository@Hull - Worktribe

Crossref