19 research outputs found

Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

Author: Wand Michael
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2014

Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, where electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system which was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness.

KITopen

Directory of Open Access Books (DOAB)

Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

Author: Wand Michael
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019

Directory of Open Access Books (DOAB)

Frame-Based Phone Classification Using EMG Signals

Author: De Zuazo Oteiza Xabier
Del Blanco Sierra Eder
Hernáez Rioja Inmaculada
Navas Cordón Eva
Salomons Inge
Publication venue: MDPI
Publication date: 13/07/2023

This paper evaluates the impact of inter-speaker and inter-session variability on the development of a silent speech interface (SSI) based on electromyographic (EMG) signals from the facial muscles. The final goal of the SSI is to provide a communication tool for Spanish-speaking laryngectomees by generating audible speech from voiceless articulation. However, before moving on to such a complex task, a simpler phone classification task in different modalities regarding speaker and session dependency is performed for this study. These experiments consist of processing the recorded utterances into phone-labeled segments and predicting the phonetic labels using only features obtained from the EMG signals. We evaluate and compare the performance of each model considering the classification accuracy. Results show that the models are able to predict the phonetic label best when they are trained and tested using data from the same session. The accuracy drops drastically when the model is tested with data from a different session, although it improves when more data are added to the training data. Similarly, when the same model is tested on a session from a different speaker, the accuracy decreases. This suggests that using larger amounts of data could help to reduce the impact of inter-session variability, but more research is required to understand if this approach would suffice to account for inter-speaker variability as well.
This research was funded by Agencia Estatal de Investigación grant number ref.PID2019-108040RB-C21/AEI/10.13039/50110001103

Archivo Digital para la Docencia y la Investigación

Deep Learning for Processing Electromyographic Signals: a Taxonomy-based Survey

Author: Antonio Brunetti
Domenico Buongiorno
Giacomo Donato Cascarano
Giovanni Dimauro
Irio De Feudis
Leonarda Carnimeo
Vitoantonio Bevilacqua
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Deep Learning (DL) has recently been employed to build smart systems that perform incredibly well in a wide range of tasks, such as image recognition, machine translation, and self-driving cars. In several fields, the considerable improvement in computing hardware and the increasing need for big data analytics have boosted DL work. In recent years, physiological signal processing has strongly benefited from deep learning. In general, there is an exponential increase in the number of studies concerning the processing of electromyographic (EMG) signals using DL methods. This phenomenon is mostly explained by the current limitations of myoelectric controlled prostheses as well as the recent release of large EMG recording datasets, e.g. Ninapro. Such a growing trend has inspired us to seek and review recent papers focusing on processing EMG signals using DL methods. Referring to the Scopus database, a systematic literature search of papers published between January 2014 and March 2019 was carried out, and sixty-five papers were chosen for review after a full text analysis. The bibliometric research revealed that the reviewed papers can be grouped into four main categories according to the final application of the EMG signal analysis: Hand Gesture Classification, Speech and Emotion Classification, Sleep Stage Classification and Other Applications. The review process also confirmed the increasing trend in published papers: the number of papers published in 2018 is four times the number published the year before. As expected, most of the analyzed papers (≈60 %) concern the identification of hand gestures, thus supporting our hypothesis. Finally, it is worth reporting that the convolutional neural network (CNN) is the most used topology among the DL architectures involved: approximately sixty percent of the reviewed articles consider a CNN.

Archivio istituzionale della ricerca - Università di Bari

Brain-Computer Interface and Silent Speech Recognition on Decentralized Messaging Applications

Author: Lourenço Fábio André Vilar
Publication date: 01/01/2020

Online communication has become increasingly prevalent in people’s daily lives, with its widespread adoption catalyzed by technological advances, especially in instant messaging platforms. Although there have been strides toward the inclusion of disabled individuals to ease communication between peers, people with hand/arm impairments have little to no support in mainstream applications for communicating efficiently with others. Moreover, current solutions that fall back on speech-to-text techniques lack privacy when used in public. Additionally, as centralized systems have come under scrutiny regarding privacy and security, the development of alternative decentralized solutions has increased through the use of blockchain technology and its variants. Within the inclusivity paradigm, this project showcases an alternative form of human-computer interaction with support for the aforementioned disabled people, through the use of a brain-computer interface allied to a silent speech recognition system, for application navigation and text input purposes, respectively. A brain-computer interface allows a user to interact with the platform just by thought, while the silent speech recognition system enables text input by reading activity from the articulatory muscles without the need to speak audibly. The combination of both techniques therefore creates a fully hands-free interaction with the platform, empowering hand/arm disabled users in daily life communications.
Furthermore, the users of the application will be inserted in a decentralized system that is designed for secure communication and exchange of data between peers, reinforcing the privacy concern that is a cornerstone of the platform.

Repositório Científico do Instituto Politécnico do Porto

Wavelet-based Processing of Electroencephalographic and Electromyographic Signals for Speech Recognition (Studienarbeit)

Author: Wand Michael
Publication date: 04/08/2008

KITopen

Deep learning for healthcare applications based on physiological signals: A review

Author: Oh Shu Lih
Oliver Faust
Tan Jen Hong
U Rajendra Acharya
Yuki Hagiwara
Publication venue: 'Elsevier BV'
Publication date: 01/07/2018

Background and objective: We have cast the net into the ocean of knowledge to retrieve the latest scientific research on deep learning methods for physiological signals. We found 53 research papers on this topic, published from 01.01.2008 to 31.12.2017. Methods: An initial bibliometric analysis shows that the reviewed papers focused on Electromyogram (EMG), Electroencephalogram (EEG), Electrocardiogram (ECG), and Electrooculogram (EOG). These four categories were used to structure the subsequent content review. Results: During the content review, we understood that deep learning performs better for big and varied datasets than classic analysis and machine classification methods. Deep learning algorithms try to develop the model by using all the available input. Conclusions: This review paper depicts the application of various deep learning algorithms used until recently, but in the future it will be used in more healthcare areas to improve the quality of diagnosis.

Crossref

Sheffield Hallam University Research Archive

Silent Speech Interfaces for Speech Restoration: A Review

Author: González López José Andrés
Gómez Alanís Alejandro
Gómez Ángel M.
Martín Doñas Juan M.
Pérez Córdoba José Luis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/09/2020

This work was supported in part by the Agencia Estatal de Investigacion (AEI) under Grant PID2019-108040RB-C22/AEI/10.13039/501100011033. The work of Jose A. Gonzalez-Lopez was supported in part by the Spanish Ministry of Science, Innovation and Universities under Juan de la Cierva-Incorporation Fellowship (IJCI-2017-32926).
This review summarises the status of silent speech interface (SSI) research. SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable. In this review, we focus on the first case and present the latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders. SSIs can employ a variety of biosignals to enable silent communication, such as electrophysiological recordings of neural activity, electromyographic (EMG) recordings of vocal tract movements or the direct tracking of articulator movements using imaging techniques. Depending on the disorder, some sensing techniques may be better suited than others to capture speech-related information. For instance, EMG and imaging techniques are well suited for laryngectomised patients, whose vocal tract remains almost intact but who are unable to speak after the removal of the vocal folds, but fail for severely paralysed individuals. From the biosignals, SSIs decode the intended message, using automatic speech recognition or speech synthesis algorithms. Despite considerable advances in recent years, most present-day SSIs have only been validated in laboratory settings for healthy users. Thus, as discussed in this paper, a number of challenges remain to be addressed in future research before SSIs can be promoted to real-world applications.
If these issues can be addressed successfully, future SSIs will improve the lives of persons with severe speech impairments by restoring their communication capabilities.

arXiv.org e-Print Archive

Repositorio Institucional Universidad de Granada

Speech Recognition using Surface Electromyography

Author: Maier-Hein Lena
Publication date: 04/08/2008

KITopen

Interfaces de fala silenciosa multimodais para português europeu com base na articulação

Author: Freitas João Dinis Colaço de
Publication venue: Universidade de Aveiro
Publication date: 01/01/2015

MAPi joint doctoral programme in Informatics.
The concept of silent speech, when applied to Human-Computer Interaction (HCI), describes a system which allows for speech communication in the absence of an acoustic signal. By analyzing data gathered during different parts of the human speech production process, Silent Speech Interfaces (SSI) allow users with speech impairments to communicate with a system. SSI can also be used in the presence of environmental noise, and in situations in which privacy, confidentiality, or non-disturbance are important. Nonetheless, despite recent advances, the performance and usability of Silent Speech systems still have much room for improvement. Better performance of such systems would enable their application in relevant areas, such as Ambient Assisted Living. Therefore, it is necessary to extend our understanding of the capabilities and limitations of silent speech modalities and to enhance their joint exploration. Thus, in this thesis, we have established several goals: (1) SSI language expansion to support European Portuguese (EP); (2) overcome identified limitations of current SSI techniques to detect EP nasality; (3) develop a Multimodal HCI approach for SSI based on non-invasive modalities; and (4) explore more direct measures in the Multimodal SSI for EP acquired from more invasive/obtrusive modalities, to be used as ground truth in articulation processes, enhancing our comprehension of other modalities. In order to achieve these goals and to support our research in this area, we have created a multimodal SSI framework that fosters leveraging modalities and combining information, supporting research in multimodal SSI. The proposed framework goes beyond the data acquisition process itself, including methods for online and offline synchronization, multimodal data processing, feature extraction, feature selection, analysis, classification and prototyping. Examples of applicability are provided for each stage of the framework.
These include articulatory studies for HCI, the development of a multimodal SSI based on less invasive modalities and the use of ground truth information coming from more invasive/obtrusive modalities to overcome the limitations of other modalities. In the work presented here, we also apply existing methods in the area of SSI to EP for the first time, noting that nasal sounds may cause inferior performance in some modalities. In this context, we propose a non-invasive solution for the detection of nasality based on a single Surface Electromyography sensor, which could be included in a multimodal SSI.

Repositório Institucional da Universidade de Aveiro