Cued Speech Automatic Recognition in Normal Hearing and Deaf Subjects

Author: Aboutabit Noureddine
Beautemps Denis
Heracleous Panikos
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

This article discusses the automatic recognition of Cued Speech in French based on hidden Markov models (HMMs).

Hal - Université Grenoble Alpes

HAL Descartes

Cued Speech: A visual communication mode for the Deaf society

Author: Beautemps Denis
Heracleous Panikos
Publication venue: 'National Institute of Informatics (NII)'
Publication date: 01/01/2010

Cued Speech is a visual mode of communication that uses handshapes and placements in combination with the mouth movements of speech to make the phonemes of a spoken language look different from each other and clearly understandable to deaf individuals. The aim of Cued Speech is to overcome the problems of lip reading and thus enable deaf persons to wholly understand spoken language. In this study, automatic phoneme recognition in Cued Speech for French based on hidden Markov models (HMMs) is introduced. The phoneme correct rate was 82.9% for a normal-hearing cuer and 81.5% for a deaf cuer. The results also showed that creating cuer-independent HMMs should not face any specific difficulties other than those that occur in audio speech recognition.

Hal - Université Grenoble Alpes

HAL Descartes

Hal-Diderot

Towards Augmentative Speech Communication

Author: Denis Beautemps
Hiroshi Ishiguro
Norihiro Hagita
Panikos Heracleous
Publication venue: 'IntechOpen'
Publication date: 21/06/2011

IntechOpen

A New Re-synchronization Method based Multi-modal Fusion for Automatic Continuous Cued Speech Recognition

Author: Beautemps Denis
Feng Gang
Liu Li
Zhang Xiao-Ping
Publication venue: HAL CCSD
Publication date: 09/01/2020

Cued Speech (CS) is an augmented form of lip reading complemented by hand coding, and it is very helpful to deaf people. Automatic CS recognition can aid communication between deaf people and others. Because lip and hand movements are asynchronous, fusing them in automatic CS recognition is a challenging problem. In this work, we propose a novel re-synchronization procedure for multi-modal fusion, which aligns the hand features with the lip features. It is realized by delaying hand position and hand shape by their optimal hand preceding times, which are derived by investigating the temporal organization of hand position and hand shape movements in CS. This re-synchronization procedure is incorporated into a practical continuous CS recognition system that combines a convolutional neural network (CNN) with a multi-stream hidden Markov model (MSHMM). A significant improvement of about 4.6 percentage points has been achieved, yielding 76.6% CS phoneme recognition correctness compared with the state-of-the-art architecture (72.04%), which did not take into account the asynchrony issue of multi-modal fusion in CS. To our knowledge, this is the first work to tackle asynchronous multi-modal fusion in automatic continuous CS recognition.
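
The re-synchronization idea in the abstract above (delay each hand stream by its preceding time so that it lines up with the lip stream) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `delay_stream` and `resynchronize`, the frame-level delays, and the choice to pad the start of a delayed stream by repeating its first frame are hypothetical and are not taken from the paper, which derives the optimal delays empirically and fuses the streams in an MSHMM.

```python
def delay_stream(frames, delay):
    """Shift a per-frame feature stream later in time by `delay` frames.

    The first frame is repeated to fill the gap (an assumption of this
    sketch), and the result is truncated to the original length.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    frames = list(frames)
    if not frames or delay == 0:
        return frames
    return ([frames[0]] * delay + frames)[:len(frames)]


def resynchronize(lips, hand_position, hand_shape, pos_delay, shape_delay):
    """Align the two hand streams with the lip stream, frame by frame.

    `pos_delay` and `shape_delay` are hypothetical delays in frames; each
    hand stream gets its own delay because the abstract treats hand
    position and hand shape separately.
    """
    n = len(lips)
    if len(hand_position) != n or len(hand_shape) != n:
        raise ValueError("all streams must have the same number of frames")
    position = delay_stream(hand_position, pos_delay)
    shape = delay_stream(hand_shape, shape_delay)
    # Each output frame groups the three streams so that a multi-stream
    # model could consume them side by side.
    return list(zip(lips, position, shape))


# Toy streams with one feature per frame (values are made up).
lips = [[0.0], [0.1], [0.2], [0.3]]
hand_position = [[1], [2], [3], [4]]
hand_shape = [[10], [20], [30], [40]]
aligned = resynchronize(lips, hand_position, hand_shape, pos_delay=2, shape_delay=1)
```

In this toy run the hand position seen at lip frame 3 is the one originally observed at frame 1, which is the sense in which the hand "precedes" the lips.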

Multimodal Based Audio-Visual Speech Recognition for Hard-of-Hearing: State of the Art Techniques and Challenges

Author: Bhaskar Shabina
M Thasleema T
Publication venue: IAES Indonesia Section
Publication date: 31/05/2022

Multimodal Integration (MI) is the study of merging the knowledge acquired by the nervous system using sensory modalities such as speech, vision, touch, and gesture. The applications of MI span Audio-Visual Speech Recognition (AVSR), Sign Language Recognition (SLR), Emotion Recognition (ER), Biometrics Applications (BMA), Affect Recognition (AR), Multimedia Retrieval (MR), etc. Fusions of modalities such as hand gesture and face, or lip and hand position, are the ones mainly used in developing multimodal systems for the hearing impaired. This paper presents an overview of the multimodal systems available in the literature on hearing-impaired studies. It also discusses some studies related to hearing-impaired acoustic analysis. Far fewer algorithms have been developed for hearing-impaired AVSR than for normal hearing. Thus, the study of audio-visual speech recognition systems for the hearing impaired is in high demand for people who are trying to communicate in natively spoken languages. This paper also highlights the state-of-the-art techniques in AVSR and the challenges researchers face in developing AVSR systems.

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)

Augmented Reality Talking Heads as a Support for Speech Perception and Production

Author: Olov Engwall
Publication venue: 'IntechOpen'
Publication date: 09/12/2011

IntechOpen

Shared-hidden-layer Deep Neural Network for Under-resourced Language

Author: Hoesen Devin
Lestari Dessi Puji
Widyantoro Dwi Hendratmo
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/06/2018

Training a speech recognizer with under-resourced language data still proves difficult. Indonesian is considered under-resourced because of the lack of a standard speech corpus, text corpus, and dictionary. In this research, the efficacy of augmenting limited Indonesian speech training data with training data from a highly resourced language, such as English, to train an Indonesian speech recognizer was analyzed. The training was performed in the form of shared-hidden-layer deep-neural-network (SHL-DNN) training. An SHL-DNN has language-independent hidden layers and can be pre-trained and trained using multilingual training data in the same way as a monolingual deep neural network. The SHL-DNN using Indonesian and English speech training data proved effective for decreasing word error rate (WER) in decoding Indonesian dictated speech, achieving a 3.82% absolute decrease compared to a monolingual Indonesian hidden Markov model using Gaussian mixture model emission (GMM-HMM). This was confirmed when the SHL-DNN was also employed to decode Indonesian spontaneous speech, achieving a 4.19% absolute WER decrease.
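
For readers unfamiliar with the metric, the sketch below shows how a word error rate and an absolute decrease between two systems are commonly computed. It uses the standard word-level edit distance definition of WER; the function names and the toy transcripts are assumptions for illustration and are not drawn from the paper or its data.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def absolute_decrease(baseline_wer, new_wer):
    """Difference in WER between two systems, as opposed to a relative
    decrease, which would divide this difference by the baseline."""
    return baseline_wer - new_wer


# Toy transcripts (made up): one substitution and one deletion out of four words.
baseline = word_error_rate("a b c d", "a x c")
improved = word_error_rate("a b c d", "a b c")
gain = absolute_decrease(baseline, improved)
```

With these toy strings the baseline WER is 0.5, the improved WER is 0.25, and the absolute decrease is 0.25, i.e. 25 percentage points; the 3.82% and 4.19% figures in the abstract are differences of this kind.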

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Cued Speech Gesture Recognition: A First Prototype Based on Early Reduction

Author: Burger Thomas
Caplier Alice
Perret Pascal
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2007

Cued Speech is a specific linguistic code for hearing-impaired people. It is based on both lip reading and manual gestures. In the context of THIMP (Telephony for the Hearing-IMpaired Project), we work on automatic cued speech translation. In this paper, we only address the problem of automatic cued speech manual gesture recognition. Such a gesture recognition issue is really common from a theoretical point of view, but we approach it with respect to its particularities in order to derive an original method. This method is essentially built around a bioinspired method called early reduction. Prior to a complete analysis of each image of a sequence, the early reduction process automatically extracts a restricted number of key images which summarize the whole sequence. Only the key images are studied from a temporal point of view, with lighter computation than the complete sequence.

Crossref

Hal - Université Grenoble Alpes

Springer - Publisher Connector

Directory of Open Access Journals

HAL-Université de Bretagne Occidentale