Search CORE

124,865 research outputs found

Sign language recognition with transformer networks

Author: Dambre Joni
De Coster Mathieu
Van Herreweghe Mieke
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2020
Field of study

Sign languages are complex languages. Research into them is ongoing, supported by large video corpora of which only small parts are annotated. Sign language recognition can be used to speed up the annotation process of these corpora, in order to aid research into sign languages and sign language recognition. Previous research has approached sign language recognition in various ways, using feature extraction techniques or end-to-end deep learning. In this work, we apply a combination of feature extraction using OpenPose for human keypoint estimation and end-to-end feature learning with Convolutional Neural Networks. The proven multi-head attention mechanism used in transformers is applied to recognize isolated signs in the Flemish Sign Language corpus. Our proposed method significantly outperforms the previous state of the art of sign language recognition on the Flemish Sign Language corpus: we obtain an accuracy of 74.7% on a vocabulary of 100 classes. Our results will be implemented as a suggestion system for sign language corpus annotation

Ghent University Academic Bibliography

DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation

Author: Chai Xiujuan
Du Yong
Graves Alex
Kanwal Kehkashan
Praveen Nikhita
Socher Richard
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/10/2018
Field of study

There is an undeniable communication barrier between deaf people and people with normal hearing ability. Although innovations in sign language translation technology aim to tear down this communication barrier, the majority of existing sign language translation systems are either intrusive or constrained by resolution or ambient lighting conditions. Moreover, these existing systems can only perform single-sign ASL translation rather than sentence-level translation, making them much less useful in daily-life communication scenarios. In this work, we fill this critical gap by presenting DeepASL, a transformative deep learning-based sign language translation technology that enables ubiquitous and non-intrusive American Sign Language (ASL) translation at both word and sentence levels. DeepASL uses infrared light as its sensing mechanism to non-intrusively capture the ASL signs. It incorporates a novel hierarchical bidirectional deep recurrent neural network (HB-RNN) and a probabilistic framework based on Connectionist Temporal Classification (CTC) for word-level and sentence-level ASL translation respectively. To evaluate its performance, we have collected 7,306 samples from 11 participants, covering 56 commonly used ASL words and 100 ASL sentences. DeepASL achieves an average 94.5% word-level translation accuracy and an average 8.2% word error rate on translating unseen ASL sentences. Given its promising performance, we believe DeepASL represents a significant step towards breaking the communication barrier between deaf people and hearing majority, and thus has the significant potential to fundamentally change deaf people's lives

arXiv.org e-Print Archive

Crossref

Sign Language Fingerspelling Classification from Depth and Color Images using a Deep Belief Network

Author: Giguère Philippe
Rioux-Maldague Lucas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/03/2015
Field of study

Automatic sign language recognition is an open problem that has received a lot of attention recently, not only because of its usefulness to signers, but also due to the numerous applications a sign classifier can have. In this article, we present a new feature extraction technique for hand pose recognition using depth and intensity images captured from a Microsoft Kinect sensor. We applied our technique to American Sign Language fingerspelling classification using a Deep Belief Network, for which our feature extraction technique is tailored. We evaluated our results on a multi-user data set with two scenarios: one with all known users and one with an unseen user. We achieved 99% recall and precision on the first, and 77% recall and 79% precision on the second. Our method is also capable of real-time sign classification and is adaptive to any environment or lightning intensity.Comment: Published in 2014 Canadian Conference on Computer and Robot Visio

arXiv.org e-Print Archive

Crossref

Embodied & Situated Language Processing

Author: Coventry Kenny
Engelhardt Paul
Taylor Lawrence
Publication venue
Publication date: 01/08/2012
Field of study

Northumbria Research Link