Sign Language Fingerspelling Classification from Depth and Color Images using a Deep Belief Network
Automatic sign language recognition is an open problem that has received a
lot of attention recently, not only because of its usefulness to signers, but
also due to the numerous applications a sign classifier can have. In this
article, we present a new feature extraction technique for hand pose
recognition using depth and intensity images captured from a Microsoft Kinect
sensor. We applied our technique to American Sign Language fingerspelling
classification using a Deep Belief Network, for which our feature extraction
technique is tailored. We evaluated our results on a multi-user data set with
two scenarios: one with all known users and one with an unseen user. We
achieved 99% recall and precision on the first, and 77% recall and 79%
precision on the second. Our method is also capable of real-time sign
classification and is adaptive to any environment or lighting intensity.
Comment: Published in the 2014 Canadian Conference on Computer and Robot Vision
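A minimal sketch of the kind of pipeline this abstract describes may help: flattened depth and intensity hand crops are concatenated into one feature vector and classified with a stack of RBMs, the greedy layer-wise building blocks of a Deep Belief Network. The layer sizes, the 24-class label space (the static ASL letters, excluding the motion-based J and Z), and the use of scikit-learn's BernoulliRBM are illustrative assumptions, not the authors' implementation.

    # Hypothetical DBN-style classifier: two unsupervised RBM feature
    # layers followed by a softmax top layer, approximated with sklearn.
    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    def make_dbn_classifier(n_hidden1=500, n_hidden2=500):
        return Pipeline([
            ("rbm1", BernoulliRBM(n_components=n_hidden1, learning_rate=0.05, n_iter=20)),
            ("rbm2", BernoulliRBM(n_components=n_hidden2, learning_rate=0.05, n_iter=20)),
            ("clf", LogisticRegression(max_iter=1000)),
        ])

    # X: one row per sample, depth and intensity crops flattened,
    # concatenated, and scaled to [0, 1]; y: fingerspelled letter labels.
    rng = np.random.default_rng(0)
    X = rng.random((200, 2 * 32 * 32))   # placeholder for real Kinect crops
    y = rng.integers(0, 24, size=200)    # 24 static ASL letters (no J, Z)
    model = make_dbn_classifier().fit(X, y)
    print(model.predict(X[:5]))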
Detection of major ASL sign types in continuous signing for ASL recognition
In American Sign Language (ASL), as in other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties. Continuous sign recognition accuracy can be improved through the use of distinct recognition strategies, as well as different training datasets, for each class of signs. For these strategies to be applied, continuous signing video needs to be segmented into parts corresponding to particular classes of signs. In this paper, we present a multiple-instance-learning-based segmentation system that accurately labels 91.27% of the video frames of 500 continuous utterances (including 7 different subjects) from the publicly accessible NCSLGR corpus (Neidle and Vogler, 2012). The system uses novel feature descriptors derived from both motion and shape statistics of the regions of high local motion, and it does not require a hand tracker.
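The paper's motion/shape descriptors are not reproduced in the abstract; the sketch below only illustrates the general idea of building per-frame statistics from regions of high local motion. The frame-differencing step, the threshold, and the five statistics are all assumptions.

    # Illustrative per-frame descriptor from high-local-motion regions.
    import numpy as np

    def motion_descriptor(prev_frame, frame, thresh=25):
        """Crude motion/shape statistics for one grayscale frame pair."""
        diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
        mask = diff > thresh                  # pixels with high local motion
        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            return np.zeros(5)
        return np.array([
            xs.size / mask.size,              # fraction of moving pixels
            xs.mean() / mask.shape[1],        # normalized centroid x
            ys.mean() / mask.shape[0],        # normalized centroid y
            xs.std() / mask.shape[1],         # horizontal spread of motion
            ys.std() / mask.shape[0],         # vertical spread of motion
        ])

    rng = np.random.default_rng(1)
    f0, f1 = rng.integers(0, 256, (2, 120, 160), dtype=np.uint8)
    print(motion_descriptor(f0, f1))

A sequence of such descriptors would then feed a sequence labeler that assigns each frame a sign-class label.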
Multi-Modal Deep Hand Sign Language Recognition in Still Images Using Restricted Boltzmann Machine
In this paper, a deep learning approach, the Restricted Boltzmann Machine (RBM), is used to perform automatic hand sign language recognition from visual data. We evaluate how the RBM, as a deep generative model, is capable of generating the distribution of the input data for enhanced recognition of unseen data. Two modalities, RGB and depth, are considered in the model input in three forms: original image, cropped image, and noisy cropped image. Five crops of the input image are used, and the hands in these cropped images are detected using a Convolutional Neural Network (CNN). After that, three types of detected hand image are generated for each modality and input to the RBMs. The outputs of the RBMs for the two modalities are fused in another RBM in order to recognize the sign label of the input image. The proposed multi-modal model is trained on all or part of the American alphabet and digits of four publicly available datasets. We also evaluate the robustness of the proposed model against noise. Experimental results show that the proposed multi-modal model, using crops and the RBM fusion methodology, achieves state-of-the-art results on the Massey University Gesture Dataset 2012, the American Sign Language (ASL) Fingerspelling Dataset from the University of Surrey's Centre for Vision, Speech and Signal Processing, the NYU dataset, and the ASL Fingerspelling A dataset.
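A rough sketch of the two-stream fusion idea described above: one RBM per modality, hidden activations concatenated and passed through a fusion RBM, with a classifier on top. The layer sizes, placeholder data, and scikit-learn components are assumptions, not the paper's architecture details.

    # Hypothetical two-stream RBM fusion: per-modality RBMs -> fusion RBM.
    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_rgb = rng.random((300, 32 * 32))    # placeholder flattened RGB hand crops
    X_depth = rng.random((300, 32 * 32))  # placeholder flattened depth crops
    y = rng.integers(0, 10, size=300)     # placeholder digit labels

    rbm_rgb = BernoulliRBM(n_components=256, n_iter=15).fit(X_rgb)
    rbm_depth = BernoulliRBM(n_components=256, n_iter=15).fit(X_depth)

    # Fuse the two modalities' hidden representations in a third RBM.
    h = np.hstack([rbm_rgb.transform(X_rgb), rbm_depth.transform(X_depth)])
    rbm_fuse = BernoulliRBM(n_components=256, n_iter=15).fit(h)

    clf = LogisticRegression(max_iter=1000).fit(rbm_fuse.transform(h), y)
    print(clf.score(rbm_fuse.transform(h), y))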
Active Learning for Multilingual Fingerspelling Corpora
We apply active learning to help with data scarcity problems in sign
languages. In particular, we perform a novel analysis of the effect of
pre-training. Since many sign languages are linguistic descendants of French
Sign Language, they share hand configurations, which pre-training can hopefully
exploit. We test this hypothesis on American, Chinese, German, and Irish
fingerspelling corpora. We do observe a benefit from pre-training, but this may
be due to visual rather than linguistic similarities.
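The abstract does not spell out the query strategy; a standard uncertainty-sampling loop of the kind typically used in such experiments is sketched below. The logistic-regression classifier, the synthetic pool, and the batch size of 20 are stand-ins; in the paper's setting, the model would instead be initialized from one pre-trained on another fingerspelling corpus.

    # Illustrative uncertainty-sampling active learning loop.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_pool = rng.random((1000, 64))             # unlabeled pool of hand features
    y_pool = (X_pool[:, 0] > 0.5).astype(int)   # simulated oracle labels

    labeled = list(rng.choice(len(X_pool), 20, replace=False))
    for round_ in range(5):
        clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
        margin = np.abs(np.diff(clf.predict_proba(X_pool), axis=1)).ravel()
        margin[labeled] = np.inf                  # skip already-labeled items
        labeled += list(np.argsort(margin)[:20])  # query the most uncertain
        print(f"round {round_}: {len(labeled)} labels, "
              f"pool acc {clf.score(X_pool, y_pool):.3f}")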
An AI-Based Framework for Translating American Sign Language to English and Vice Versa
In this paper, we propose a framework to convert American Sign Language (ASL) to English and English to ASL. Within this framework, we use a deep learning model along with rolling-average prediction, which captures image frames from videos and classifies the signs in those frames. The classified frames are then used to construct ASL words and sentences to support people with hearing impairments. We also use the same deep learning model to capture signs from deaf people and convert them into ASL words and English sentences. Based on this framework, we developed a web-based tool for use in real-life applications, which we present as a proof of concept. In our evaluation, we found that the deep learning model converts the image signs into ASL words and sentences with high accuracy. The tool was also found to be very useful for people who are deaf or hard of hearing. The main contribution of this work is the design of a system to convert ASL to English and vice versa.
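The rolling-average step lends itself to a short sketch: per-frame class probabilities are averaged over a sliding window, and a sign label is emitted only when the averaged distribution is confident, which suppresses single-frame misclassifications. The class name, window size, and threshold below are hypothetical, not taken from the paper.

    # Hypothetical rolling-average wrapper around any per-frame classifier.
    from collections import deque
    import numpy as np

    class RollingAveragePredictor:
        def __init__(self, model, window=10, threshold=0.6):
            self.model = model                  # must expose predict_proba
            self.window = deque(maxlen=window)  # recent per-frame probabilities
            self.threshold = threshold          # confidence needed to emit

        def update(self, features):
            """Add one frame's features; return a label only when confident."""
            probs = self.model.predict_proba(features[np.newaxis])[0]
            self.window.append(probs)
            avg = np.mean(self.window, axis=0)
            return int(np.argmax(avg)) if avg.max() >= self.threshold else None

Emitted labels could then be buffered into letters and words for the sentence-construction step the abstract describes.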