Recognition of Sign Language from High Resolution Images Using Adaptive Feature Extraction and Classification
A variety of algorithms allow gesture recognition in video sequences. Such recognition is of interest to hearing-impaired people, since it offers a great degree of self-sufficiency in communicating their intent to non-signers without the need for interpreters. Current state-of-the-art algorithms in this domain are capable of either real-time recognition of sign language in low-resolution videos or non-real-time recognition in high-resolution videos. This paper proposes a novel approach to real-time recognition of fingerspelled letters of the American Sign Language (ASL) alphabet in ultra-high-resolution (UHD) video sequences. The proposed approach is based on adaptive Laplacian of Gaussian (LoG) filtering with local extrema detection using the Features from Accelerated Segment Test (FAST) algorithm, classified by a Convolutional Neural Network (CNN). The recognition rate of the algorithm was verified on real-life data.
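The keypoint stage described above can be illustrated with a simplified sketch. This is not the paper's adaptive implementation (which pairs LoG filtering with the FAST detector); it is a minimal stand-in that computes a LoG response with scipy and keeps strong local extrema of its magnitude as candidate keypoints. The `sigma` and `threshold` parameters are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def log_keypoints(image, sigma=2.0, threshold=0.05):
    """Detect local extrema of the Laplacian-of-Gaussian response.

    Simplified stand-in for an adaptive LoG + extrema-detection stage:
    the LoG response highlights blob-like structure, and strong local
    maxima of its magnitude serve as candidate keypoints.
    """
    response = np.abs(gaussian_laplace(image.astype(float), sigma))
    # A pixel is a keypoint if it is the maximum of its 3x3
    # neighbourhood and exceeds a fraction of the global maximum.
    local_max = maximum_filter(response, size=3) == response
    strong = response > threshold * response.max()
    ys, xs = np.nonzero(local_max & strong)
    return list(zip(ys.tolist(), xs.tolist()))

# Synthetic example: a single bright blob yields keypoints near its centre.
img = np.zeros((32, 32))
img[14:18, 14:18] = 1.0
print(log_keypoints(img, sigma=2.0)[:3])
```

In a full pipeline, patches around such keypoints would then be cropped and passed to the CNN classifier.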
Automatic recognition of fingerspelled words in British Sign Language
We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other, and contains signs which are ambiguous from the observer's viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
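The abstract's emphasis on hand shape alone, without motion cues, suggests appearance-based descriptors. The sketch below is only an illustration of that idea, not the paper's actual feature set: a plain gradient-orientation histogram that summarises local edge structure of a hand patch, similar in spirit to HOG-style features.

```python
import numpy as np

def orientation_histogram(patch, bins=9):
    """Histogram of gradient orientations over an image patch.

    A minimal illustration of appearance-based shape features: like
    more sophisticated descriptors, it summarises local edge
    structure rather than motion.
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# A vertical edge concentrates all gradient energy in one orientation bin.
patch = np.tile(np.r_[np.zeros(8), np.ones(8)], (16, 1))
print(orientation_histogram(patch).round(2))
```

Descriptors like this are one reason hand-shape recognition can scale to a large lexicon without re-training: the feature extractor itself is word-independent.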
Detection of major ASL sign types in continuous signing for ASL recognition
In American Sign Language (ASL), as well as other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties. Continuous sign recognition accuracy can be improved through the use of distinct recognition strategies, as well as different training datasets, for each class of signs. For these strategies to be applied, continuous signing video needs to be segmented into parts corresponding to particular classes of signs. In this paper we present a multiple instance learning-based segmentation system that accurately labels 91.27% of the video frames of 500 continuous utterances (including 7 different subjects) from the publicly accessible NCSLGR corpus (Neidle and Vogler, 2012). The system uses novel feature descriptors derived from both motion and shape statistics of the regions of high local motion. The system does not require a hand tracker.
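The "regions of high local motion" that the descriptors above are computed over can be found, in the simplest case, by thresholded frame differencing. The toy sketch below shows only that first step; the paper's motion and shape statistics are far richer, and the `threshold` value here is an illustrative assumption.

```python
import numpy as np

def high_motion_mask(prev_frame, frame, threshold=0.1):
    """Mark pixels whose intensity changed strongly between frames.

    A toy version of locating regions of high local motion: pixels
    whose absolute intensity change exceeds a fraction of the 8-bit
    range are flagged as "moving".
    """
    diff = np.abs(frame.astype(float) - prev_frame.astype(float))
    return diff > threshold * 255

prev = np.zeros((8, 8), dtype=np.uint8)
curr = prev.copy()
curr[2:5, 2:5] = 200                       # a small moving blob
print(high_motion_mask(prev, curr).sum())  # → 9 "moving" pixels
```

Computing features over such regions, rather than over tracked hands, is consistent with the paper's claim that no hand tracker is required.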
Advanced Capsule Networks via Context Awareness
Capsule Networks (CNs) offer new architectures for the Deep Learning (DL) community. Though their effectiveness has been demonstrated on the MNIST and smallNORB datasets, the networks still face challenges on other datasets, where images have distinct contexts. In this research, we improve the design of CNs (vector version): we add more pooling layers to filter image backgrounds and more reconstruction layers to improve image restoration. Additionally, we perform experiments to compare the accuracy and speed of CNs versus DL models. Among DL models, we utilize Inception V3 and DenseNet V201 for powerful computers, and NASNet, MobileNet V1, and MobileNet V2 for small and embedded devices. We evaluate our models on a fingerspelling alphabet dataset from American Sign Language (ASL). The results show that CNs perform comparably to DL models while dramatically reducing training time. We also provide a demonstration and a link for the purpose of illustration.
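The vector capsule networks this abstract builds on use the standard "squash" nonlinearity from Sabour et al. (2017). The sketch below shows that function in numpy as background; it is generic CapsNet machinery, not code from this paper.

```python
import numpy as np

def squash(vectors, axis=-1, eps=1e-8):
    """Capsule 'squash' nonlinearity (Sabour et al., 2017).

    Shrinks each capsule vector so its length lies in [0, 1) while
    preserving its direction; the resulting length is read as the
    probability that the entity the capsule represents is present.
    """
    sq_norm = np.sum(vectors ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * vectors

v = np.array([[3.0, 4.0]])        # length 5
print(np.linalg.norm(squash(v)))  # → 25/26 ≈ 0.9615
```

Because the squashed length is bounded, a capsule's activation doubles as a confidence score, which is what lets capsule layers route information by agreement.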