Recognition of Sign Language from High Resolution Images Using Adaptive Feature Extraction and Classification
A variety of algorithms allow gesture recognition in video sequences. Such recognition is of interest to hearing-impaired people, since it offers a great degree of self-sufficiency in communicating their intent to non-signers without the need for interpreters. Current state-of-the-art algorithms in this domain are capable of either real-time recognition of sign language in low-resolution videos or non-real-time recognition in high-resolution videos. This paper proposes a novel approach to real-time recognition of fingerspelled letters of the American Sign Language (ASL) alphabet in ultra-high-resolution (UHD) video sequences. The proposed approach is based on adaptive Laplacian of Gaussian (LoG) filtering with local extrema detection using the Features from Accelerated Segment Test (FAST) algorithm, classified by a Convolutional Neural Network (CNN). The recognition rate of the algorithm was verified on real-life data.
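The keypoint stage described above can be illustrated with a simplified sketch. This is not the paper's adaptive implementation (which pairs LoG filtering with the FAST detector); it is a minimal stand-in that computes a LoG response with scipy and keeps strong local extrema of its magnitude as candidate keypoints. The `sigma` and `threshold` parameters are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def log_keypoints(image, sigma=2.0, threshold=0.05):
    """Detect local extrema of the Laplacian-of-Gaussian response.

    Simplified stand-in for an adaptive LoG + extrema-detection stage:
    the LoG response highlights blob-like structure, and strong local
    maxima of its magnitude serve as candidate keypoints.
    """
    response = np.abs(gaussian_laplace(image.astype(float), sigma))
    # A pixel is a keypoint if it is the maximum of its 3x3
    # neighbourhood and exceeds a fraction of the global maximum.
    local_max = maximum_filter(response, size=3) == response
    strong = response > threshold * response.max()
    ys, xs = np.nonzero(local_max & strong)
    return list(zip(ys.tolist(), xs.tolist()))

# Synthetic example: a single bright blob yields keypoints near its centre.
img = np.zeros((32, 32))
img[14:18, 14:18] = 1.0
print(log_keypoints(img, sigma=2.0)[:3])
```

In a full pipeline, patches around such keypoints would then be cropped and passed to the CNN classifier.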
Automatic recognition of fingerspelled words in British Sign Language
We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other, and contains signs which are ambiguous from the observer's viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
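The abstract's emphasis on hand shape alone, without motion cues, suggests appearance-based descriptors. The sketch below is only an illustration of that idea, not the paper's actual feature set: a plain gradient-orientation histogram that summarises local edge structure of a hand patch, similar in spirit to HOG-style features.

```python
import numpy as np

def orientation_histogram(patch, bins=9):
    """Histogram of gradient orientations over an image patch.

    A minimal illustration of appearance-based shape features: like
    more sophisticated descriptors, it summarises local edge
    structure rather than motion.
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# A vertical edge concentrates all gradient energy in one orientation bin.
patch = np.tile(np.r_[np.zeros(8), np.ones(8)], (16, 1))
print(orientation_histogram(patch).round(2))
```

Descriptors like this are one reason hand-shape recognition can scale to a large lexicon without re-training: the feature extractor itself is word-independent.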
Detection of major ASL sign types in continuous signing for ASL recognition
In American Sign Language (ASL), as well as other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties. Continuous sign recognition accuracy can be improved through the use of distinct recognition strategies, as well as different training datasets, for each class of signs. For these strategies to be applied, continuous signing video needs to be segmented into parts corresponding to particular classes of signs. In this paper we present a multiple instance learning-based segmentation system that accurately labels 91.27% of the video frames of 500 continuous utterances (including 7 different subjects) from the publicly accessible NCSLGR corpus (Neidle and Vogler, 2012). The system uses novel feature descriptors derived from both motion and shape statistics of the regions of high local motion. The system does not require a hand tracker.
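The "regions of high local motion" that the descriptors above are computed over can be found, in the simplest case, by thresholded frame differencing. The toy sketch below shows only that first step; the paper's motion and shape statistics are far richer, and the `threshold` value here is an illustrative assumption.

```python
import numpy as np

def high_motion_mask(prev_frame, frame, threshold=0.1):
    """Mark pixels whose intensity changed strongly between frames.

    A toy version of locating regions of high local motion: pixels
    whose absolute intensity change exceeds a fraction of the 8-bit
    range are flagged as "moving".
    """
    diff = np.abs(frame.astype(float) - prev_frame.astype(float))
    return diff > threshold * 255

prev = np.zeros((8, 8), dtype=np.uint8)
curr = prev.copy()
curr[2:5, 2:5] = 200                       # a small moving blob
print(high_motion_mask(prev, curr).sum())  # → 9 "moving" pixels
```

Computing features over such regions, rather than over tracked hands, is consistent with the paper's claim that no hand tracker is required.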
Advanced Capsule Networks via Context Awareness
Capsule Networks (CNs) offer new architectures for the Deep Learning (DL) community. Though their effectiveness has been demonstrated on the MNIST and smallNORB datasets, the networks still face challenges on other datasets, where images have distinct contexts. In this research, we improve the design of CNs (vector version): we add more pooling layers to filter image backgrounds and more reconstruction layers to improve image restoration. Additionally, we perform experiments to compare the accuracy and speed of CNs versus DL models. Among DL models, we utilize Inception V3 and DenseNet V201 for powerful computers, and NASNet, MobileNet V1, and MobileNet V2 for small and embedded devices. We evaluate our models on a fingerspelling alphabet dataset from American Sign Language (ASL). The results show that CNs perform comparably to DL models while dramatically reducing training time. We also provide a demonstration and a link for the purpose of illustration.
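The vector capsule networks this abstract builds on use the standard "squash" nonlinearity from Sabour et al. (2017). The sketch below shows that function in numpy as background; it is generic CapsNet machinery, not code from this paper.

```python
import numpy as np

def squash(vectors, axis=-1, eps=1e-8):
    """Capsule 'squash' nonlinearity (Sabour et al., 2017).

    Shrinks each capsule vector so its length lies in [0, 1) while
    preserving its direction; the resulting length is read as the
    probability that the entity the capsule represents is present.
    """
    sq_norm = np.sum(vectors ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * vectors

v = np.array([[3.0, 4.0]])        # length 5
print(np.linalg.norm(squash(v)))  # → 25/26 ≈ 0.9615
```

Because the squashed length is bounded, a capsule's activation doubles as a confidence score, which is what lets capsule layers route information by agreement.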