77 research outputs found

    Automatic recognition of fingerspelled words in British Sign Language

    We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves the two hands occluding each other and contains signs that are ambiguous from the observer’s viewpoint. The main contributions of our work are: (i) recognition based on hand shape alone, without requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large-lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
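    To make the hand-shape-only idea concrete, below is a minimal sketch of frame-level hand-shape classification. HOG descriptors and a nearest-neighbour classifier are illustrative assumptions; the abstract does not specify the paper's actual features, classifier, or lexicon-free decoding step.

```python
# Minimal sketch: classify a hand shape from a pre-cropped grayscale hand image.
# HOG + k-NN are assumptions for illustration, not the paper's method.
import numpy as np
from skimage.feature import hog
from sklearn.neighbors import KNeighborsClassifier

def hand_shape_descriptor(gray_crop: np.ndarray) -> np.ndarray:
    """HOG descriptor of a grayscale hand crop (e.g. 64x64 pixels)."""
    return hog(gray_crop, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Train on labelled letter crops, then classify each frame of a new video.
rng = np.random.default_rng(0)
train_crops = rng.random((50, 64, 64))          # placeholder hand crops
train_letters = rng.integers(0, 26, size=50)    # placeholder letter labels

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit([hand_shape_descriptor(c) for c in train_crops], train_letters)

test_crop = rng.random((64, 64))
print("predicted letter id:", clf.predict([hand_shape_descriptor(test_crop)])[0])
```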

    Detection of major ASL sign types in continuous signing for ASL recognition

    In American Sign Language (ASL), as in other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties. Continuous sign recognition accuracy can be improved through the use of distinct recognition strategies, as well as different training datasets, for each class of signs. For these strategies to be applied, continuous signing video needs to be segmented into parts corresponding to particular classes of signs. In this paper we present a multiple-instance-learning-based segmentation system that accurately labels 91.27% of the video frames of 500 continuous utterances (including 7 different subjects) from the publicly accessible NCSLGR corpus (Neidle and Vogler, 2012). The system uses novel feature descriptors derived from both motion and shape statistics of the regions of high local motion, and it does not require a hand tracker.
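    As an illustration of features derived from regions of high local motion without a hand tracker, here is a minimal sketch using dense optical flow. The specific flow algorithm, threshold, and statistics are assumptions for illustration, not the paper's actual descriptors or its multiple-instance-learning segmentation stage.

```python
# Minimal sketch: per-frame motion/shape statistics of high-motion regions.
import cv2
import numpy as np

def motion_shape_stats(prev_gray: np.ndarray, cur_gray: np.ndarray) -> np.ndarray:
    # Dense Farneback optical flow between consecutive grayscale frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)
    mask = mag > np.percentile(mag, 95)          # regions of high local motion
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return np.zeros(6)
    return np.array([
        mag[mask].mean(), mag[mask].std(),       # motion statistics
        xs.mean(), ys.mean(),                    # region centroid (shape)
        xs.std(), ys.std(),                      # region spread (shape)
    ])

# Example on two synthetic frames (a small horizontal shift).
a = np.random.randint(0, 255, (120, 160), np.uint8)
b = np.roll(a, 2, axis=1)
print(motion_shape_stats(a, b))
```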

    Phonological and orthographic processing in deaf readers during recognition of written and fingerspelled words in Spanish and English

    The role of phonological and orthographic access during word recognition, as well as its developmental trajectory in deaf readers, is still a matter of debate. This thesis examined how phonological and orthographic information is used during written and fingerspelled word recognition by three groups of deaf readers: 1) adult readers of English, 2) adult readers of Spanish, and 3) young readers of Spanish. I also investigated whether the size of the orthographic and phonological effects was related to reading skill and other related variables: vocabulary, phonological awareness, speechreading, and fingerspelling abilities. A sandwich masked priming paradigm was used to assess automatic phonological (pseudohomophone priming; Experiments 1–3) and orthographic (transposed-letter priming; Experiments 4–6) effects in all groups during recognition of single written words. To examine fingerspelling processing, pseudohomophone (Experiments 7–9) and transposed-letter (Experiments 10–12) effects were examined in lexical decision tasks with fingerspelled video stimuli. Phonological priming effects were found for adult deaf readers of English. Interestingly, among deaf readers of Spanish, only young readers with a small vocabulary showed phonological priming. Conversely, orthographic masked priming was found in adult deaf readers of English and Spanish as well as in young deaf readers with a large vocabulary. Reading ability correlated only with the orthographic priming effect (in accuracy) in the adult deaf readers of English. Fingerspelled pseudohomophones took longer than control pseudowords to reject as words for the adult deaf readers of English and for the young deaf readers of Spanish with a small vocabulary, suggesting sensitivity to speech phonology in these groups. The findings suggest greater reliance on phonology by less skilled deaf readers of both Spanish and English. They also suggest greater reliance on phonology during both word and fingerspelling processing by deaf readers of a language with a deeper orthography (English) than by expert readers of a shallow orthography (Spanish).
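    For readers unfamiliar with the paradigm, the following is a minimal sketch of the event sequence in a single sandwich-masked-priming lexical decision trial. The durations and the prime/target pair are typical illustrative values, not the thesis's actual parameters.

```python
# Minimal sketch: one sandwich-masked-priming trial as a timed event sequence.
from dataclasses import dataclass

@dataclass
class Event:
    stimulus: str
    duration_ms: int  # 0 means "displayed until the word/non-word response"

def sandwich_priming_trial(prime: str, target: str) -> list[Event]:
    return [
        Event("#####", 500),          # forward mask
        Event(target.lower(), 33),    # brief target preview (the "sandwich" step)
        Event(prime.lower(), 50),     # masked prime (e.g. a pseudohomophone)
        Event(target.upper(), 0),     # target, shown until lexical decision
    ]

# Illustrative pseudohomophone prime for the target BRAIN.
for ev in sandwich_priming_trial(prime="brane", target="BRAIN"):
    label = f"{ev.duration_ms} ms" if ev.duration_ms else "until response"
    print(f"{ev.stimulus:>8}  {label}")
```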

    SMILE Swiss German Sign Language Dataset


    ASL video Corpora & Sign Bank: resources available through the American Sign Language Linguistic Research Project (ASLLRP)

    The American Sign Language Linguistic Research Project (ASLLRP) provides Internet access to high-quality ASL video data, generally including front and side views and a close-up of the face. The manual and non-manual components of the signing have been linguistically annotated using SignStream®. The recently expanded video corpora can be browsed and searched through the Data Access Interface (DAI 2) we have designed; it is possible to carry out complex searches. The data from our corpora can also be downloaded; annotations are available in an XML export format. We have also developed the ASLLRP Sign Bank, which contains almost 6,000 sign entries for lexical signs with distinct English-based glosses, for a total of 41,830 examples of lexical signs (in addition to about 300 gestures, over 1,000 fingerspelled signs, and 475 classifier examples). The Sign Bank is likewise accessible and searchable on the Internet; it can also be accessed from within SignStream® (software to facilitate linguistic annotation and analysis of visual language data) to make annotations more accurate and efficient. Here we describe the available resources. These data have been used for many types of research in linguistics and in computer-based sign language recognition from video; examples of such research are provided in the latter part of this article.
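    A minimal sketch of consuming such an XML annotation export with Python's standard library is shown below. The element and attribute names used here are hypothetical placeholders, not the real SignStream®/DAI 2 schema, which should be taken from the ASLLRP documentation.

```python
# Minimal sketch: read gloss annotations from an XML export.
# "UTTERANCE", "SIGN", "gloss", "start", and "end" are hypothetical names.
import xml.etree.ElementTree as ET

def load_glosses(xml_path: str):
    """Yield (gloss, start_frame, end_frame) tuples from an export file."""
    root = ET.parse(xml_path).getroot()
    for utt in root.iter("UTTERANCE"):
        for sign in utt.iter("SIGN"):
            yield (sign.get("gloss"),
                   int(sign.get("start")),
                   int(sign.get("end")))

# Usage (path is a placeholder):
# for gloss, start, end in load_glosses("asllrp_export.xml"):
#     print(gloss, start, end)
```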

    Learning outcomes for American Sign Language skill levels 1-4

    This document describes measurable learning outcomes for American Sign Language (ASL) levels 1–4. A history of ASL provides the background and foundation for the document and includes an overview of teaching and learning ASL in the United States. The processes leading to the creation of the outcomes for ASL levels 1–4 are discussed and incorporate the development of ASL outcomes for college-level courses. Information about how the outcomes were adapted was taken, with permission, from the American Council on the Teaching of Foreign Languages (ACTFL). ACTFL's "5 Cs" (Communication, Cultures, Connections, Comparisons, and Communities) form the document's key premise and are highlighted throughout. Recommendations by the American Sign Language Teachers Association (ASLTA) and stakeholders in New York State are included, along with the number and content of instructional contact hours in a supervised language laboratory. The measurable learning outcomes, organized around ACTFL's 5 Cs, make up the majority of the document. Acknowledging that each teacher is unique and has his or her own teaching style, the goals and objectives for measuring student progress must nevertheless be met regardless of that style. References, a resource section, and a reading section are included, as well as appendices with a glossary and information pertaining to ASL performance interviews.

    Efficient Kinect Sensor-based Kurdish Sign Language Recognition Using Echo System Network

    Sign language assists in building communication and bridging gaps in understanding. Automatic sign language recognition (ASLR) has recently been studied for various sign languages. However, Kurdish Sign Language (KuSL) is relatively new, so research on it and datasets for it are limited. This paper proposes a model to translate KuSL into text and presents a dataset recorded with the Kinect V2 sensor. The computational complexity of the feature extraction and classification steps, a serious problem for ASLR, is investigated. The paper proposes a feature engineering approach based on skeleton positions alone, which provides a better representation of the features and avoids using the full image information. The proposed model builds on recurrent neural networks (RNNs). Training RNNs is inherently difficult, which motivates the investigation of alternatives. Besides a trainable long short-term memory (LSTM) network, this study proposes an untrained, low-complexity echo system network (ESN) classifier. Both the LSTM and the ESN reach accuracies that outperform those reported in state-of-the-art studies. In addition, the ESN, which has not previously been applied to this task, achieves accuracy comparable to the LSTM with a significantly lower training time.
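    To illustrate why an untrained reservoir keeps training cheap, here is a minimal ESN-style sketch over skeleton-frame sequences: a fixed random reservoir summarizes each sequence, and only a ridge-regression readout is trained. The reservoir size, leak rate, input dimensionality, and readout are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch: echo-state-style reservoir classification of skeleton sequences.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, n_classes = 75, 300, 10     # e.g. 25 joints x 3 coords per frame

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # keep spectral radius < 1

def reservoir_state(seq: np.ndarray, leak: float = 0.3) -> np.ndarray:
    """Run a (T, n_in) skeleton sequence through the fixed (untrained) reservoir."""
    x = np.zeros(n_res)
    for u in seq:
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
    return x                                        # final state summarizes the sequence

def train_readout(states: np.ndarray, labels: np.ndarray, ridge: float = 1e-2):
    """Ridge-regression readout: the only trained part of the ESN."""
    Y = np.eye(n_classes)[labels]                   # one-hot targets
    return np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                           states.T @ Y)

# Toy usage with random sequences standing in for skeleton data.
seqs = [rng.standard_normal((40, n_in)) for _ in range(20)]
labels = rng.integers(0, n_classes, 20)
S = np.stack([reservoir_state(s) for s in seqs])
W_out = train_readout(S, labels)
print("predicted class:", int(np.argmax(reservoir_state(seqs[0]) @ W_out)))
```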

    MirrorGen Wearable Gesture Recognition using Synthetic Videos

    In recent years, deep learning systems have outperformed traditional machine learning systems in most domains. There has recently been considerable research on hand gesture recognition using wearable sensors, owing to the numerous advantages these systems have over vision-based ones. However, the lack of extensive datasets and the nature of Inertial Measurement Unit (IMU) data make it difficult to apply deep learning techniques to them. Although many machine learning models achieve good accuracy, most assume that training data is available for every user, while approaches that do not require per-user data have lower accuracy. MirrorGen is a technique that uses wearable sensor data to generate synthetic videos of hand movements, mitigating the traditional challenges of vision-based recognition such as occlusion, lighting restrictions, lack of viewpoint variation, and environmental noise. In addition, MirrorGen allows for user-independent recognition with minimal human effort during data collection. It also leverages advances in vision-based recognition through techniques such as optical flow extraction and 3D convolution. Projecting the orientation (IMU) information into a video helps recover position information of the hands. To validate these claims, we perform entropy analysis on various configurations: raw data, a stick model, a hand model, and real video. The human hand model is found to have an optimal entropy that helps achieve user-independent recognition, and it serves as a more pervasive option than direct video-based recognition. An average user-independent recognition accuracy of 99.03% was achieved for a sign language dataset with 59 different users and 20 different signs, with 20 repetitions each, for a total of 23k training instances. Moreover, synthetic videos can be used to augment real videos to improve recognition accuracy.
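    A minimal sketch of the core idea, projecting IMU orientation samples into synthetic stick-model frames, is given below. The forearm length, the simple orthographic projection, and the quaternion layout (w, x, y, z) are illustrative assumptions, not the thesis's actual rendering pipeline.

```python
# Minimal sketch: turn wrist-IMU orientation samples into stick-model video frames.
import numpy as np
import cv2

def quat_to_matrix(q: np.ndarray) -> np.ndarray:
    """Rotation matrix from a unit quaternion (w, x, y, z)."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def render_stick_frame(q: np.ndarray, size: int = 256) -> np.ndarray:
    """Draw the elbow-to-wrist segment implied by one IMU orientation sample."""
    frame = np.zeros((size, size, 3), np.uint8)
    elbow = (size // 2, size // 2)
    forearm = quat_to_matrix(q) @ np.array([0.0, -80.0, 0.0])        # 80-px forearm
    wrist = (int(elbow[0] + forearm[0]), int(elbow[1] + forearm[1]))  # drop depth
    cv2.line(frame, elbow, wrist, (255, 255, 255), 3)
    cv2.circle(frame, wrist, 6, (0, 255, 0), -1)
    return frame

# Example: a short synthetic "video" from interpolated orientations about the z-axis.
frames = [render_stick_frame(np.array([np.cos(t), 0.0, 0.0, np.sin(t)]))
          for t in np.linspace(0, np.pi / 4, 30)]
print(len(frames), frames[0].shape)
```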