
    Linguistically-driven framework for computationally efficient and scalable sign recognition

    We introduce a new general framework for sign recognition from monocular video using limited quantities of annotated data. The novelty of the hybrid framework we describe here is that we exploit state-of-the-art learning methods while also incorporating features based on what we know about the linguistic composition of lexical signs. In particular, we analyze hand shape, orientation, location, and motion trajectories, and then use conditional random fields (CRFs) to combine this linguistically significant information for purposes of sign recognition. Our robust modeling and recognition of these sub-components of sign production allow an efficient parameterization of the sign recognition problem as compared with purely data-driven methods. This parameterization enables a scalable and extendable time-series learning approach that advances the state of the art in sign recognition, as shown by the results reported here for recognition of isolated, citation-form, lexical signs from American Sign Language (ASL).
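
    As a minimal sketch of this kind of approach, the snippet below feeds per-frame linguistic feature dictionaries (hand shape, orientation, location, motion) into a linear-chain CRF via the sklearn-crfsuite library. The feature values, labels, and hyperparameters are illustrative assumptions, not the paper's actual parameterization.

        # Toy per-frame feature dicts standing in for the paper's linguistic
        # features; all values below are made up for illustration.
        import sklearn_crfsuite

        video = [  # one sign clip, two frames
            {"handshape": "B", "orientation": "palm-in", "location": "chin", "motion": "hold"},
            {"handshape": "B", "orientation": "palm-in", "location": "chin", "motion": "arc-down"},
        ]
        labels = ["THANK-YOU", "THANK-YOU"]  # hypothetical per-frame sign labels

        crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
        crf.fit([video], [labels])   # expects lists of sequences
        print(crf.predict([video]))  # -> [['THANK-YOU', 'THANK-YOU']]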

    Scalable ASL sign recognition using model-based machine learning and linguistically annotated corpora

    We report on the high success rates of our new, scalable, computational approach for sign recognition from monocular video, exploiting linguistically annotated ASL datasets with multiple signers. We recognize signs using a hybrid framework combining state-of-the-art learning methods with features based on what is known about the linguistic composition of lexical signs. We model and recognize the sub-components of sign production, with attention to hand shape, orientation, location, and motion trajectories, plus non-manual features, and we combine these within a CRF framework. The effect is to make the sign recognition problem robust, scalable, and feasible with smaller datasets than are required for purely data-driven methods. From a 350-sign vocabulary of isolated, citation-form lexical signs from the American Sign Language Lexicon Video Dataset (ASLLVD), including both 1- and 2-handed signs, we achieve a top-1 accuracy of 93.3% and a top-5 accuracy of 97.9%. The high probability with which we can produce 5 sign candidates that contain the correct result opens the door to potential applications, as it is reasonable to provide a sign lookup functionality that offers the user 5 possible signs, in decreasing order of likelihood, with the user then asked to select the desired sign.
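
    The top-5 lookup idea is straightforward to realize once the recognizer emits a per-sign score vector; a hedged sketch, with a made-up vocabulary and scores:

        import numpy as np

        def top_k_signs(scores, vocabulary, k=5):
            # Return the k most likely signs, highest score first.
            order = np.argsort(scores)[::-1][:k]
            return [(vocabulary[i], float(scores[i])) for i in order]

        # Hypothetical 6-sign vocabulary and model scores, for illustration only.
        vocab = ["BOOK", "HOUSE", "MOTHER", "THANK-YOU", "WATER", "YES"]
        scores = np.array([0.02, 0.05, 0.10, 0.55, 0.20, 0.08])
        print(top_k_signs(scores, vocab))  # THANK-YOU first, then WATER, ...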

    Developing a Prototype to Translate Pakistan Sign Language into Text and Speech While Using Convolutional Neural Networking

    The purpose of the study is to provide a literature review of the work done on sign language in Pakistan and the world. This study also provides a framework of an already developed prototype that translates Pakistan Sign Language into speech and text using convolutional neural networks (CNN), to facilitate communication between unimpaired teachers and deaf learners. Because sign language is rarely taught to unimpaired teachers, they face difficulty in communicating with impaired learners; this communication gap can be filled with the help of such a translation tool. Prior research indicates that a prototype has been developed that can translate English text into sign language, and highlights the need for a tool that translates signs into English text. The current study provides an architectural framework for a Pakistan Sign Language to English text translation tool, showing how different technologies, including deep learning, convolutional neural networks, Python, TensorFlow, NumPy, InceptionV3 with transfer learning, and eSpeak text-to-speech, contribute to the development of a translation tool prototype. Keywords: Pakistan sign language (PSL), sign language (SL), translation, deaf, unimpaired, convolutional neural networks (CNN). DOI: 10.7176/JEP/10-15-18. Publication date: May 31st, 2019.
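
    A minimal sketch of the kind of pipeline described, assuming TensorFlow/Keras: InceptionV3 pretrained on ImageNet with a new classification head for sign classes, and the espeak command-line tool for speech output. The class count, head sizes, and espeak invocation are assumptions, not the paper's reported configuration.

        import subprocess
        from tensorflow.keras import layers, models
        from tensorflow.keras.applications import InceptionV3

        NUM_SIGNS = 36  # hypothetical number of PSL classes

        base = InceptionV3(weights="imagenet", include_top=False,
                           input_shape=(299, 299, 3))
        base.trainable = False  # transfer learning: keep ImageNet features frozen

        model = models.Sequential([
            base,
            layers.GlobalAveragePooling2D(),
            layers.Dense(256, activation="relu"),
            layers.Dense(NUM_SIGNS, activation="softmax"),
        ])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

        def speak(text):
            # Speak the recognized text; assumes the espeak binary is installed.
            subprocess.run(["espeak", text], check=True)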

    Fully Convolutional Networks for Continuous Sign Language Recognition

    Continuous sign language recognition (SLR) is a challenging task that requires learning on both the spatial and temporal dimensions of signing frame sequences. Most recent work accomplishes this by using CNN and RNN hybrid networks. However, training these networks is generally non-trivial, and most of them fail to learn unseen sequence patterns, causing unsatisfactory performance for online recognition. In this paper, we propose a fully convolutional network (FCN) for online SLR that concurrently learns spatial and temporal features from weakly annotated video sequences, with only sentence-level annotations given. A gloss feature enhancement (GFE) module is introduced in the proposed network to enforce better sequence alignment learning. The proposed network is end-to-end trainable without any pre-training. We conduct experiments on two large-scale SLR datasets. Experiments show that our method for continuous SLR is effective and performs well in online recognition. (Comment: Accepted to ECCV 2020.)
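
    One common way to train from sentence-level annotations alone is a temporal convolutional stack with a CTC loss; the sketch below, in PyTorch, illustrates that general recipe (it is not the paper's architecture, and the GFE module is omitted). Feature dimensions, gloss vocabulary size, and sequence lengths are made up.

        import torch
        import torch.nn as nn

        class TemporalFCN(nn.Module):
            # Fully convolutional over time: no recurrence, so it can run online.
            def __init__(self, feat_dim=512, num_glosses=1000):
                super().__init__()
                self.conv = nn.Sequential(
                    nn.Conv1d(feat_dim, 512, kernel_size=5, padding=2), nn.ReLU(),
                    nn.Conv1d(512, 512, kernel_size=5, padding=2), nn.ReLU(),
                )
                self.head = nn.Conv1d(512, num_glosses + 1, kernel_size=1)  # +1: CTC blank

            def forward(self, x):               # x: (batch, feat_dim, time)
                return self.head(self.conv(x))  # (batch, num_glosses + 1, time)

        model = TemporalFCN()
        frames = torch.randn(2, 512, 100)  # 2 clips, 100 frames of features each
        log_probs = model(frames).permute(2, 0, 1).log_softmax(-1)  # (time, batch, classes)

        ctc = nn.CTCLoss(blank=0)
        glosses = torch.randint(1, 1001, (2, 12))  # sentence-level gloss IDs
        loss = ctc(log_probs, glosses,
                   input_lengths=torch.full((2,), 100),
                   target_lengths=torch.full((2,), 12))
        loss.backward()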