
    Large-scale learning of sign language by watching TV

    The goal of this work is to automatically learn a large number of signs from sign language-interpreted TV broadcasts. We achieve this by exploiting supervisory information available in the subtitles of the broadcasts. However, this information is both weak and noisy, which leads to a challenging correspondence problem when trying to identify the temporal window of the sign. We make the following contributions: (i) we show that, somewhat counter-intuitively, mouth patterns are highly informative for isolating words in a language for the Deaf, and their co-occurrence with signing can be used to significantly reduce the correspondence search space; and (ii) we develop a multiple instance learning method using an efficient discriminative search, which determines a candidate list for the sign with both high recall and precision. We demonstrate the method on videos from BBC TV broadcasts, and achieve higher accuracy and recall than previous methods, despite using much simpler features.
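    As a rough illustration of the multiple-instance idea above (a minimal sketch with a hypothetical data layout and an off-the-shelf linear classifier, not the paper's exact discriminative search): each positive 'bag' holds the candidate temporal windows around a subtitle occurrence of the target word, and training alternates between picking the highest-scoring window in each bag and refitting the classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def mil_sign_windows(positive_bags, negative_windows, n_iters=5):
    """Toy multiple-instance loop.
    positive_bags: list of (n_candidates x n_features) arrays, one per subtitle hit.
    negative_windows: (n_neg x n_features) windows known not to contain the sign."""
    # Start from the middle candidate window of each bag.
    positives = np.stack([bag[len(bag) // 2] for bag in positive_bags])
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_iters):
        X = np.vstack([positives, negative_windows])
        y = np.concatenate([np.ones(len(positives)), np.zeros(len(negative_windows))])
        clf.fit(X, y)
        # Re-select the highest-scoring candidate window within each positive bag.
        positives = np.stack([bag[np.argmax(clf.decision_function(bag))]
                              for bag in positive_bags])
    return clf, positives
```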

    A new framework for sign language recognition based on 3D handshape identification and linguistic modeling

    Current approaches to sign recognition by computer generally have at least some of the following limitations: they rely on laboratory conditions for sign production, are limited to a small vocabulary, rely on 2D modeling (and therefore cannot deal with occlusions and off-plane rotations), and/or achieve limited success. Here we propose a new framework that (1) provides a new tracking method less dependent than others on laboratory conditions and able to deal with variations in background and skin regions (such as the face, forearms, or other hands); (2) allows for identification of 3D hand configurations that are linguistically important in American Sign Language (ASL); and (3) incorporates statistical information reflecting linguistic constraints in sign production. For purposes of large-scale computer-based sign language recognition from video, the ability to distinguish hand configurations accurately is critical. Our current method estimates the 3D hand configuration to distinguish among 77 hand configurations linguistically relevant for ASL. Constraining the problem in this way makes recognition of 3D hand configuration more tractable and provides the information specifically needed for sign recognition. Further improvements are obtained by incorporation of statistical information about linguistic dependencies among handshapes within a sign derived from an annotated corpus of almost 10,000 sign tokens.
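    To illustrate how corpus-derived linguistic statistics can constrain handshape recognition (a minimal sketch under assumed names and a naive Bayes-style fusion, not the paper's actual model), one can re-rank the 77 ASL hand configurations by combining the appearance-based score with a start-to-end handshape co-occurrence prior:

```python
import numpy as np

def rerank_end_handshapes(appearance_logp, start_shape, cooccurrence_counts, alpha=1.0):
    """appearance_logp: (77,) log-likelihoods of end handshapes from the 3D estimator.
    cooccurrence_counts: (77, 77) counts of (start, end) handshape pairs taken from
    an annotated sign corpus. Returns end-handshape indices, best first."""
    counts = cooccurrence_counts[start_shape] + alpha      # add-alpha smoothing
    prior_logp = np.log(counts / counts.sum())             # P(end shape | start shape)
    combined = appearance_logp + prior_logp                # fuse appearance and prior
    return np.argsort(-combined)
```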

    Gesture and sign language recognition with deep learning


    Automatic recognition of fingerspelled words in British Sign Language

    We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other, and contains signs which are ambiguous from the observer’s viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low-quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%.
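    A small sketch of how recognition can scale to a large lexicon without re-training (hypothetical helper names; the paper's hand-shape features and scoring differ): a per-frame letter classifier produces letter log-probabilities, and each lexicon word is scored by the best monotonic alignment of its letters to the frames, so adding new words only means extending the lexicon.

```python
import numpy as np

def word_score(frame_letter_logp, word, letter_to_idx):
    """Best monotonic alignment of the word's letters to the frames, each letter
    covering at least one frame. frame_letter_logp: (T, 26) log-probabilities."""
    idx = [letter_to_idx[c] for c in word]
    T, L = frame_letter_logp.shape[0], len(idx)
    dp = np.full((L, T), -np.inf)
    dp[0, 0] = frame_letter_logp[0, idx[0]]
    for t in range(1, T):
        dp[0, t] = dp[0, t - 1] + frame_letter_logp[t, idx[0]]
        for l in range(1, L):
            # Either stay on letter l or advance from letter l-1.
            dp[l, t] = max(dp[l, t - 1], dp[l - 1, t - 1]) + frame_letter_logp[t, idx[l]]
    return dp[L - 1, T - 1]

def recognize_word(frame_letter_logp, lexicon, letter_to_idx):
    """Pick the lexicon word whose letters best explain the frame posteriors."""
    return max(lexicon, key=lambda w: word_score(frame_letter_logp, w, letter_to_idx))
```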

    Domain-adaptive discriminative one-shot learning of gestures

    The objective of this paper is to recognize gestures in videos: both localizing the gesture and classifying it into one of multiple classes. We show that the performance of a gesture classifier learnt from a single (strongly supervised) training example can be boosted significantly using a 'reservoir' of weakly supervised gesture examples (and that the performance exceeds learning from the one-shot example or reservoir alone). The one-shot example and weakly supervised reservoir are from different 'domains' (different people, different videos, continuous or non-continuous gesturing, etc.), and we propose a domain adaptation method for human pose and hand shape that enables gesture learning methods to generalise between them. We also show the benefits of using the recently introduced Global Alignment Kernel [12] instead of the standard Dynamic Time Warping that is generally used for time alignment. The domain adaptation and learning methods are evaluated on two large-scale, challenging gesture datasets: one for sign language, and the other for Italian hand gestures. In both cases performance exceeds previously published results, including the best skeleton-classification-only entry in the 2013 ChaLearn challenge.
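    To make the time-alignment comparison concrete, here is a minimal NumPy sketch of standard Dynamic Time Warping next to a Global Alignment Kernel-style recursion (illustrative only; it omits the normalisation details of [12] and is not the paper's feature pipeline):

```python
import numpy as np

def dtw_distance(X, Y):
    """Classic DTW: cost of the single best monotonic alignment between
    sequences X (n x d) and Y (m x d) under squared Euclidean local cost."""
    n, m = len(X), len(Y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.sum((X[i - 1] - Y[j - 1]) ** 2)
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def global_alignment_kernel(X, Y, sigma=1.0):
    """Global Alignment Kernel-style score: sums a Gaussian local similarity over
    all monotonic alignments instead of keeping only the single best path."""
    n, m = len(X), len(Y)
    K = np.zeros((n + 1, m + 1))
    K[0, 0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.exp(-np.sum((X[i - 1] - Y[j - 1]) ** 2) / (2 * sigma ** 2))
            K[i, j] = local * (K[i - 1, j] + K[i, j - 1] + K[i - 1, j - 1])
    return K[n, m]
```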

    Contextual Attention for Hand Detection in the Wild

    We present Hand-CNN, a novel convolutional network architecture for detecting hand masks and predicting hand orientations in unconstrained images. Hand-CNN extends MaskRCNN with a novel attention mechanism to incorporate contextual cues in the detection process. This attention mechanism can be implemented as an efficient network module that captures non-local dependencies between features. This network module can be inserted at different stages of an object detection network, and the entire detector can be trained end-to-end. We also introduce large-scale annotated hand datasets containing hands in unconstrained images for training and evaluation. We show that Hand-CNN outperforms existing methods on the newly collected datasets and the publicly available PASCAL VOC human layout dataset. Data and code: https://www3.cs.stonybrook.edu/~cvl/projects/hand_det_attention
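    For intuition, the kind of non-local dependency captured by such an attention module can be sketched as a generic self-attention block over flattened feature-map positions (an assumption about the general flavour of the mechanism, not Hand-CNN's exact module):

```python
import numpy as np

def nonlocal_attention(features, W_theta, W_phi, W_g):
    """Generic non-local block: every spatial position attends to every other.
    features: (N, C) flattened feature map; W_theta, W_phi, W_g: (C, Cp) projections."""
    theta = features @ W_theta                     # queries  (N, Cp)
    phi = features @ W_phi                         # keys     (N, Cp)
    g = features @ W_g                             # values   (N, Cp)
    logits = theta @ phi.T                         # pairwise affinities (N, N)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over all positions
    return attn @ g                                # non-local context features (N, Cp)
```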

    Contextual Attention for Hand Detection in the Wild

    We present Hand-CNN, a novel convolutional network architecture for detecting hand masks and predicting hand orientations in unconstrained images. Hand-CNN extends MaskRCNN with a novel attention mechanism to incorporate contextual cues in the detection process. This attention mechanism can be implemented as an efficient network module that captures non-local dependencies between features. This network module can be inserted at different stages of an object detection network, and the entire detector can be trained end-to-end. We also introduce a large-scale annotated hand dataset containing hands in unconstrained images for training and evaluation. We show that Hand-CNN outperforms existing methods on several datasets, including our hand detection benchmark and the publicly available PASCAL VOC human layout challenge. We also conduct ablation studies on hand detection to show the effectiveness of the proposed contextual attention module.

    Spotting Agreement and Disagreement: A Survey of Nonverbal Audiovisual Cues and Tools

    While detecting and interpreting temporal patterns of non-verbal behavioral cues in a given context is a natural and often unconscious process for humans, it remains a rather difficult task for computer systems. Nevertheless, it is an important one to achieve if the goal is to realise naturalistic communication between humans and machines. Machines that are able to sense social attitudes like agreement and disagreement and respond to them in a meaningful way are likely to be welcomed by users due to the more natural, efficient and human-centered interaction they are bound to experience. This paper surveys the nonverbal cues that could be present during agreement and disagreement behavioural displays and lists a number of tools that could be useful in detecting them, as well as a few publicly available databases that could be used to train these tools for analysis of spontaneous, audiovisual instances of agreement and disagreement.