Automatic recognition of fingerspelled words in British Sign Language
We investigate the problem of recognizing words from
video, fingerspelled using the British Sign Language (BSL)
fingerspelling alphabet. This is a challenging task since the
BSL alphabet involves both hands occluding each other, and
contains signs which are ambiguous from the observer’s
viewpoint. The main contributions of our work include:
(i) recognition based on hand shape alone, not requiring
motion cues; (ii) robust visual features for hand shape
recognition; (iii) scalability to large lexicon recognition
with no re-training.
We report results on a dataset of 1,000 low-quality webcam
videos of 100 words. The proposed method achieves a
word recognition accuracy of 98.9%.
Continuous Action Recognition Based on Sequence Alignment
Continuous action recognition is more challenging than isolated recognition
because classification and segmentation must be simultaneously carried out. We
build on the well known dynamic time warping (DTW) framework and devise a novel
visual alignment technique, namely dynamic frame warping (DFW), which performs
isolated recognition based on per-frame representation of videos, and on
aligning a test sequence with a model sequence. Moreover, we propose two
extensions that allow recognition to be performed concurrently with segmentation,
namely one-pass DFW and two-pass DFW. These two methods have their roots in the
domain of continuous recognition of speech and, to the best of our knowledge,
their extension to continuous visual action recognition has been overlooked. We
test and illustrate the proposed techniques with a recently released dataset
(RAVEL) and with two public-domain datasets widely used in action recognition
(Hollywood-1 and Hollywood-2). We also compare the performance of the proposed
isolated and continuous recognition algorithms with several recently published
methods.
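
As an illustration of the dynamic time warping framework that the abstract above builds on, the following is a minimal sketch of classic DTW used for isolated recognition by nearest template. It is not the authors' dynamic frame warping: the function names, the scalar "frames", and the absolute-difference frame cost are assumptions chosen only to keep the example small.

```python
from math import inf


def dtw_distance(seq_a, seq_b, frame_cost):
    """Classic DTW: minimal cumulative cost of aligning seq_a with seq_b."""
    n, m = len(seq_a), len(seq_b)
    # acc[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_cost(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in seq_a, advance in seq_b, or both.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def classify(test_seq, templates, frame_cost):
    """Isolated recognition: label of the template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(test_seq, templates[label], frame_cost))


def abs_diff(x, y):
    # Hypothetical per-frame cost; real frames would be feature vectors.
    return abs(x - y)


# Toy one-dimensional "videos", one template per action label (made up).
templates = {"wave": [0, 1, 2, 1, 0], "push": [0, 2, 4, 4, 4]}
print(classify([0, 0, 1, 2, 2, 1, 0], templates, abs_diff))
```

The test sequence is a time-stretched copy of the "wave" template, so its alignment cost to that template is zero. Continuous recognition, as described in the abstract, would additionally have to decide where one action ends and the next begins, which this sketch does not attempt.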
Video-based Sign Language Recognition without Temporal Segmentation
Millions of hearing-impaired people around the world routinely use some
variant of sign language to communicate, so the automatic translation of
sign language is meaningful and important. Currently, there are two
sub-problems in Sign Language Recognition (SLR), i.e., isolated SLR that
recognizes word by word and continuous SLR that translates entire sentences.
Existing continuous SLR methods typically utilize isolated SLRs as building
blocks, with an extra layer of preprocessing (temporal segmentation) and
another layer of post-processing (sentence synthesis). Unfortunately, temporal
segmentation itself is non-trivial and inevitably propagates errors into
subsequent steps. Worse still, isolated SLR methods typically require strenuous
labeling of each word separately in a sentence, severely limiting the amount of
attainable training data. To address these challenges, we propose a novel
continuous sign recognition framework, the Hierarchical Attention Network with
Latent Space (LS-HAN), which eliminates the preprocessing of temporal
segmentation. The proposed LS-HAN consists of three components: a two-stream
Convolutional Neural Network (CNN) for video feature representation generation,
a Latent Space (LS) for semantic gap bridging, and a Hierarchical Attention
Network (HAN) for latent space based recognition. Experiments are carried out
on two large scale datasets. Experimental results demonstrate the effectiveness
of the proposed framework.
Comment: 32nd AAAI Conference on Artificial Intelligence (AAAI-18), Feb. 2-7,
2018, New Orleans, Louisiana, US
New Method for Optimization of License Plate Recognition system with Use of Edge Detection and Connected Component
License plate recognition plays an important role in traffic monitoring
and parking management systems. This paper proposes a fast, real-time method
that is suited to locating tilted and poor-quality plates. The method first
converts the image to binary using an adaptive threshold. Edge detection and
morphological operations are then used to locate the plate number. Finally,
if the plate is tilted, the tilt is corrected. The method was tested on a
data set from another paper, containing images that vary in background,
distance, and viewing angle, and achieved a correct plate extraction rate
of 98.66%.
Comment: 3rd IEEE International Conference on Computer and Knowledge
Engineering (ICCKE 2013), October 31 & November 1, 2013, Ferdowsi Universit
Mashha
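
The binarization and connected-component steps named in the abstract and title above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the threshold rule, so a local-mean threshold is assumed, 4-connectivity is assumed for the components, and the edge detection, morphology, and tilt-correction steps are omitted. All function names and the toy image are hypothetical.

```python
from collections import deque


def adaptive_threshold(gray, radius=1, offset=0):
    """Set a pixel to 1 if it is brighter than its local mean minus offset (assumed rule)."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood clipped at the image border.
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] > local_mean - offset else 0
    return out


def connected_components(binary):
    """Return bounding boxes (x_min, y_min, x_max, y_max) of 4-connected foreground regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


# Toy grayscale image (made up): two bright blobs on a dark background.
gray = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 10, 10, 10, 200],
    [10, 10, 10, 10, 200],
]
binary = adaptive_threshold(gray)
print(connected_components(binary))
```

In a plate-localization setting, the resulting bounding boxes could then be filtered by size and aspect ratio to keep plate-like or character-like regions; the abstract does not say which criteria the authors use.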