
    A convolutional neural network to classify American Sign Language fingerspelling from depth and colour images

    Sign language is used by approximately 70 million people throughout the world, and an automatic tool for interpreting it could have a major impact on communication between those who use it and those who do not understand it. However, computer interpretation of sign language is very difficult given the variability in the size, shape and position of the fingers or hands in an image. Hence, this paper explores the applicability of deep learning to interpreting sign language. The paper develops a convolutional neural network aimed at classifying fingerspelling images using both image intensity and depth data. The developed network is evaluated by applying it to fingerspelling recognition for American Sign Language. The evaluation shows that the developed network performs better than previous studies, with a precision of 82% and a recall of 80%. Analysis of the confusion matrix from the evaluation reveals the underlying difficulties of classifying some particular signs, which are discussed in the paper.
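    A common way to combine intensity and depth data, as this abstract describes, is to stack the depth map as an extra channel alongside the colour channels before feeding the network. The sketch below is a generic illustration of that input preparation, not the paper's actual pipeline; the array sizes and normalisation choice are assumptions.

```python
import numpy as np

def stack_rgbd(rgb, depth):
    """Stack an HxWx3 colour image and an HxW depth map into an HxWx4 input.

    The depth map is min-max normalised to [0, 1] so it sits on a scale
    comparable to (assumed already-normalised) colour channels.
    """
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    return np.concatenate([rgb.astype(np.float32), d[..., None]], axis=-1)

rgb = np.random.rand(64, 64, 3).astype(np.float32)   # toy colour image
depth = np.random.rand(64, 64).astype(np.float32)    # toy depth map
x = stack_rgbd(rgb, depth)
print(x.shape)  # (64, 64, 4)
```

    A CNN trained on such inputs simply uses 4 input channels in its first convolutional layer instead of 3.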

    Advanced Capsule Networks via Context Awareness

    Capsule Networks (CN) offer a new architecture for the Deep Learning (DL) community. Although their effectiveness has been demonstrated on the MNIST and smallNORB datasets, the networks still face challenges on other datasets whose images have more distinct contexts. In this research, we improve the design of the CN (vector version): we add more pooling layers to filter image backgrounds and more reconstruction layers to improve image restoration. Additionally, we perform experiments comparing the accuracy and speed of CNs versus DL models. Among the DL models, we utilize Inception V3 and DenseNet V201 for powerful computers, and NASNet, MobileNet V1 and MobileNet V2 for small and embedded devices. We evaluate our models on a fingerspelling alphabet dataset from American Sign Language (ASL). The results show that CNs perform comparably to DL models while dramatically reducing training time. We also provide a demonstration and a link for illustration purposes.

    Gesture Recognition of RGB and RGB-D Static Images Using Convolutional Neural Networks

    Interaction between humans and computers has always been a fascinating field. With rapid development in computer vision, gesture-based recognition systems have become an interesting and diverse topic, though recognizing human gestures in the form of sign language remains a complex and challenging task. Various traditional methods have recently been used for sign language recognition, but achieving high accuracy is still a challenge. This paper proposes an RGB and RGB-D static gesture recognition method using a fine-tuned VGG19 model. The fine-tuned VGG19 model uses a layer that concatenates features from the RGB and RGB-D images to increase the accuracy of the network. The authors implemented the proposed model on an American Sign Language (ASL) recognition dataset, achieved a 94.8% recognition rate, and compared the model with other CNN and traditional algorithms on the same dataset.
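    The feature-concatenate step this abstract describes can be sketched in isolation: features from two network branches (one per modality) are joined along the feature axis before the classifier head. The vectors and dimensions below are stand-ins for the VGG19 branch outputs, not values from the paper.

```python
import numpy as np

# Hypothetical pooled feature vectors from two fine-tuned branches:
# one fed the RGB image, the other the depth (RGB-D) image.
rgb_features = np.random.rand(1, 512).astype(np.float32)
depth_features = np.random.rand(1, 512).astype(np.float32)

# The feature-concatenate layer joins the two along the feature axis,
# so the classifier head sees both modalities at once.
fused = np.concatenate([rgb_features, depth_features], axis=1)
print(fused.shape)  # (1, 1024)
```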

    Classification of Finger Spelling American Sign Language Using Convolutional Neural Network

    Sign language is a combination of complex hand movements, body postures, and facial expressions. However, only a limited number of people can understand and use it. To reduce this burden, a computer-aided sign language recognition system for the fingerspelling style, built on a convolutional neural network (CNN), is proposed. We compared two CNN architectures, ResNet-50 and DenseNet-121, on classifying an American Sign Language dataset, and also tested several data-splitting proportions. The experimental results show that the ResNet-50 architecture with an 80:20 training/testing split performs best, with an accuracy of 0.999913, sensitivity of 0.998966, precision of 0.998958, specificity of 0.999955, F1-score of 0.999913, and error of 0.0000898.
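    The metrics this abstract reports (accuracy, sensitivity, precision, specificity, F1-score) all derive from per-class confusion counts. As a minimal sketch with made-up counts, not the paper's data:

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from binary (one-vs-rest) confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)          # also called recall
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, precision, specificity, f1

# Illustrative counts only.
acc, sens, prec, spec, f1 = classification_metrics(tp=95, fp=5, fn=5, tn=95)
print(round(acc, 3), round(f1, 3))  # 0.95 0.95
```

    For a multi-class problem such as an ASL alphabet, these counts are typically computed one-vs-rest per class and then averaged.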

    AMERICAN SIGN LANGUAGE FINGERSPELLING USING HYBRID DISCRETE WAVELET TRANSFORM-GABOR FILTER AND CONVOLUTIONAL NEURAL NETWORK

    American Sign Language (ASL) is widely used for communication by deaf and mute people. In fingerspelling, the letters of the writing system are represented using only the hands. Hearing people generally do not understand sign language, which creates a communication gap between the signer and speaker communities. A real-time ASL fingerspelling recognizer can be developed to address this problem; a sign language recognizer can also be trained for other applications such as human-computer interaction. In this paper, a hybrid Discrete Wavelet Transform-Gabor filter is applied to colour images to extract features. Classifiers are evaluated on signer-dependent and signer-independent datasets; considering signer dependency is very important for evaluation. Random Forest, Support Vector Machine and K-Nearest Neighbors classifiers are evaluated on the extracted features to classify the 24 classes of ASL alphabets, achieving 95.8%, 94.3% and 96.7% accuracy respectively on the signer-dependent dataset, and 49.16%, 48.75% and 50.83% accuracy respectively on the signer-independent dataset. Lastly, a Convolutional Neural Network was also trained and evaluated on both datasets, producing 97.01% accuracy on the signer-dependent and 76.25% on the signer-independent dataset.
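    To illustrate the wavelet half of the hybrid feature extractor mentioned above, a single-level 2-D Haar DWT can be written directly in NumPy. This is a generic textbook sketch, not the authors' code, and it omits the Gabor-filter stage entirely.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT: returns the approximation (LL) and the
    horizontal (LH), vertical (HL) and diagonal (HH) detail sub-bands,
    each half the size of the input (dimensions assumed even)."""
    a = img[0::2, 0::2].astype(np.float64)  # top-left of each 2x2 block
    b = img[0::2, 1::2].astype(np.float64)  # top-right
    c = img[1::2, 0::2].astype(np.float64)  # bottom-left
    d = img[1::2, 1::2].astype(np.float64)  # bottom-right
    ll = (a + b + c + d) / 2.0   # approximation
    lh = (a + b - c - d) / 2.0   # horizontal detail
    hl = (a - b + c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

img = np.arange(64, dtype=np.float64).reshape(8, 8)  # toy "image"
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (4, 4)
```

    The sub-band coefficients (or statistics computed from them) then serve as the feature vector handed to the classifiers.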

    Vision-Based American Sign Language Classification Approach via Deep Learning

    Hearing impairment, the partial or total loss of hearing, causes significant problems for communication with other people in society. American Sign Language (ASL) is the sign language most commonly used by hearing-impaired communities to communicate with each other. In this paper, we propose a simple deep learning model that classifies American Sign Language letters, as one step toward removing communication barriers related to this disability. Accepted at the Florida AI Research Society conference (FLAIRS-35), 2022.

    Recognizing Handshapes using Small Datasets

    Advances in convolutional neural networks have enabled significant improvements in the state of the art in image classification. However, their success in a particular field rests on the possibility of obtaining labeled data to train the networks. Handshape recognition from images, an important subtask of both gesture and sign language recognition, suffers from such a lack of data; furthermore, hands are highly deformable objects, so handshape classification models require larger datasets. We analyze state-of-the-art models for image classification, as well as data augmentation schemes and models designed specifically for small datasets. In particular, we perform experiments with Wide-DenseNet, a state-of-the-art convolutional architecture, and Prototypical Networks, a state-of-the-art few-shot learning meta-model. In both cases, we also quantify the impact of data augmentation on accuracy. Our results show that on small and simple datasets such as CIARP, all models and variations achieve perfect accuracy, so the utility of the dataset is highly doubtful despite its having 6000 samples. On the other hand, on a small but complex dataset such as LSA16 (800 samples), specialized methods such as Prototypical Networks do have an advantage over other methods. On RWTH, another complex and small dataset with close to 4000 samples, a traditional state-of-the-art method such as Wide-DenseNet surpasses all other models. Also, data augmentation consistently increases accuracy for Wide-DenseNet, but not for Prototypical Networks. XX Workshop de Agentes y Sistemas Inteligentes. Red de Universidades con Carreras en Informática.
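    The Prototypical Networks approach mentioned above classifies a query by its distance to per-class prototypes, each prototype being the mean embedding of that class's support examples. The sketch below shows only this classification rule; the 2-D embeddings are random stand-ins for a learned encoder's output, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical support set: 5 embeddings per class for 3 handshape
# classes, clustered around distinct centres to mimic a trained encoder.
centres = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
support = np.stack([c + 0.1 * rng.standard_normal((5, 2)) for c in centres])

# Class prototypes are the per-class means of the support embeddings.
prototypes = support.mean(axis=1)              # shape (3, 2)

# A query embedding is assigned to the nearest prototype (Euclidean).
query = np.array([4.9, 0.2])
dists = np.linalg.norm(prototypes - query, axis=1)
pred = int(np.argmin(dists))
print(pred)  # 1, i.e. the class around centre (5, 0)
```

    Because prototypes need only a handful of labeled examples per class, this rule is well suited to the small handshape datasets the abstract discusses.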
