13 research outputs found

    A Novel Vision based Finger-writing Character Recognition System


    The impact of geometric and motion features on sign language translators

    A Malaysian Sign Language (MSL) recognition system is one way of augmenting communication between the hearing-impaired and hearing communities in Malaysia. Automatic translators can play an important role as an alternative communication method, allowing hearing people to understand the hearing-impaired. Automatic translation using bare hands and natural gesture signing is a challenging problem in machine learning. Researchers have used electronic and coloured gloves to address three main issues in the preprocessing steps before the sign recognition stage. The first issue is differentiating the two hands from other objects, referred to as hand detection. The second is describing the detected hand and its motion trajectory in detail, referred to as the feature extraction stage. The third is finding the start and end of each sign (the transitions between signs). This paper focuses on the second issue, feature extraction, by studying the impact of the dimensionality of the feature vectors. Signs with similar attributes were deliberately chosen to highlight the importance of the feature extraction stage. The study also examines the capability of the Hidden Markov Model (HMM) to differentiate between signs with similar attributes.
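The abstract gives no implementation details, but the HMM-based classification it describes can be illustrated with a minimal sketch: one discrete-output HMM per sign, with an unknown feature sequence assigned to the sign whose model yields the highest forward-algorithm likelihood. All parameter values below are invented for illustration, not taken from the paper.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete-output HMM.

    pi: (S,) initial state probabilities
    A:  (S, S) transition matrix, A[i, j] = P(state j | state i)
    B:  (S, V) emission matrix, B[i, v] = P(symbol v | state i)
    """
    alpha = pi * B[:, obs[0]]           # joint prob. of first symbol and state
    log_p = np.log(alpha.sum())
    alpha /= alpha.sum()                # rescale to avoid numerical underflow
    for symbol in obs[1:]:
        alpha = (alpha @ A) * B[:, symbol]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha /= scale
    return log_p

# Two toy sign models over a 2-symbol feature alphabet (made-up parameters):
pi = np.array([0.5, 0.5])
A = np.array([[0.7, 0.3], [0.3, 0.7]])
B_sign_a = np.array([[0.9, 0.1], [0.8, 0.2]])   # mostly emits symbol 0
B_sign_b = np.array([[0.1, 0.9], [0.2, 0.8]])   # mostly emits symbol 1

obs = [0, 0, 1, 0, 0]                            # quantized feature sequence
scores = {
    "sign_a": forward_log_likelihood(obs, pi, A, B_sign_a),
    "sign_b": forward_log_likelihood(obs, pi, A, B_sign_b),
}
best = max(scores, key=scores.get)               # -> "sign_a"
```

Similar signs differ only slightly in their emission statistics, which is exactly where richer, higher-dimensional feature vectors help the models separate.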

    Dynamic approach for real-time skin detection

    Human face and hand detection, recognition and tracking are important research areas for many computer-interaction applications. The face and hands are treated as human skin blobs, which fall into a compact region of colour space. Limitations arise from the fact that human skin shares common properties and can be defined in various colour spaces after colour normalization; a general model must therefore accept a wide range of colours, making it more susceptible to noise. We address this problem and propose that skin colour be defined separately for every person, which is expected to reduce errors. To detect human skin-colour pixels and decrease the number of false alarms, a prior face or hand detection model is developed using Haar-like features and the AdaBoost technique. To reduce computational cost, a fast search algorithm for skin detection is proposed. The level of performance reached in terms of detection accuracy and processing time makes this approach an adequate choice for real-time skin blob tracking.
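The per-person idea above can be sketched minimally: sample chroma values from an already-detected face patch, fit a personal per-channel range, then threshold the rest of the frame against that range. The function names, the (Cr, Cb) choice of channels, and the `k` width are assumptions for illustration, not the paper's exact model.

```python
import numpy as np

def fit_personal_skin_model(face_pixels, k=2.5):
    """Fit a per-person skin range from chroma samples of a detected face.

    face_pixels: (N, 2) array of e.g. (Cr, Cb) values from the face region.
    Returns per-channel (low, high) bounds at mean +/- k standard deviations.
    """
    mu = face_pixels.mean(axis=0)
    sigma = face_pixels.std(axis=0)
    return mu - k * sigma, mu + k * sigma

def skin_mask(chroma_image, low, high):
    """Boolean mask of pixels whose chroma falls inside the personal range."""
    return np.all((chroma_image >= low) & (chroma_image <= high), axis=-1)

# Synthetic example: face samples clustered around (150, 110) in (Cr, Cb).
rng = np.random.default_rng(0)
face_samples = rng.normal([150.0, 110.0], [4.0, 4.0], size=(500, 2))
low, high = fit_personal_skin_model(face_samples)

image = np.zeros((2, 2, 2))
image[0, 0] = [151, 109]             # skin-like pixel
image[1, 1] = [90, 200]              # background pixel
mask = skin_mask(image, low, high)   # mask[0, 0] True, mask[1, 1] False
```

Because the bounds are fitted to one person's face, the accepted colour region is much tighter than a universal skin model, which is where the reduction in false alarms comes from.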

    Hand Gesture Recognition for Real Time Human Machine Interaction System


    Using Mobile Phone to Assist DHH Individuals

    Past research on sign language recognition has mostly been based on physical information obtained via wearable devices or depth cameras. However, both types of devices are costly and inconvenient to carry, making it difficult to gain widespread acceptance among potential users. This research uses recently developed deep learning technology to build a recognition model for a Taiwanese version of sign language, restricted to RGB images for training and recognition. It is hoped that this research, which makes use of lightweight devices such as mobile phones and webcams, will make a significant contribution to the communication needs of deaf and hard-of-hearing (DHH) individuals.

    Vision-based hand shape identification for sign language recognition

    This thesis introduces an approach to obtaining image-based hand features that accurately describe hand shapes commonly found in American Sign Language. A hand recognition system capable of identifying 31 hand shapes from American Sign Language was developed to identify hand shapes in a given input image or video sequence. An appearance-based approach with a single camera is used to recognize the hand shape. A region-based shape descriptor, the generic Fourier descriptor, which is invariant to translation, scale, and orientation, is used to describe the shape of the hand. A wrist detection algorithm removes the forearm from the hand region before the features are extracted. Hand shapes are then recognized with a multi-class Support Vector Machine. Testing yielded a recognition rate of approximately 84% on a widely varying testing set of approximately 1,500 images with a training set of about 2,400 images. With a larger training set of approximately 2,700 images and a testing set of approximately 1,200 images, the recognition rate increased to about 88%.
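The generic Fourier descriptor used in the thesis is region-based; as a simpler illustration of the same invariance idea, the sketch below computes a boundary-based Fourier descriptor whose normalized magnitudes are invariant to translation, scale, and rotation. It is not the exact descriptor used in the thesis, only a demonstration of the principle.

```python
import numpy as np

def fourier_descriptor(contour, n_coeffs=8):
    """Boundary Fourier descriptor, invariant to translation/scale/rotation.

    contour: (N, 2) array of (x, y) boundary points ordered along the shape.
    """
    z = contour[:, 0] + 1j * contour[:, 1]   # complex boundary signal
    z = z - z.mean()                         # subtract centroid: translation
    mags = np.abs(np.fft.fft(z))             # magnitudes drop rotation phase
    return mags[1:n_coeffs + 1] / mags[1]    # divide by 1st harmonic: scale

# A circle, and the same circle scaled 3x, rotated, and shifted:
theta = 2 * np.pi * np.arange(64) / 64
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
rot = np.array([[np.cos(0.7), -np.sin(0.7)],
                [np.sin(0.7),  np.cos(0.7)]])
moved = 3.0 * circle @ rot.T + np.array([5.0, -2.0])

d1 = fourier_descriptor(circle)
d2 = fourier_descriptor(moved)
# np.allclose(d1, d2) -> True: the descriptor ignores pose and scale
```

Feature vectors built this way can be fed directly to a multi-class SVM, since two instances of the same hand shape map to nearly identical descriptors regardless of where or how large the hand appears.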

    Detecção facial: autofaces versus antifaces (Face detection: eigenfaces versus antifaces)

    Master's dissertation - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia Elétrica. This work presents a comparative study of two face detection techniques based on vector projections: Eigenfaces and Antifaces. The Eigenfaces method has been studied extensively in recent years, while the Antifaces method is still considered the state of the art for object detection. Both methods are described in detail and, for the Antifaces method, a procedure for obtaining suboptimal detectors is proposed. Both methods are evaluated under identical test conditions, covering the detection of facial features, of three-dimensional objects, and of a specific face viewed frontally. Finally, the sensitivity of the methods to additive white Gaussian noise, focus distortions, and changes in the scene containing the object of interest is analysed. The results show that, for the Antifaces method, the criteria for determining some design variables are not yet well established, and that the method is highly selective during the detection process. The Eigenfaces method offers greater generalization capability and lower sensitivity to added noise, focus distortions, and scene changes.

    The Effects of Visual Affordances and Feedback on a Gesture-based Interaction with Novice Users

    This dissertation studies the roles and effects of visual affordances and feedback in a general-purpose gesture interface for novice users. Gesture interfaces are popularly viewed as intuitive and user-friendly modes of interacting with computers and robots, but they in fact introduce many challenges for users not already familiar with the system. Affordances and feedback, two fundamental building blocks of interface design, are perfectly suited to address the most important challenges and questions for novices using a gesture interface: What can they do? How do they do it? Are they being understood? Has anything gone wrong? Yet gesture interfaces rarely incorporate these features in a deliberate manner, and there are presently no well-adopted guidelines for designing affordances and feedback for gesture interaction, nor any clear understanding of their effects on such an interaction. A general-purpose gesture interaction system was developed based on a virtual touchscreen paradigm and guided by a novel gesture interaction framework. This framework clarifies the relationship between gesture interfaces and the application interfaces they support, and it provides guidance for selecting and designing appropriate affordances and feedback. Using this gesture system, a user study of 40 novices was conducted to evaluate the effects of four categories of affordances and feedback on interaction performance and user satisfaction. The experimental results demonstrated that affordances indicating how to do something in a gesture interaction are more important to interaction performance than affordances indicating what can be done, and that feedback on system status is more important than feedback acknowledging user actions. However, the experiments also showed unexpectedly high interaction performance when affordances and feedback were omitted. The explanation for this result remains an open question, though several potential causes are analyzed and a tentative interpretation is provided. The main contributions of this dissertation to the HRI and HCI research communities are 1) the design of a virtual touchscreen-based interface for general-purpose gesture interaction, serving as a case study for identifying and designing affordances and feedback for gesture interfaces; 2) the method and surprising results of an evaluation of distinct affordance and feedback categories, in particular their effects on a gesture interaction with novice users; and 3) a set of guidelines and insights about the relationship between a user, a gesture interface, and a generic application interface, centered on a novel interaction framework that may be used to design and study other gesture systems. Beyond these intellectual contributions, this work is useful to the general public because it may influence how future assistive robots are designed to interact with people in settings including search and rescue, healthcare and elderly care.

    Computer vision methods for unconstrained gesture recognition in the context of sign language annotation

    This PhD thesis concerns the study of computer vision methods for the automatic recognition of unconstrained gestures in the context of sign language annotation. Sign Language (SL) is a visual-gestural language developed by deaf communities. Continuous SL consists of a sequence of signs performed one after another, involving manual and non-manual features that convey information simultaneously. Even though standard signs are defined in dictionaries, there is huge variability caused by the context-dependency of signs. In addition, signs are often linked by movement epenthesis, the meaningless transitional gesture between signs. This extreme variability and the co-articulation effect pose a challenging problem for automatic SL processing. Numerous annotated video corpora are therefore needed in order to study the language and to train statistical machine translators. The annotation of SL video corpora is generally performed manually by linguists or computer scientists experienced in SL; however, manual annotation is error-prone, unreproducible and extremely time-consuming, and the quality of the results depends on the annotator's knowledge of SL. Combining the annotator's expertise with automatic image processing facilitates the task, increasing robustness and reducing the time required. The goal of this research is the study and development of image processing techniques to assist the annotation of SL video corpora: body-part tracking, hand segmentation, temporal segmentation, and gloss recognition. This thesis addresses the problem of gloss annotation of SL video corpora. First, we detect the limits corresponding to the beginning and end of each sign. This annotation method requires several low-level processing steps to segment the signs and to extract motion and hand shape features. We first propose a particle filter based approach for tracking the hands and face that is robust to occlusions. A segmentation method is then developed to extract the hand region even when the hand is in front of the face. Motion features are used to produce an initial temporal segmentation of the signs, which is subsequently improved using hand shape features; these allow segmentation boundaries detected in the middle of a sign to be removed. Once the signs have been segmented, visual features are extracted and the signs are recognized in terms of glosses using phonological models. We evaluated our algorithms on international corpora in order to show their advantages and limitations. The evaluation demonstrates the robustness of the proposed methods with respect to high dynamics and the large number of occlusions between body parts. The resulting annotation is independent of the annotator and represents a significant gain in annotation consistency.
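The motion-based temporal segmentation described above can be sketched as simple thresholding of tracked hand speed: frames where the hand moves faster than a threshold are grouped into candidate signs, and very short runs are discarded as noise. The threshold and minimum-length values below are invented for illustration, and the shape-based refinement stage from the thesis is omitted.

```python
import numpy as np

def segment_signs(hand_positions, speed_thresh=2.0, min_frames=3):
    """Candidate sign intervals from tracked 2-D hand positions.

    hand_positions: (T, 2) array of per-frame (x, y) hand coordinates.
    Returns a list of (start, end) frame-index pairs (end exclusive).
    """
    # Per-frame hand speed: Euclidean distance between consecutive positions.
    speeds = np.linalg.norm(np.diff(hand_positions, axis=0), axis=1)
    active = speeds > speed_thresh
    segments, start = [], None
    for i, moving in enumerate(active):
        if moving and start is None:
            start = i                       # motion burst begins
        elif not moving and start is not None:
            if i - start >= min_frames:     # drop runs too short to be signs
                segments.append((start, i))
            start = None
    if start is not None and len(active) - start >= min_frames:
        segments.append((start, len(active)))
    return segments

# Still hand, a burst of motion, then still again -> one candidate sign:
positions = np.array(
    [[0, 0], [0, 0], [5, 0], [10, 0], [15, 0], [15, 0], [15, 0]], dtype=float)
print(segment_signs(positions))   # [(1, 4)]
```

In the thesis pipeline, hand-shape features would then be used to merge intervals whose boundary falls inside a single sign, which pure motion thresholding cannot distinguish from a true sign boundary.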