    A Sign-to-Speech Translation System

    This thesis describes sign-to-speech translation using neural networks. Sign language translation is an interesting but difficult problem for which neural network techniques seem promising because of their ability to adapt to the user's hand movements, something most other techniques cannot do. Even with neural networks and artificial sign languages, however, translation remains hard: the best-known system, that of Fels & Hinton (1993), can translate only 66 root words, or 203 words including their conjugations. This research improves on those results, reaching 790 root signs and 2,718 words including conjugations while preserving high translation accuracy (over 93%). The use of matcher neural networks (Revesz 1989, 1990) and asymmetric Hamming distances are the key sources of improvement. This research aims at providing a means of communication for deaf people. Adviser: Peter Z. Revesz
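
    The abstract invokes asymmetric Hamming distances without defining them here, but the underlying idea can be shown in a minimal Python sketch: a Hamming-style distance over binary sign encodings in which the two mismatch directions are penalized differently. The encoding, weights, and function name below are illustrative assumptions, not the thesis's actual design.

        # A minimal sketch of an asymmetric Hamming distance between binary
        # sign encodings: a template feature missing from the observation
        # costs more than a spurious observed feature. Weights are assumed.
        def asymmetric_hamming(observed, template, w_missing=1.0, w_spurious=0.5):
            assert len(observed) == len(template)
            dist = 0.0
            for o, t in zip(observed, template):
                if t == 1 and o == 0:
                    dist += w_missing    # expected feature not seen
                elif t == 0 and o == 1:
                    dist += w_spurious   # extra feature seen
            return dist

        # A candidate sign is matched to the closest stored template.
        templates = {"yes": [1, 0, 1, 1], "no": [0, 1, 1, 0]}
        observed = [1, 0, 1, 0]
        best = min(templates, key=lambda k: asymmetric_hamming(observed, templates[k]))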

    Somatic ABC's: A Theoretical Framework for Designing, Developing and Evaluating the Building Blocks of Touch-Based Information Delivery

    Situations of sensory overload are steadily becoming more frequent as the ubiquity of technology approaches reality, particularly with the advent of socio-communicative smartphone applications and pervasive, high-speed wireless networks. Although the ease of accessing information has improved our communication effectiveness and efficiency, our visual and auditory modalities, the ones today's computerized devices and displays largely engage, have become overloaded, creating possibilities for distraction, delay, and high cognitive load; these in turn can lead to a loss of situational awareness, increasing the chances of life-threatening situations such as texting while driving. Surprisingly, alternative modalities for information delivery have seen little exploration. Touch in particular is a promising candidate, given that the skin is our largest sensory organ, with impressive spatial and temporal acuity. Although some approaches to touch-based information delivery have been proposed, they have limitations, including high learning curves, limited applicability, and/or limited expressiveness. This is largely due to the lack of a versatile, comprehensive design theory, specifically one that addresses the design of touch-based building blocks for expandable, efficient, rich, and robust touch languages that are easy to learn and use. Beyond design, implementation and evaluation theories for such languages are also lacking. To overcome these limitations, a unified theoretical framework inspired by natural spoken language is proposed, called Somatic ABC's, for Articulating (designing), Building (developing), and Confirming (evaluating) touch-based languages. To evaluate the usefulness of Somatic ABC's, its design, implementation, and evaluation theories were applied to create communication languages for two very different application areas: audio-described movies and motor learning. These applications were chosen because they presented opportunities to complement communication by offloading information typically conveyed visually and/or aurally to the skin. For both studies, it was found that Somatic ABC's aided the design, development, and evaluation of rich somatic languages with distinct and natural communication units. (Ph.D. dissertation, Computer Science)

    Translating bus information into sign language for deaf people

    This paper describes the application of language translation technologies for generating bus information in Spanish Sign Language (LSE: Lengua de Signos Española). Two main systems have been developed: the first translates text messages from information panels, and the second translates spoken Spanish into natural conversations at the bus company's information point. Both systems are made up of a natural language translator (for converting a word sentence into a sequence of LSE signs) and a 3D avatar animation module (for playing back the signs). For the natural language translator, two technological approaches have been analyzed and integrated: an example-based strategy and a statistical translator. When translating spoken utterances, a speech recognizer must also be incorporated to decode the spoken utterance into a word sequence before the language translation module. This paper includes a detailed description of the field evaluation carried out in this domain, at the customer information office in Madrid, involving both real bus company employees and deaf people. The evaluation includes objective measurements from the system and information from questionnaires. In the field evaluation, the whole translation system presents a Sign Error Rate (SER) of less than 10% and a BLEU score greater than 90%.
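
    The paper's headline figure, a Sign Error Rate below 10%, is the sign-level analogue of word error rate: the edit distance between the reference sign sequence and the system output, normalized by the reference length. A minimal sketch, with invented LSE-style glosses rather than the paper's data:

        # Sign Error Rate (SER): Levenshtein distance between reference and
        # hypothesis sign sequences, divided by the reference length.
        def sign_error_rate(reference, hypothesis):
            m, n = len(reference), len(hypothesis)
            d = [[0] * (n + 1) for _ in range(m + 1)]
            for i in range(m + 1):
                d[i][0] = i
            for j in range(n + 1):
                d[0][j] = j
            for i in range(1, m + 1):
                for j in range(1, n + 1):
                    cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
                    d[i][j] = min(d[i - 1][j] + 1,        # deletion
                                  d[i][j - 1] + 1,        # insertion
                                  d[i - 1][j - 1] + cost) # substitution
            return d[m][n] / max(m, 1)

        ref = ["BUS", "NUMBER", "27", "ARRIVE", "TEN-MINUTES"]
        hyp = ["BUS", "NUMBER", "27", "LEAVE", "TEN-MINUTES"]
        print(sign_error_rate(ref, hyp))  # 0.2, i.e. a 20% SER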

    Designing a Sensor-Based Wearable Computing System for Custom Hand Gesture Recognition Using Machine Learning

    This thesis investigates how assistive technology can be made to facilitate communication for people who are unable to communicate, or have difficulty communicating, via vocal speech, and how this technology can be made more universal and compatible with the many different types of sign language they use. Through this research, a fully customisable, stand-alone wearable device was developed that employs machine learning techniques to recognise individual hand gestures and translate them into text, images, and speech. The device recognises and translates custom hand gestures by training a personal classifier for each user, relying on a small training sample size, and works online on an embedded system or mobile device with a classification accuracy of up to 99%. This was achieved through a series of iterative case studies, with user testing carried out by real users in their everyday environments and in public spaces.
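
    The thesis does not reproduce its model here, but the core idea, a personal classifier trained per user from a small sample, can be sketched with a nearest-neighbour model over summary features of sensor windows. The feature layout, channel count, and choice of k-NN are assumptions for illustration; the device's actual pipeline may differ.

        # A per-user gesture classifier trained from a handful of labelled
        # recordings. k-NN needs no lengthy training phase, which suits
        # small personal training sets on embedded or mobile hardware.
        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        def window_features(window):
            # Summarize one window of raw sensor readings (rows = samples,
            # columns = sensor channels) as a fixed-length feature vector.
            return np.concatenate([window.mean(axis=0), window.std(axis=0)])

        rng = np.random.default_rng(0)  # stand-in for recorded sensor data
        X = [window_features(rng.normal(size=(50, 6))) for _ in range(20)]
        y = ["hello", "thanks", "yes", "no"] * 5

        clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
        print(clf.predict([window_features(rng.normal(size=(50, 6)))]))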

    Assistive technologies for severe and profound hearing loss: beyond hearing aids and implants

    Assistive technologies offer capabilities that were previously inaccessible to individuals with severe and profound hearing loss who have no or limited access to hearing aids and implants. This literature review aims to explore existing assistive technologies and identify what still needs to be done. It is found that there is a lack of focus on the overall objectives of assistive technologies. Several other issues are also identified: only a very small number of assistive technologies developed within a research context have led to commercial devices, there is a predisposition to use the latest expensive technologies, and there is a tendency to avoid designing products universally. Finally, further development of plug-ins that translate the text content of a website into various sign languages is needed to make information on the internet more accessible.

    Machine learning methods for sign language recognition: a critical review and analysis.

    Sign language is an essential tool for bridging the communication gap between hearing and hearing-impaired people. However, the diversity of over 7,000 present-day sign languages, with variability in motion, hand shape, and position of body parts, makes automatic sign language recognition (ASLR) a complex problem. To overcome this complexity, researchers are investigating better ways of developing ASLR systems, seeking intelligent solutions, and have demonstrated remarkable success. This paper analyses the research published on intelligent systems in sign language recognition over the past two decades. A total of 649 publications related to decision support and intelligent systems for sign language recognition (SLR) were extracted from the Scopus database and analysed using the bibliometric software VOSviewer to (1) obtain the temporal and regional distributions of the publications and (2) map the cooperation networks between affiliations and authors and identify productive institutions in this context. Moreover, reviews of techniques for vision-based sign language recognition are presented, and the various feature extraction and classification techniques used in SLR to achieve good results are discussed. The literature review presented in this paper shows the importance of incorporating intelligent solutions into sign language recognition systems and reveals that a perfect intelligent system for sign language recognition is still an open problem. Overall, it is expected that this study will facilitate knowledge accumulation and the creation of intelligent SLR systems, and provide readers, researchers, and practitioners with a roadmap to guide future directions.

    An image processing technique for the improvement of Sign2 using a dual camera approach

    This thesis discusses a non-intrusive translation system that transforms American Sign Language (ASL) into digital text. Among the many techniques being introduced for this purpose, the study takes the relatively less-trodden path of developing an unobtrusive, user-friendly, and straightforward solution. Phase 1 of the Sign2 Project used a single-camera approach to the same end; the present investigation develops a solution that improves the accuracy of the results while following the methodology pursued in Phase 1. The study is restricted to fingerspelling the ASL alphabet, so the only area of concentration is the subject's hand, as opposed to the entire ASL vocabulary, which involves a more complex range of physical movement and intricate gesticulation. The investigation involved three subjects signing the ASL alphabet repetitively; these recordings were later used as a reference to recognize the letters in words signed by the same subjects. Though the subject matter does not differ much from Phase 1, an additional camera is employed as a means to achieve better accuracy. The reasoning behind this approach is to more closely imitate human depth perception: the best and most convincing information about the three-dimensional world is attained by binocular vision, and this theory is exploited here. For the purposes of this study, only one aspect of binocular vision, binocular disparity, is emulated. The analysis demonstrates improved precision in identifying the 'fist' letters. Owing to the small number of subjects and technical snags, the comprehensive body of data is somewhat limited, but this thesis provides a basic foundation on which to build future studies and lays out guidelines for achieving a more complete and successful translation system.
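
    The binocular-disparity principle the thesis emulates reduces to one formula: the depth of a point seen by both cameras is the focal length times the camera baseline divided by the horizontal disparity between the two images. A minimal sketch, with illustrative numbers rather than the study's actual camera setup:

        # Depth from binocular disparity: Z = f * B / d, where f is the
        # focal length in pixels, B the camera baseline in metres, and d
        # the horizontal pixel disparity between the left and right views.
        def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
            disparity = x_left - x_right  # larger disparity = closer point
            if disparity <= 0:
                raise ValueError("disparity must be positive")
            return focal_px * baseline_m / disparity

        # A hand pixel at column 412 in the left image and 396 in the right,
        # with an 800 px focal length and cameras 6 cm apart:
        print(depth_from_disparity(412, 396, focal_px=800, baseline_m=0.06))  # 3.0 m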

    GCTW Alignment for isolated gesture recognition

    In recent years, there has been increasing interest in developing automatic Sign Language Recognition (SLR) systems because Sign Language (SL) is the main mode of communication between deaf people all over the world. However, most people outside the deaf community do not understand SL, creating a communication problem between the two communities. Recognizing signs is challenging because manual signing (leaving aside facial gestures) has four components that must be recognized: handshape, movement, location, and palm orientation. Even though the appearance and meaning of basic signs are well defined in sign language dictionaries, in practice many variations arise from factors such as gender, age, education, or regional, social, and ethnic background, and these variations make it hard to develop a robust SL recognition system. This project introduces the alignment of videos into isolated SLR, given that this approach has not been studied deeply even though it has great potential for correctly recognizing isolated gestures. We also aim for user-independent recognition, meaning the system should achieve good recognition accuracy for signers not represented in the data set. The main features used for the alignment are the wrist coordinates extracted from the videos using OpenPose. These features are aligned using Generalized Canonical Time Warping, and the resulting videos are classified with a 3D CNN. Our experimental results show that the proposed method obtains 65.02% accuracy, which places us 5th in the 2017 ChaLearn LAP isolated gesture recognition challenge, only 2.69% away from first place.
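
    Generalized Canonical Time Warping jointly aligns multiple sequences while projecting them into a shared feature space, which is beyond a short example. As a simplified stand-in, the temporal-alignment step can be illustrated with plain dynamic time warping over two wrist trajectories (this is DTW, not GCTW, and the trajectories below are synthetic):

        # Plain DTW over 2-D wrist trajectories, each of shape (frames, 2).
        # Two performances of the same sign at different speeds align to a
        # low total cost despite having different lengths.
        import numpy as np

        def dtw_cost(a, b):
            n, m = len(a), len(b)
            acc = np.full((n + 1, m + 1), np.inf)
            acc[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = np.linalg.norm(a[i - 1] - b[j - 1])
                    acc[i, j] = cost + min(acc[i - 1, j],      # stretch a
                                           acc[i, j - 1],      # stretch b
                                           acc[i - 1, j - 1])  # match
            return acc[n, m]

        fast = np.array([[t, np.sin(t)] for t in np.linspace(0, 3, 20)])
        slow = np.array([[t, np.sin(t)] for t in np.linspace(0, 3, 35)])
        print(dtw_cost(fast, slow))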

    Real-time New Zealand sign language translator using convolution neural network

    Over the past quarter of a century, machine learning has played an essential role in the information technology revolution. From predictive web browsing to autonomous vehicles, machine learning has become the heart of all intelligent applications in service today. Image classification through gesture recognition is a subfield that has benefited immensely from these machine learning methods. In particular, a subset of machine learning known as deep learning has exhibited impressive performance in this regard, outperforming conventional approaches such as classical image processing. Advanced deep learning architectures are built on artificial neural networks, particularly convolutional neural networks (CNNs). Deep learning has dominated the field of computer vision since 2012; however, a general criticism of deep learning methods is their dependence on large datasets. To address this criticism, research has focused on data-efficient deep learning methods. The foremost such method is transfer learning, which is carried out with pre-trained networks. In this research, the InceptionV3 pre-trained model has been used to perform transfer learning in a convolutional neural network to implement a real-time New Zealand Sign Language translator. The focus of this research is to introduce a vision-based application that translates New Zealand Sign Language into text by recognizing sign gestures, in order to overcome the communication barriers between the deaf community and the hearing community in New Zealand. As a byproduct of this research, a new dataset for the New Zealand Sign Language alphabet has been created. After training the pre-trained InceptionV3 network with this captured dataset, a prototype of the New Zealand Sign Language translation system was created.
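
    The transfer-learning setup described above, InceptionV3 pre-trained on ImageNet reused as a feature extractor with a new classification head, can be sketched in a few lines of Keras. The 26-class head, input size, and head layers are illustrative assumptions rather than the thesis's exact configuration.

        # InceptionV3 as a frozen ImageNet feature extractor with a new
        # softmax head, trained on the captured NZSL alphabet dataset.
        from tensorflow.keras import layers, models
        from tensorflow.keras.applications import InceptionV3

        base = InceptionV3(weights="imagenet", include_top=False,
                           input_shape=(299, 299, 3))
        base.trainable = False  # keep the pre-trained features fixed

        model = models.Sequential([
            base,
            layers.GlobalAveragePooling2D(),
            layers.Dense(128, activation="relu"),
            layers.Dense(26, activation="softmax"),  # one class per letter
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        # model.fit(train_images, train_labels, ...) on the NZSL dataset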