An integrated sign language recognition system
Doctor Educationis
Research has shown that five parameters are required to recognize any sign language gesture: hand shape, location, orientation and motion, as well as facial expressions. The
South African Sign Language (SASL) research group at the University of the Western
Cape has created systems to recognize Sign Language gestures using single parameters.
Using a single parameter can cause ambiguities in the recognition of signs that are
similarly signed, restricting the possible vocabulary size. This research
pioneers work at the group towards combining multiple parameters to achieve a larger
recognition vocabulary set. The proposed methodology combines hand location and
hand shape recognition into one combined recognition system. The system is shown to
be able to recognize a very large vocabulary of 50 signs at a high average accuracy of
74.1%. This vocabulary size is much larger than existing SASL recognition systems,
and achieves a higher accuracy than these systems in spite of the large vocabulary. It
is also shown that the system is highly robust to variations in test subjects such as skin
colour, gender and body dimension. Furthermore, the group pioneers research towards
continuously recognizing signs from a video stream, whereas existing systems recognized a single sign at a time. To this end, a highly accurate continuous gesture segmentation strategy is proposed and shown to accurately recognize sentences consisting of five isolated SASL gestures
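The combination of hand location and hand shape described in this abstract can be sketched as a joint scoring of two single-parameter recognizers. The sign names, class labels, and the product-of-probabilities combination rule below are illustrative assumptions, not the thesis's actual method.

```python
# Hypothetical sketch: combining two single-parameter recognizers
# (hand location and hand shape) into one joint prediction, so that
# signs sharing one parameter can be disambiguated by the other.
def combine_parameters(location_probs, shape_probs, signs):
    """signs: dict sign -> (location_class, shape_class).
    Scores each candidate sign by the product of its location and
    shape probabilities and returns the best-scoring sign."""
    def score(sign):
        loc, shape = signs[sign]
        return location_probs.get(loc, 0.0) * shape_probs.get(shape, 0.0)
    return max(signs, key=score)
```

Two signs with the same hand shape but different locations would then receive different joint scores, which is the ambiguity the multi-parameter approach is meant to resolve.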
Integration of a talking head into a Spanish Sign Language synthesizer
This is an electronic version of the paper presented at the Congreso Internacional de Interacción Persona-Ordenador, held in Barcelona in 2009.
In this paper, we present an integration of a talking head
within a Spanish Sign Language synthesizer. The whole system consists
of three different steps: First, the input acoustic signal is transformed
into a sequence of phones by means of a speech recognition process.
This sequence of phones is mapped in a second step to a sequence of
visemes and finally, the resulting sequence of visemes is played by means
of a talking head integrated into the avatar used in the Spanish Sign
Language synthesizer
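The second step of the pipeline, mapping recognized phones to visemes, can be sketched as a many-to-one lookup. The phone symbols and viseme labels below are hypothetical; the actual Spanish phone set and viseme inventory used by the synthesizer are not given in the abstract.

```python
# Minimal sketch of a phone-to-viseme mapping (assumed table).
PHONE_TO_VISEME = {
    "p": "BILABIAL", "b": "BILABIAL", "m": "BILABIAL",
    "f": "LABIODENTAL",
    "t": "DENTAL", "d": "DENTAL",
    "a": "OPEN_VOWEL", "e": "MID_VOWEL", "o": "ROUNDED_VOWEL",
}

def phones_to_visemes(phones):
    """Map a recognized phone sequence to a viseme sequence,
    collapsing consecutive duplicates so the talking head does not
    re-articulate the same mouth shape."""
    visemes = []
    for p in phones:
        v = PHONE_TO_VISEME.get(p, "NEUTRAL")
        if not visemes or visemes[-1] != v:
            visemes.append(v)
    return visemes
```

The resulting viseme sequence would then drive the talking head integrated into the avatar, as the abstract's third step describes.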
Sensor fusion of motion-based sign language interpretation with deep learning
Sign language was designed to allow hearing-impaired people to interact with others. Nonetheless, knowledge of sign language is uncommon in society, which leads to a communication barrier with the hearing-impaired community. Many studies of sign language recognition utilizing computer vision (CV) have been conducted worldwide to reduce such barriers. However, this approach is restricted by the visual angle and highly affected by environmental factors. In addition, CV usually involves the use of machine learning, which requires collaboration of a team of experts and utilization of high-cost hardware utilities; this increases the application cost in real-world situations. Thus, this study aims to design and implement a smart wearable American Sign Language (ASL) interpretation system using deep learning, which applies sensor fusion that “fuses” six inertial measurement units (IMUs). The IMUs are attached to all fingertips and the back of the hand to recognize sign language gestures; thus, the proposed method is not restricted by the field of view. The study reveals that this model achieves an average recognition rate of 99.81% for dynamic ASL gestures. Moreover, the proposed ASL recognition system can be further integrated with ICT and IoT technology to provide a feasible solution to assist hearing-impaired people in communicating with others and improve their quality of life
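The "sensor fusion" step described above can be illustrated as concatenating the six IMU streams into one feature vector per time step before classification. The sensor names and the 6-channel layout (3-axis accelerometer plus 3-axis gyroscope) are assumptions; the paper's exact preprocessing is not given in the abstract.

```python
# Illustrative sketch: readings from six IMUs (five fingertips plus
# the back of the hand) fused into a single 36-dimensional feature
# vector per time step, which a deep classifier would then consume.
SENSORS = ["thumb", "index", "middle", "ring", "little", "back_of_hand"]

def fuse_frame(frame):
    """frame: dict sensor_name -> [ax, ay, az, gx, gy, gz].
    Returns one fused feature vector for this time step."""
    fused = []
    for name in SENSORS:
        channels = frame[name]
        assert len(channels) == 6, "each IMU supplies 6 channels"
        fused.extend(channels)
    return fused
```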
Integration of a Spanish-to-LSE machine translation system into an e-learning platform
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-21657-2_61
This paper presents the first results of the integration of a Spanish-to-LSE Machine Translation (MT) system into an e-learning platform. Most e-learning platforms provide speech-based contents, which makes them inaccessible to the Deaf. To solve this issue, we have developed a MT system that translates Spanish speech-based contents into LSE.
To test our MT system, we have integrated it into an e-learning tool. The e-learning tool sends the audio to our platform. The platform sends back the subtitles and a video stream with the signed translation to the e-learning tool.
Preliminary results, evaluating the sign language synthesis module, show an isolated sign recognition accuracy of 97%; the sentence recognition accuracy was 93%.
The authors would like to acknowledge the FPU-UAM grant program for its financial support. The authors are grateful to the FCNSE linguistic department for sharing their knowledge of LSE and performing the evaluations. Many thanks go to María Chulvi and Benjamín Nogal for providing help during the implementation of this system. This work was partially supported by the Telefónica Móviles España S.A. project number 10-047158-TE-Ed-01-1
Hand gesture recognition using Kinect.
Hand gesture recognition (HGR) is an important research topic because some situations require silent communication with sign languages. Computational HGR systems assist silent communication and help people learn a sign language. In this thesis, a novel method for contact-less HGR using Microsoft Kinect for Xbox is described, and a real-time HGR system is implemented with Microsoft Visual Studio 2010. Two different scenarios for HGR are provided: the Popular Gesture with nine gestures, and the Numbers with nine gestures. The system allows the users to select a scenario; it is able to detect hand gestures made by users, to identify fingers, to recognize the meanings of gestures, and to display the meanings and pictures on screen. The accuracy of the HGR system is from 84% to 99% with single hand gestures, and from 90% to 100% if both hands perform the same gesture at the same time. Because the depth sensor of Kinect is an infrared camera, the lighting conditions, signers' skin colors and clothing, and background have little impact on the performance of this system. The accuracy and the robustness make this system a versatile component that can be integrated into a variety of applications in daily life
Upper body pose recognition and estimation towards the translation of South African sign language
Masters of Science
Recognising and estimating gestures is a fundamental aspect of translating from a sign language to a spoken language. It is a challenging problem and, at the same time, a growing phenomenon in Computer Vision. This thesis presents two approaches, an example-based and a learning-based approach, for performing integrated detection, segmentation and 3D estimation of the human upper body from a single camera view. It investigates whether an upper body pose can be estimated from a database of exemplars with labelled poses. It also investigates whether an upper body pose can be estimated using skin feature extraction, Support Vector Machines (SVM) and a 3D human body model. The example-based and learning-based approaches obtained success rates of 64% and 88%, respectively. An analysis of the two approaches has shown that, although the learning-based system generally performs better than the example-based system, both approaches are suitable to recognise and estimate upper body poses in a South African sign language recognition and translation system.
Spelling it out: Real-time ASL fingerspelling recognition
This article presents an interactive hand shape recognition user interface for American Sign Language (ASL) finger-spelling. The system makes use of a Microsoft Kinect device to collect appearance and depth images, and of the OpenNI+NITE framework for hand detection and tracking. Hand shapes corresponding to letters of the alphabet are characterized using appearance and depth images and classified using random forests. We compare classification using appearance and depth images, show that a combination of both leads to the best results, and validate on a dataset of four different users. This hand shape detection works in real time and is integrated into an interactive user interface that allows the signer to select between ambiguous detections, and is integrated with an English dictionary for efficient writing
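The abstract reports that combining the appearance and depth modalities beats either alone, but does not state the fusion rule; a simple late fusion by weighted score averaging, as sketched below, is one common assumption.

```python
# Hedged sketch: late fusion of per-letter scores from an
# appearance-based and a depth-based classifier (e.g. the two random
# forests). Equal weighting is an assumption, not the article's rule.
def fuse_scores(appearance_scores, depth_scores, weight=0.5):
    """Each argument maps letter -> score from one modality.
    Returns the letter with the highest weighted average score."""
    letters = set(appearance_scores) | set(depth_scores)
    def combined(letter):
        a = appearance_scores.get(letter, 0.0)
        d = depth_scores.get(letter, 0.0)
        return weight * a + (1.0 - weight) * d
    return max(letters, key=combined)
```

In an interactive interface like the one described, the runner-up letters under this combined score could also be offered to the signer as the "ambiguous detections" to choose between.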
DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation
There is an undeniable communication barrier between deaf people and people
with normal hearing ability. Although innovations in sign language translation
technology aim to tear down this communication barrier, the majority of
existing sign language translation systems are either intrusive or constrained
by resolution or ambient lighting conditions. Moreover, these existing systems
can only perform single-sign ASL translation rather than sentence-level
translation, making them much less useful in daily-life communication
scenarios. In this work, we fill this critical gap by presenting DeepASL, a
transformative deep learning-based sign language translation technology that
enables ubiquitous and non-intrusive American Sign Language (ASL) translation
at both word and sentence levels. DeepASL uses infrared light as its sensing
mechanism to non-intrusively capture the ASL signs. It incorporates a novel
hierarchical bidirectional deep recurrent neural network (HB-RNN) and a
probabilistic framework based on Connectionist Temporal Classification (CTC)
for word-level and sentence-level ASL translation respectively. To evaluate its
performance, we have collected 7,306 samples from 11 participants, covering 56
commonly used ASL words and 100 ASL sentences. DeepASL achieves an average
94.5% word-level translation accuracy and an average 8.2% word error rate on
translating unseen ASL sentences. Given its promising performance, we believe
DeepASL represents a significant step towards breaking the communication
barrier between deaf people and the hearing majority, and thus has significant
potential to fundamentally change deaf people's lives
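At the heart of the Connectionist Temporal Classification (CTC) framework that DeepASL uses for sentence-level translation is a collapse rule: per-frame predictions are merged and blanks removed, so no frame-level segmentation of the sign stream is needed. The sketch below shows only that standard collapse step; the word labels are illustrative, and DeepASL's HB-RNN and probabilistic decoding are not reproduced here.

```python
# Sketch of the standard CTC collapse rule on a per-frame label
# sequence: first merge consecutive duplicates, then drop blanks.
BLANK = "<blank>"

def ctc_collapse(frame_labels):
    """Collapse frame-level labels into a word-level sequence."""
    merged = []
    for label in frame_labels:
        if not merged or merged[-1] != label:
            merged.append(label)
    return [label for label in merged if label != BLANK]
```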
NEW shared & interconnected ASL resources: SignStream® 3 Software; DAI 2 for web access to linguistically annotated video corpora; and a sign bank
2017 marked the release of a new version of SignStream® software, designed to facilitate linguistic analysis of ASL video. SignStream® provides an intuitive interface for labeling and time-aligning manual and non-manual components of the signing. Version 3 has many new features. For example, it enables representation of morpho-phonological information, including display of handshapes. An expanding ASL video corpus, annotated through use of SignStream®, is shared publicly on the Web. This corpus (video plus annotations) is Web-accessible—browsable, searchable, and downloadable—thanks to a new, improved version of our Data Access Interface: DAI 2. DAI 2 also offers Web access to a brand new Sign Bank, containing about 10,000 examples of about 3,000 distinct signs, as produced by up to 9 different ASL signers. This Sign Bank is also directly accessible from within SignStream®, thereby boosting the efficiency and consistency of annotation; new items can also be added to the Sign Bank. Soon to be integrated into SignStream® 3 and DAI 2 are visualizations of computer-generated analyses of the video: graphical display of eyebrow height, eye aperture, an
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term require an
understanding of the dynamics of symbol systems, and this is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory--motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.
Comment: submitted to Advanced Robotics