1,624 research outputs found

    Multimodal Based Audio-Visual Speech Recognition for Hard-of-Hearing: State of the Art Techniques and Challenges

    Multimodal Integration (MI) is the study of merging the knowledge acquired by the nervous system through sensory modalities such as speech, vision, touch, and gesture. Applications of MI span Audio-Visual Speech Recognition (AVSR), Sign Language Recognition (SLR), Emotion Recognition (ER), Biometrics Applications (BMA), Affect Recognition (AR), Multimedia Retrieval (MR), and others. Fusions of modalities, such as hand gesture with face or lip with hand position, are the combinations mainly used in developing multimodal systems for the hearing impaired. This paper gives an overview of the multimodal systems available in the literature on hearing-impaired studies, and also discusses some studies on acoustic analysis related to the hearing impaired. Very few algorithms have been developed for hearing-impaired AVSR compared with normal-hearing AVSR. Audio-visual speech recognition systems for the hearing impaired are therefore in high demand among people who are trying to communicate in natively spoken languages. The paper also highlights state-of-the-art techniques in AVSR and the challenges researchers face in developing AVSR systems.

    Synchronizing Keyframe Facial Animation to Multiple Text-to-Speech Engines and Natural Voice with Fast Response Time

    This thesis aims to create an automated lip-synchronization system for real-time applications. Specifically, the system is required to be fast, to use a limited number of keyframes with small memory requirements, and to create fluid and believable animations that synchronize with text-to-speech engines as well as raw voice data. The algorithms combine traditional keyframe animation with a novel method of keyframe selection. Phoneme-to-keyframe mapping, synchronization, and simple blending rules are also employed. The algorithms blend between keyframe images, borrow information from neighboring phonemes, accentuate the phonemes b, p, and m, differentiate between keyframes for phonemes with allophonic variations, and provide prosodic variation by including emotion while speaking. The lip-sync animation synchronizes with multiple synthesized voices and with human speech. A fast and versatile online real-time Java chat interface was created to exhibit vivid facial animation. Results show that the animation algorithms are fast and produce accurate lip-synchronization. Surveys also showed that the animations are visually pleasing and improve speech understandability 96% of the time. Applications for this project include internet chat, interactive teaching of foreign languages, animated news broadcasting, enhanced game technology, and cell phone messaging.

    Enabling audio-haptics

    This thesis deals with possible solutions to facilitate orientation, navigation and overview of non-visual interfaces and virtual environments with the help of sound in combination with force-feedback haptics. Applications with haptic force-feedback, s

    Instructional eLearning technologies for the vision impaired

    The principal sensory modality employed in learning is vision. This makes it harder for vision impaired students to access not only existing educational media but also the new and mostly visiocentric learning materials offered through on-line delivery mechanisms. Using the Cisco Certified Network Associate (CCNA) and IT Essentials courses as a reference, a study has been made of tools that can access such on-line systems and transcribe the materials into a form suitable for vision impaired learning. The modalities employed included haptic, tactile, audio, and descriptive text. The study demonstrates how such a multi-modal approach can achieve equivalent success for the vision impaired. However, it also shows the limits of the current understanding of human perception, especially with respect to comprehending two- and three-dimensional objects and spaces when there is no recourse to vision.

    On the design of visual feedback for the rehabilitation of hearing-impaired speech

    Multimedia mathematics intervention for math-delayed middle school students

    The purpose of this study is to determine whether the technology-mediated mathematics instructional practices of the Sharpening Math Skills Lab have positive effects on the mathematics achievement and spatial visualization ability of math-delayed middle school students, and to gauge student engagement in learning, implementation of the principles of instructional design, and attitudes toward mathematics instruction. A recent meta-analysis reports effect sizes ranging from significantly positive to significantly negative, which establishes a need for further evaluation of academic achievement in technology-mediated mathematics programs at the middle school level (Slavin, Lake, & Groff, 2007). The literature (Moreno & Mayer, 2000) also suggests examining the principles of multimedia instructional design as they relate to programs such as those used in the Sharpening Math Skills Lab. The literature has likewise established the need to test for relationships between student spatial visualization and problem solving ability (Wheatley, 1991), student attitudes and motivation toward mathematics (Tapia & Marsh, 2004), and students’ behavior while engaged in multimedia learning activities. This quasi-experimental study compares the academic achievement of 109 southwest Louisiana 6th, 7th, and 8th grade students in one school, who participated in a treatment program of technology-mediated remedial math instruction, with that of 162 6th, 7th, and 8th grade students from two other schools in the same district, who received traditional classroom mathematics instruction. The experimental group attended the Sharpening Math Skills Lab for 45 minutes per day, using FASTTMath software and iSucceed software with individual assistance provided by the lab facilitator and math teacher. Measurement instruments include Scantron Performance and Achievement Series tests, the Wheatley Spatial Ability Test (WSAT) (1996), and the Attitudes Toward Math Survey (ATMI) (Tapia, 1996).
    Qualitative data about the experimental group, including levels of engagement and the effectiveness of the instructional design of the software used, were also gathered. Positive outcomes of the study include “best practices” recommendations for remedial mathematics instruction of math-delayed middle school students. The data accumulated in the study contribute to the body of evidence on the usefulness of technology-based remediation practices and provide important information to school officials for curricular and budgetary decisions.

    Proceedings of the 5th international conference on disability, virtual reality and associated technologies (ICDVRAT 2004)

    The proceedings of the conferenc

    Proceedings of the 6th international conference on disability, virtual reality and associated technologies (ICDVRAT 2006)

    The proceedings of the conferenc