3,536 research outputs found

    A new multi-modal database for developing speech recognition systems for an assistive technology application

    In this paper we report on the acquisition and content of a new database intended for developing audio-visual speech recognition systems. This database supports a speaker-dependent continuous speech recognition task, based on a small vocabulary, and was captured in the European Portuguese language. Along with the collected multi-modal speech materials, the respective orthographic transcription and time-alignment files are supplied. The package also includes data on stochastic language models and the generative grammar associated with the collected spoken sentences. The application addressed by this database, which consists of voice control of a basic scientific calculator, has the particularity of being designed for a person with a specific motor impairment, namely muscular dystrophy. This specificity is a remarkable characteristic, given the lack of such data resources for developing assistive systems based on audio-visual speech recognition technology.

    Affective Medicine: a review of Affective Computing efforts in Medical Informatics

    Background: Affective computing (AC) is concerned with emotional interactions performed with and through computers. It is defined as “computing that relates to, arises from, or deliberately influences emotions”. AC enables investigation and understanding of the relation between human emotions and health as well as application of assistive and useful technologies in the medical domain. Objectives: 1) To review the general state of the art in AC and its applications in medicine, and 2) to establish synergies between the research communities of AC and medical informatics. Methods: Aspects related to the human affective state as a determinant of human health are discussed, coupled with an illustration of significant AC research and related literature output. Moreover, affective communication channels are described and their range of application fields is explored through illustrative examples. Results: The presented conferences, European research projects and research publications illustrate the recent increase of interest in the AC area by the medical community. Tele-home healthcare, AmI, ubiquitous monitoring, e-learning and virtual communities with emotionally expressive characters for elderly or impaired people are a few areas where the potential of AC has been realized and applications have emerged. Conclusions: A number of gaps can potentially be overcome through the synergy of AC and medical informatics. The application of AC technologies parallels the advancement of the existing state of the art and the introduction of new methods. The amount of work and projects reviewed in this paper bears witness to an ambitious and optimistic synergetic future for the affective medicine field.

    Multimodal Based Audio-Visual Speech Recognition for Hard-of-Hearing: State of the Art Techniques and Challenges

    Multimodal Integration (MI) is the study of merging the knowledge acquired by the nervous system using sensory modalities such as speech, vision, touch, and gesture. The applications of MI extend over the areas of Audio-Visual Speech Recognition (AVSR), Sign Language Recognition (SLR), Emotion Recognition (ER), Bio Metrics Applications (BMA), Affect Recognition (AR), Multimedia Retrieval (MR), etc. Fused modalities such as hand gesture-facial and lip-hand position are the sensory modalities mainly used for developing multimodal systems for the hearing impaired. This paper gives an overview of the multimodal systems available in the literature on hearing-impaired studies. It also discusses some of the studies related to hearing-impaired acoustic analysis. It is observed that very few algorithms have been developed for hearing-impaired AVSR compared with normal hearing. Thus, the study of audio-visual speech recognition systems for the hearing impaired is in high demand for people who are trying to communicate in natively spoken languages. This paper also highlights the state-of-the-art techniques in AVSR and the challenges researchers face in developing AVSR systems.

    Developing e-Examination Voice interface for Visually Impaired students in Open and Distance Learning Context

    Voice-based systems allow users access to information on the internet over a voice interface. Prior studies on Open and Distance Learning (ODL) e-examination systems that make use of a voice interface do not sufficiently exhibit an intelligent form of assessment, which diminishes the rigour of examination. The objective of this paper is to improve on the achievements of previous studies by providing a framework that will guide the development of a voice-based e-examination expert system for visually impaired students in ODL. The study employs a combination of technologies such as system design, server-side scripting, voice-based system development, data management and rule-based reasoning in developing the system. The system was evaluated to determine its level of usability. The results of the usability evaluation showed that the developed application has an ‘average usability’ rating of 3.48 on a 5-point scale. The findings show that the voice-based e-examination system will not only be of immense benefit to visually impaired students in ODL irrespective of distance, but will also complement the existing web-based method for online examination.

    AI and Non AI Assessments for Dementia

    Current progress in the artificial intelligence domain has led to the development of various types of AI-powered dementia assessments, which can be employed to identify patients at the early stage of dementia. These assessments can revolutionize dementia care settings. It is essential that the medical community be aware of various AI assessments and choose them considering their degrees of validity, efficiency, practicality, reliability, and accuracy concerning the early identification of patients with dementia (PwD). On the other hand, AI developers should be informed about various non-AI assessments as well as recently developed AI assessments. Thus, this paper, which is intended to be readable by both clinicians and AI engineers, fills the gap in the literature by explaining the existing solutions for the recognition of dementia to clinicians, as well as the techniques used and the most widespread dementia datasets to AI engineers. It follows a review of papers on AI and non-AI assessments for dementia to provide valuable information about various dementia assessments for both the AI and medical communities. The discussion and conclusion highlight the most prominent research directions and the maturity of existing solutions. Comment: 49 page

    Future bathroom: A study of user-centred design principles affecting usability, safety and satisfaction in bathrooms for people living with disabilities

    Research and development work relating to assistive technology 2010-11 (Department of Health) Presented to Parliament pursuant to Section 22 of the Chronically Sick and Disabled Persons Act 197

    Multimodal music information processing and retrieval: survey and future challenges

    To improve performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application it addresses. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.

    Multimodal interface for an intelligent wheelchair

    Integrated master's thesis. Informatics and Computing Engineering. Universidade do Porto. Faculdade de Engenharia. 201