
    SYSTEM AND METHOD FOR SPEECH RECOGNITION

    The present disclosure is directed to a system and method for speech recognition. A user provides a speech input to a computing device, which recognizes and processes it. If explicitly authorized by the user, the computing device can receive the speech input via one or more sensors that detect the position, movement, etc. of the user’s lips and, in some aspects, tongue. The one or more sensors can, for example, be contactless (proximity-detection) sensors and/or touch sensors. The speech input can (but need not) be audible speech of the user that is also received by a microphone of the computing device. Further areas of applicability of the present disclosure will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the disclosure.

    A real-time statistical time-series analyzer

    The device extracts the average frequency of human speech and produces the second, third, and fourth moments of instantaneous frequency about this average. It operates on an electrical time representation of the input signal, performs statistical analysis on the zero-crossings of almost any signal, and does not require specialized personnel to operate.
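The zero-crossing analysis this abstract describes can be sketched in a few lines: estimate instantaneous frequency from the spacing of successive zero crossings, then take central moments about the mean. This is a minimal illustrative sketch, not the device's actual circuitry or algorithm; the function name and parameters are assumptions.

```python
import numpy as np

def zero_crossing_moments(signal, sample_rate):
    """Estimate mean instantaneous frequency from zero crossings and
    return its second, third, and fourth central moments."""
    # Indices where the signal changes sign (zero crossings).
    signs = np.signbit(signal)
    crossings = np.nonzero(signs[1:] != signs[:-1])[0]
    # Each crossing-to-crossing interval is half a period, so the
    # instantaneous frequency is 1 / (2 * interval).
    intervals = np.diff(crossings) / sample_rate
    inst_freq = 1.0 / (2.0 * intervals)
    mean_freq = inst_freq.mean()
    deviations = inst_freq - mean_freq
    moments = {k: np.mean(deviations ** k) for k in (2, 3, 4)}
    return mean_freq, moments

# A pure 100 Hz tone should yield a mean near 100 Hz and small moments.
t = np.arange(0, 1, 1 / 8000)
mean_freq, moments = zero_crossing_moments(np.sin(2 * np.pi * 100 * t), 8000)
```

For real speech one would first band-limit the signal, since broadband noise adds spurious crossings.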

    Human Robot Interface for Assistive Grasping

    This work describes a new human-in-the-loop (HitL) assistive grasping system for individuals with varying levels of physical capabilities. We investigated the feasibility of using four potential input devices with our assistive grasping system interface, using able-bodied individuals to define a set of quantitative metrics that could be used to assess an assistive grasping system. We then took these measurements and created a generalized benchmark for evaluating the effectiveness of any arbitrary input device for a HitL grasping system. The four input devices were a mouse, a speech recognition device, an assistive switch, and a novel sEMG device developed by our group that was connected either to the forearm or behind the ear of the subject. These preliminary results provide insight into how different interface devices perform for generalized assistive grasping tasks and also highlight the potential of sEMG-based control for severely disabled individuals. Comment: 8 pages, 21 figures

    The "Tiepstem" : an experimental Dutch keyboard-to-speech system for the speech impaired

    An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in pseudo-phonetic notation. Intonation contours using a declination line and various rises and falls are generated starting from an input consisting of punctuation and accent marks. The hardware design has resulted in a small, portable and battery-powered device. A short evaluation with users has been carried out, which has shown possibilities for such a device but has also indicated some problems with the current pseudo-phonetic input.
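The declination-plus-accents scheme in this abstract can be illustrated with a toy F0 contour generator: a straight declination line from a starting pitch down to a final pitch, with a rise-fall added at each accented frame. All parameter values here (pitch range, accent height and width) are invented for illustration and are not the Tiepstem's actual settings.

```python
def intonation_contour(n_frames, start_hz=130.0, end_hz=90.0, accents=()):
    """Sketch of an F0 contour: a linear declination line from start_hz
    to end_hz, with a triangular rise-fall added at each accent frame."""
    contour = []
    for i in range(n_frames):
        # Baseline declination: linear interpolation over the utterance.
        f0 = start_hz + (end_hz - start_hz) * i / max(n_frames - 1, 1)
        for a in accents:
            # Triangular rise-fall: 20 Hz peak, 10-frame half-width.
            f0 += max(0.0, 20.0 * (1.0 - abs(i - a) / 10.0))
        contour.append(f0)
    return contour

# A 50-frame utterance with one accent in the middle.
c = intonation_contour(50, accents=[25])
```

A real synthesizer would then feed such a contour, frame by frame, to the formant chip alongside the diphone parameters.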

    Simultaneous multimodal user interface

    This disclosure describes techniques to use a combination of voice-based input with other input mechanisms in human-computer interfaces. The techniques enable users to utilize, in real time, the suitable input mode(s) in a given context, without having to switch between input modes. With user permission, speech analysis techniques are utilized to analyze user speech, detect when the speech includes user instructions, and determine corresponding actions to be performed. By enabling simultaneous user input via multiple modes, the techniques facilitate effective navigation of complex tasks performed using a computing device.
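The core idea — consuming a merged stream of input events and deciding per speech event whether it is an instruction or ordinary content — can be sketched as a simple router. The event shapes, mode names, and command table below are illustrative assumptions, not the disclosure's actual design.

```python
def route_events(events, command_phrases):
    """Route a time-ordered stream of (mode, payload) input events.
    Speech payloads matching a known instruction become commands;
    other speech is treated as dictation; other modes pass through."""
    actions = []
    for mode, payload in events:
        if mode == "speech" and payload in command_phrases:
            actions.append(("command", command_phrases[payload]))
        elif mode == "speech":
            actions.append(("dictate", payload))
        else:
            actions.append((mode, payload))
    return actions

# A user taps a button, issues a spoken command, then dictates text.
stream = [("touch", "tap:send_button"),
          ("speech", "make it bold"),
          ("speech", "hello world")]
acts = route_events(stream, {"make it bold": "format_bold"})
```

The point of the sketch is that no mode switch is needed: touch and speech events interleave in one stream, and only recognized instruction phrases trigger actions.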

    Providing Real-Time Captured Information Input to a Machine-Learned Model to Improve Query Services

    This publication describes techniques and methods that a computing device uses to provide improved query services (e.g., autofill suggestions, speech biasing for automatic speech recognition) to applications on the computing device. To this end, an information collector on the computing device collects application activity information, information displayed on a display, and event information. This collected information can be provided as input to a machine-learned model implemented on the computing device. In response to the received input, the machine-learned model can classify the collected information to determine relevant attributes (e.g., keywords, searched locations, names) and make suggestions for utilization by query services provided by the computing device. Through these techniques and methods, user privacy is maintained, less power is consumed by the computing device, and the resources of the computing device (e.g., memory) are conserved.

    Incorporating Device Context In Natural Language Understanding

    Automatic speech recognition (ASR) models are used to recognize user commands or queries in products such as smartphones, smart speakers/displays, and other products that enable speech interaction. Automatic speech recognition is a complex problem that requires correct processing of the acoustic and semantic signals from the voice input. Natural language understanding (NLU) systems sometimes fail to correctly interpret utterances that are associated with multiple possible intents. Per techniques described herein, device context features, such as the identity of the foreground application, are utilized to disambiguate the intent of a voice query. Incorporating device context as input to NLU models improves their ability to correctly interpret utterances with ambiguous intent.
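One simple way to realize the idea above is to treat device context as a prior that reweights the NLU model's intent scores. The sketch below is illustrative only: the function, intent names, and bias table are invented for the example, and a production NLU model would learn these weights rather than look them up.

```python
def disambiguate_intent(utterance_scores, foreground_app, app_intent_bias):
    """Combine NLU intent scores with a device-context prior.

    utterance_scores: intent -> score from the NLU model alone.
    app_intent_bias: foreground app -> {intent: prior weight}.
    Returns the highest-scoring intent after reweighting.
    """
    bias = app_intent_bias.get(foreground_app, {})
    combined = {intent: score * bias.get(intent, 1.0)
                for intent, score in utterance_scores.items()}
    return max(combined, key=combined.get)

# "play hamilton" is ambiguous between a music track and a movie;
# the foreground app tips the balance.
scores = {"play_music": 0.48, "play_movie": 0.46}
bias = {"video_app": {"play_movie": 2.0}, "music_app": {"play_music": 2.0}}
```

With a video app in the foreground, the movie reading wins; with a music app, the music reading does.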

    Investigating the success factors of expert users to inform device development

    Objective: Expert user testing is a well-recognised tool within user experience and human-computer interaction design. Within the domain of assistive technology device design, however, this technique seems to be little used. It is suggested that studying the success factors of expert assistive technology device users may provide a valuable source of data to inform the development of assistive technology devices. This paper presents an example of this technique, within the context of a number of studies carried out by the authors, using preliminary data from a study informing the development of an innovative Augmentative and Alternative Communication (AAC) device.
    Main Content: The paper presents a qualitative study whose objective was to influence the design and further development of an innovative voice-input voice-output communication aid (Vivoca) which has previously reached proof-of-concept stage. The Vivoca device is designed for people with dysarthria, and this dictates a number of specific constraints and considerations. In order to understand how Vivoca could be designed to be used successfully by people with dysarthria, this study aimed to identify the factors associated with expert users' successful use of current AAC devices. In order to allow comparison, the study included users with some understandable speech and also those with no understandable speech. The study procedure was designed to provide a profile of participants' communication methods and to identify the factors that participants felt made their communication successful.
    Results: Preliminary results from the study (currently underway) are presented, including a qualitative analysis of interview data, and data profiling participants' communication methods and context. Initial data has highlighted the very specific requirements for a communication aid designed for people with some understandable speech.
    Conclusion: Study of expert users may provide an effective tool to help inform assistive technology device development.

    COMPARISON OF SPEECH CODING ALGORITHMS FOR TOTAL LARYNGECTOMIES

    An electrolarynx is used as a noninvasive supporting device for speech restoration in people who have undergone resection of the larynx. This work aims to develop a signal-processing method to neutralize the mechanical vibration noise of the device. We investigate the effect of this noise on the speech signal and analyze the performance of various algorithms in a single-input system to minimize this noise.
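Because the electrolarynx hum is roughly stationary and only one microphone signal is available, a classic single-input approach is spectral subtraction: estimate the hum's magnitude spectrum from a noise-only segment and subtract it frame by frame. This is a generic textbook sketch, not the paper's specific algorithm; frame size and the rectangular framing are simplifying assumptions.

```python
import numpy as np

def spectral_subtraction(noisy, noise_ref, frame=256):
    """Minimal single-input spectral subtraction: subtract an averaged
    noise magnitude spectrum from each frame of the noisy signal and
    resynthesize using the noisy phase."""
    # Average magnitude spectrum over whole frames of a noise-only segment.
    noise_frames = noise_ref[:len(noise_ref) // frame * frame]
    noise_mag = np.abs(
        np.fft.rfft(noise_frames.reshape(-1, frame), axis=1)).mean(axis=0)
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame + 1, frame):
        spec = np.fft.rfft(noisy[start:start + frame])
        # Subtract the noise magnitude, flooring at zero to avoid
        # negative magnitudes ("musical noise" is left unhandled here).
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[start:start + frame] = np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), frame)
    return out

# Sanity check: a frame-periodic hum subtracted against itself
# should be removed almost entirely.
hum = np.sin(2 * np.pi * 8 * np.arange(2048) / 256)
cleaned = spectral_subtraction(hum, hum)
```

A practical implementation would add overlapping windows and an over-subtraction factor to suppress residual musical noise.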

    Voice Signal Translation on Wireless-Communication Devices for the Speech and/or Hearing Impaired

    Hearing impaired and/or speech impaired individuals can experience difficulties in communicating with other people when utilizing voice transmission on wireless-communication devices, such as smartphones. This publication describes techniques for aiding hearing impaired and/or speech impaired individuals in communicating during phone calls. The techniques include a translator application on a wireless-communication device that receives incoming voice signals, analyzes the voice signals, and generates a display (e.g., digital text, sign language) on the user interface (UI) of the wireless-communication device for the user’s visualization. Moreover, this publication describes techniques for aiding hearing impaired and/or speech impaired individuals, through use of a translator application on a wireless-communication device, to communicate with another individual utilizing voice transmissions. For instance, the user can provide an input (e.g., typing a message utilizing an input device connected to the wireless-communication device, or expressing language via gestures (e.g., sign language) utilizing a video camera of the wireless-communication device) that is received by the translator application and converted to a voice signal, which is then transmitted to the other individual. Additionally, the translator application can translate between different languages.