5,182 research outputs found

    A Comparison of Voice Controlled and Mouse Controlled Web Browsing (2000)

    Get PDF
    Voice controlled web browsers allow users to navigate by speaking the text of a link or an associated number instead of clicking with a mouse. One such browser is Conversa, by Conversational Computing. This within subjects study with 18 subjects compared voice browsing with traditional mouse-based browsing. It attempted to identify which of three common hypertext forms (linear slide show, grid/tiled map, and hierarchical menu) are well suited to voice navigation, and whether voice navigation is helped by numbering links. The study shows that voice control adds approximately 50 percent to the performance time for certain types of tasks. Subjective satisfaction measures indicate that for voice browsing, textual links are preferable to numbered links

    A Voice Controlled E-Commerce Web Application

    Full text link
    Automatic voice-controlled systems have changed the way humans interact with a computer. Voice or speech recognition systems allow a user to make a hands-free request to the computer, which in turn processes the request and serves the user with appropriate responses. After years of research and developments in machine learning and artificial intelligence, today voice-controlled technologies have become more efficient and are widely applied in many domains to enable and improve human-to-human and human-to-computer interactions. The state-of-the-art e-commerce applications with the help of web technologies offer interactive and user-friendly interfaces. However, there are some instances where people, especially with visual disabilities, are not able to fully experience the serviceability of such applications. A voice-controlled system embedded in a web application can enhance user experience and can provide voice as a means to control the functionality of e-commerce websites. In this paper, we propose a taxonomy of speech recognition systems (SRS) and present a voice-controlled commodity purchase e-commerce application using IBM Watson speech-to-text to demonstrate its usability. The prototype can be extended to other application scenarios such as government service kiosks and enable analytics of the converted text data for scenarios such as medical diagnosis at the clinics.Comment: 7 page

    INNOVATING CONTROL AND EMOTIONAL EXPRESSIVE MODALITIES OF USER INTERFACES FOR PEOPLE WITH LOCKED-IN SYNDROME

    Get PDF
    Patients with Lock-In-Syndrome (LIS) lost their ability to control any body part beside their eyes. Current solutions mainly use eye-tracking cameras to track patients' gaze as system input. However, despite the fact that interface design greatly impacts user experience, only a few guidelines have been were proposed so far to insure an easy, quick, fluid and non-tiresome computer system for these patients. On the other hand, the emergence of dedicated computer software has been greatly increasing the patients' capabilities, but there is still a great need for improvements as existing systems still present low usability and limited capabilities. Most interfaces designed for LIS patients aim at providing internet browsing or communication abilities. State of the art augmentative and alternative communication systems mainly focus on sentences communication without considering the need for emotional expression inextricable from human communication. This thesis aims at exploring new system control and expressive modalities for people with LIS. Firstly, existing gaze-based web-browsing interfaces were investigated. Page analysis and high mental workload appeared as recurring issues with common systems. To address this issue, a novel user interface was designed and evaluated against a commercial system. The results suggested that it is easier to learn and to use, quicker, more satisfying, less frustrating, less tiring and less prone to error. Mental workload was greatly diminished with this system. Other types of system control for LIS patients were then investigated. It was found that galvanic skin response may be used as system input and that stress related bio-feedback helped lowering mental workload during stressful tasks. Improving communication was one of the main goal of this research and in particular emotional communication. A system including a gaze-controlled emotional voice synthesis and a personal emotional avatar was developed with this purpose. Assessment of the proposed system highlighted the enhanced capability to have dialogs more similar to normal ones, to express and to identify emotions. Enabling emotion communication in parallel to sentences was found to help with the conversation. Automatic emotion detection seemed to be the next step toward improving emotional communication. Several studies established that physiological signals relate to emotions. The ability to use physiological signals sensors with LIS patients and their non-invasiveness made them an ideal candidate for this study. One of the main difficulties of emotion detection is the collection of high intensity affect-related data. Studies in this field are currently mostly limited to laboratory investigations, using laboratory-induced emotions, and are rarely adapted for real-life applications. A virtual reality emotion elicitation technique based on appraisal theories was proposed here in order to study physiological signals of high intensity emotions in a real-life-like environment. While this solution successfully elicited positive and negative emotions, it did not elicit the desired emotions for all subject and was therefore, not appropriate for the goals of this research. Collecting emotions in the wild appeared as the best methodology toward emotion detection for real-life applications. The state of the art in the field was therefore reviewed and assessed using a specifically designed method for evaluating datasets collected for emotion recognition in real-life applications. The proposed evaluation method provides guidelines for future researcher in the field. Based on the research findings, a mobile application was developed for physiological and emotional data collection in the wild. Based on appraisal theory, this application provides guidance to users to provide valuable emotion labelling and help them differentiate moods from emotions. A sample dataset collected using this application was compared to one collected using a paper-based preliminary study. The dataset collected using the mobile application was found to provide a more valuable dataset with data consistent with literature. This mobile application was used to create an open-source affect-related physiological signals database. While the path toward emotion detection usable in real-life application is still long, we hope that the tools provided to the research community will represent a step toward achieving this goal in the future. Automatically detecting emotion could not only be used for LIS patients to communicate but also for total-LIS patients who have lost their ability to move their eyes. Indeed, giving the ability to family and caregiver to visualize and therefore understand the patients' emotional state could greatly improve their quality of life. This research provided tools to LIS patients and the scientific community to improve augmentative and alternative communication, technologies with better interfaces, emotion expression capabilities and real-life emotion detection. Emotion recognition methods for real-life applications could not only enhance health care but also robotics, domotics and many other fields of study. A complete system fully gaze-controlled was made available open-source with all the developed solutions for LIS patients. This is expected to enhance their daily lives by improving their communication and by facilitating the development of novel assistive systems capabilities

    Semi-aural Interfaces: Investigating Voice-controlled Aural Flows

    Get PDF
    To support mobile, eyes-free web browsing, users can listen to ‘playlists’ of web content— aural flows . Interacting with aural flows, however, requires users to select interface buttons, tethering visual attention to the mobile device even when it is unsafe (e.g. while walking). This research extends the interaction with aural flows through simulated voice commands as a way to reduce visual interaction. This paper presents the findings of a study with 20 participants who browsed aural flows either through a visual interface only or by augmenting it with voice commands. Results suggest that using voice commands reduced the time spent looking at the device by half but yielded similar system usability and cognitive effort ratings as using buttons. Overall, the low-cognitive effort engendered by aural flows, regardless of the interaction modality, allowed participants to do more non-instructed (e.g. looking at the surrounding environment) than instructed activities (e.g. focusing on the user interface)

    A Study of User's Performance and Satisfaction on the Web Based Photo Annotation with Speech Interaction

    Get PDF
    This paper reports on empirical evaluation study of users' performance and satisfaction with prototype of Web Based speech photo annotation with speech interaction. Participants involved consist of Johor Bahru citizens from various background. They have completed two parts of annotation task; part A involving PhotoASys; photo annotation system with proposed speech interaction and part B involving Microsoft Microsoft Vista Speech Interaction style. They have completed eight tasks for each part including system login and selection of album and photos. Users' performance was recorded using computer screen recording software. Data were captured on the task completion time and subjective satisfaction. Participants need to complete a questionnaire on the subjective satisfaction when the task was completed. The performance data show the comparison between proposed speech interaction and Microsoft Vista Speech interaction applied in photo annotation system, PhotoASys. On average, the reduction in annotation performance time due to using proposed speech interaction style was 64.72% rather than using speech interaction Microsoft Vista style. Data analysis were showed in different statistical significant in annotation performance and subjective satisfaction for both styles of interaction. These results could be used for the next design in related software which involves personal belonging management.Comment: IEEE Publication Format, https://sites.google.com/site/journalofcomputing

    Intuitive human-device interaction for video control and feedback

    Get PDF

    Visualization Based on Geographic Information in Augmented Reality

    Get PDF
    corecore