2,528 research outputs found

    The Impact of Word, Multiple Word, and Sentence Input on Virtual Keyboard Decoding Performance

    Entering text on non-desktop computing devices is often done via an onscreen virtual keyboard. Input on such keyboards normally consists of a sequence of noisy tap events that specify some amount of text, most commonly a single word. But is single word-at-a-time entry the best choice? This paper compares user performance and recognition accuracy of word-at-a-time, phrase-at-a-time, and sentence-at-a-time text entry on a smartwatch keyboard. We evaluate the impact of differing amounts of input in both text copy and free composition tasks. We found that providing input of an entire sentence significantly improved entry rates from 26 wpm to 32 wpm while keeping character error rates below 4%. In offline experiments with more processing power and memory, sentence input was recognized with a much lower 2.0% error rate. Our findings suggest virtual keyboards can enhance performance by encouraging users to provide more input per recognition event. This work was supported by Google Faculty awards (K.V. and P.O.K.).
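
    The entry-rate (words per minute) and character error rate figures above are the standard text-entry metrics. As a minimal sketch of how such numbers are conventionally computed, assuming the usual definition of one "word" as five characters and a character-level edit distance normalized by the reference length (illustrative only, not the paper's evaluation code):

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance via dynamic programming."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, start=1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,        # deletion
                        dp[j - 1] + 1,    # insertion
                        prev + (r != h))  # substitution or match
            prev = cur
    return dp[-1]

def entry_rate_wpm(text: str, seconds: float) -> float:
    """Entry rate in words per minute, counting one 'word' per 5 characters."""
    return (len(text) / 5.0) / (seconds / 60.0)

def character_error_rate(reference: str, transcribed: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    return edit_distance(reference, transcribed) / max(len(reference), 1)

# Example: a sentence entered in 11 seconds with one wrong character.
ref = "the quick brown fox jumps"
hyp = "the quick brown fox jumbs"
print(entry_rate_wpm(hyp, 11.0))       # ~27 wpm
print(character_error_rate(ref, hyp))  # 0.04 -> 4% CER
```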

    VelociWatch: Designing and evaluating a virtual keyboard for the input of challenging text

    © 2019 Association for Computing Machinery. Virtual keyboard typing is typically aided by an auto-correct method that decodes a user’s noisy taps into their intended text. This decoding process can reduce error rates and possibly increase entry rates by allowing users to type faster but less precisely. However, virtual keyboard decoders sometimes make mistakes that change a user’s desired word into another. This is particularly problematic for challenging text such as proper names. We investigate whether users can guess words that are likely to cause auto-correct problems and whether users can adjust their behavior to assist the decoder. We conduct computational experiments to decide what predictions to offer in a virtual keyboard and design a smartwatch keyboard named VelociWatch. Novice users were able to use the features of VelociWatch to enter challenging text at 17 words-per-minute with a corrected error rate of 3%. Interestingly, they wrote slightly faster and just as accurately on a simpler keyboard with limited correction options. Our findings suggest users may be able to type difficult words on a smartwatch simply by tapping precisely, without the use of auto-correct.
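
    The decoding problem described above, turning noisy taps into an intended word, is commonly framed as combining a touch likelihood with a language-model prior and taking the most probable word. The sketch below illustrates that general noisy-channel idea only; the one-dimensional key layout, Gaussian touch model, and tiny word list are assumptions made purely for the example and are not the VelociWatch decoder.

```python
import math

# Assumed 1-D key centers along the top row of a QWERTY layout (illustrative only).
KEY_X = {"q": 0, "w": 1, "e": 2, "r": 3, "t": 4,
         "y": 5, "u": 6, "i": 7, "o": 8, "p": 9}

# Tiny unigram "language model" over candidate words (made-up probabilities).
WORD_PRIOR = {"tip": 0.5, "top": 0.3, "tup": 0.2}

def tap_log_likelihood(taps, word, sigma=0.6):
    """Log-probability of tap x-positions under a Gaussian around each key center."""
    if len(taps) != len(word):
        return float("-inf")
    ll = 0.0
    for x, ch in zip(taps, word):
        mu = KEY_X[ch]
        ll += -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return ll

def decode(taps):
    """Return the word maximizing prior * likelihood (a noisy-channel argmax)."""
    return max(WORD_PRIOR,
               key=lambda w: math.log(WORD_PRIOR[w]) + tap_log_likelihood(taps, w))

# Three noisy taps: 't', somewhere between 'i' and 'o', then 'p'.
print(decode([4.1, 7.6, 9.2]))  # "tip" or "top", depending on prior vs. tap evidence
```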

    Fast and precise touch-based text entry for head-mounted augmented reality with variable occlusion

    We present the VISAR keyboard: an augmented reality (AR) head-mounted display (HMD) system that supports text entry via a virtualised input surface. Users select keys on the virtual keyboard by imitating the process of single-hand typing on a physical touchscreen display. Our system uses a statistical decoder to infer users’ intended text and to provide error-tolerant predictions. There is also a high-precision fall-back mechanism to support users in indicating which keys should be unmodified by the auto-correction process. A unique advantage of leveraging the well-established touch input paradigm is that our system enables text entry with minimal visual clutter on the see-through display, thus preserving the user’s field of view. We iteratively designed and evaluated our system and show that the final iteration supports a mean entry rate of 17.75 wpm with a mean character error rate of less than 1%. This performance represents a 19.6% improvement relative to the state-of-the-art baseline investigated: a gaze-then-gesture text entry technique derived from the system keyboard on the Microsoft HoloLens. Finally, we validate that the system is effective in supporting text entry in a fully mobile usage scenario likely to be encountered in industrial applications of AR HMDs. Per Ola Kristensson was supported in part by a Google Faculty research award and EPSRC grants EP/N010558/1 and EP/N014278/1. Keith Vertanen was supported in part by a Google Faculty research award. John Dudley was supported by the Trimble Fund.
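
    The fall-back mechanism mentioned above, letting the user flag keys that auto-correction must leave untouched, can be viewed as a hard constraint on the decoder's candidate words. The fragment below is an assumed, simplified illustration of that idea (the candidate list, the literal_positions argument, and the example word are hypothetical), not the VISAR implementation.

```python
def filter_candidates(candidates, typed, literal_positions):
    """Keep only candidate words that agree with the typed text at every
    position the user flagged as 'literal' (exempt from auto-correction)."""
    kept = []
    for word in candidates:
        if len(word) == len(typed) and all(
                word[i] == typed[i] for i in literal_positions):
            kept.append(word)
    return kept

# User typed "jensne", flagging positions 0 and 1 as precisely typed (hypothetical).
candidates = ["jensen", "jensne", "tensed"]
print(filter_candidates(candidates, "jensne", {0, 1}))  # ['jensen', 'jensne']
```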

    Adapting End-to-End Speech Recognition for Readable Subtitles

    Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. Therefore, this work focuses on ASR with output compression, a task challenging for supervised approaches due to the scarcity of training data. We first investigate a cascaded system, where an unsupervised compression model is used to post-edit the transcribed speech. We then compare several methods of end-to-end speech recognition under output length constraints. The experiments show that with limited data, far less than needed to train a model from scratch, we can adapt a Transformer-based ASR model to incorporate both transcription and compression capabilities. Furthermore, the best performance in terms of WER and ROUGE scores is achieved by explicitly modeling the length constraints within the end-to-end ASR system. Comment: IWSLT 202
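
    The two metrics cited above serve different goals: WER is the word-level analogue of the character error rate sketched earlier (an edit distance over words), while ROUGE rewards n-gram overlap with the compressed reference and so better reflects the compression objective. Below is a minimal sketch of ROUGE-1 precision, recall, and F1 under simple whitespace tokenization; a real evaluation would normally use an established ROUGE implementation.

```python
from collections import Counter

def rouge_1(reference: str, hypothesis: str):
    """ROUGE-1: unigram overlap between a hypothesis and a reference.
    Returns (precision, recall, F1). Simplified: whitespace tokens, no stemming."""
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hypothesis.lower().split())
    overlap = sum((ref_counts & hyp_counts).values())  # clipped unigram matches
    precision = overlap / max(sum(hyp_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

# Compressed reference subtitle vs. a longer verbatim transcription.
ref = "the train to london leaves at noon"
hyp = "uh the train to london it leaves at noon today"
print(rouge_1(ref, hyp))  # high recall, lower precision
```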

    TapGazer: Text Entry with finger tapping and gaze-directed word selection
