The Impact of Word, Multiple Word, and Sentence Input on Virtual Keyboard Decoding Performance
Entering text on non-desktop computing devices is often done via an onscreen virtual keyboard. Input on such keyboards normally consists of a sequence of noisy tap events that specify some amount of text, most commonly a single word. But is single word-at-a-time entry the best choice? This paper compares user performance and recognition accuracy of word-at-a-time, phrase-at-a-time, and sentence-at-a-time text entry on a smartwatch keyboard. We evaluate the impact of differing amounts of input in both text copy and free composition tasks. We found that providing input of an entire sentence significantly improved entry rates from 26 wpm to 32 wpm while keeping character error rates below 4%. In offline experiments with more processing power and memory, sentence input was recognized with a much lower 2.0% error rate. Our findings suggest virtual keyboards can enhance performance by encouraging users to provide more input per recognition event. This work was supported by Google Faculty awards (K.V. and P.O.K.).
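
To make the decoding setting concrete, here is a minimal sketch of scoring candidate words for a sequence of noisy taps, assuming a Gaussian tap model around key centers and a unigram word prior. This is not the paper's decoder; the key coordinates, word prior, and noise parameter are hypothetical placeholders.

    # Minimal sketch (not the paper's decoder): rank candidate words for a
    # sequence of noisy taps, assuming Gaussian tap noise around key centers
    # and a unigram word prior. All values below are illustrative placeholders.
    import math

    KEY_CENTERS = {                      # hypothetical (x, y) centers for a few keys
        'c': (3.0, 2.0), 'a': (0.5, 1.0), 't': (4.5, 0.0),
        'r': (3.5, 0.0), 'o': (8.5, 0.0),
    }
    WORD_PRIOR = {'cat': 0.6, 'car': 0.3, 'cot': 0.1}   # toy unigram prior
    SIGMA = 0.7                                          # assumed tap noise (in key widths)

    def log_tap_likelihood(word, taps):
        """Sum of log N(tap | key center, sigma^2 I) over the word's letters."""
        total = 0.0
        for ch, (tx, ty) in zip(word, taps):
            cx, cy = KEY_CENTERS[ch]
            d2 = (tx - cx) ** 2 + (ty - cy) ** 2
            total += -d2 / (2 * SIGMA ** 2)              # constant terms dropped
        return total

    def decode(taps):
        """Return candidate words ranked by prior times tap likelihood."""
        scored = {w: math.log(WORD_PRIOR[w]) + log_tap_likelihood(w, taps)
                  for w in WORD_PRIOR if len(w) == len(taps)}
        return sorted(scored, key=scored.get, reverse=True)

    # Three noisy taps roughly over 'c', 'a', 't':
    print(decode([(3.2, 1.8), (0.7, 1.2), (4.3, 0.2)]))  # ['cat', 'car', 'cot']

Sentence-at-a-time entry extends this idea by scoring whole tap sequences against a sentence-level language model instead of a single-word prior, which is what gives the recognizer more context per recognition event.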
Statistical Language Modelling
Grammar-based natural language processing has reached a level where it can 'understand' language to a limited degree in restricted domains. For example, it is possible to parse textual material very accurately and assign semantic relations to parts of sentences. An alternative approach originates from the work of Shannon over half a century ago [41], [42]. This approach assigns probabilities to linguistic events, where mathematical models are used to represent statistical knowledge. Once the models are built, we decide which event is more likely than the others according to their probabilities. Although statistical methods currently use a very impoverished representation of speech and language (typically finite state), it is possible to train the underlying models from large amounts of data. Importantly, such statistical approaches often produce useful results. Statistical approaches seem especially well suited to spoken language, which is often spontaneous or conversational and not readily amenable to standard grammar-based approaches.
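
As a toy illustration of the statistical approach described above (estimate event probabilities from data, then choose the most probable event), the following sketch builds a bigram word model with add-one smoothing over a placeholder corpus. It is only a minimal example, not a realistic language model.

    # Toy bigram language model: estimate P(next word | previous word) from data
    # with add-one smoothing, then pick the most probable continuation.
    # The training text is a placeholder.
    from collections import Counter

    corpus = "the cat sat on the mat the cat ran".split()
    bigrams = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus)
    vocab = set(corpus)

    def p_next(word, context):
        """P(word | context) with add-one (Laplace) smoothing."""
        return (bigrams[(context, word)] + 1) / (unigrams[context] + len(vocab))

    def most_likely_next(context):
        """Choose the event (next word) with the highest estimated probability."""
        return max(vocab, key=lambda w: p_next(w, context))

    print(most_likely_next("the"))   # 'cat' (seen twice after 'the' in the toy corpus)

Real systems use far larger corpora and richer models, but the workflow is the same: train probabilities from data, then decide between competing events by comparing those probabilities.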
Are Morphosyntactic Taggers Suitable to Improve Automatic Transcription?
The aim of our paper is to study the potential of part-of-speech (POS) tagging to improve speech recognition. We first evaluate the proportion of misrecognized words that can be corrected using POS information; the analysis of a short extract of French radio broadcast news shows that an absolute decrease of the word error rate by 1.1% can be expected. We also demonstrate quantitatively that traditional POS taggers remain reliable when applied to spoken corpora, including automatic transcriptions. This new result enables us to effectively use POS tag knowledge to improve the quality of transcriptions in a postprocessing stage, especially by correcting agreement errors.
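
As a rough sketch of the postprocessing idea, the following example reranks ASR N-best hypotheses so that POS tag sequences with bad agreement are penalized. The lexicon, tag-transition scores, and hypotheses are invented for illustration and are not taken from the paper, which relies on a trained POS tagger over French transcriptions.

    # Minimal sketch (not the paper's system): rerank ASR N-best hypotheses with
    # part-of-speech knowledge in a postprocessing stage. The toy lexicon,
    # tag-transition scores, and hypotheses below are hypothetical.
    LEXICON = {                     # word -> POS tag (toy, unambiguous)
        "les": "DET_PL", "le": "DET_SG",
        "maisons": "NOUN_PL", "maison": "NOUN_SG",
    }
    TAG_BIGRAM_SCORE = {            # higher = more grammatical tag transition
        ("DET_PL", "NOUN_PL"): 2.0, ("DET_SG", "NOUN_SG"): 2.0,
        ("DET_PL", "NOUN_SG"): -2.0, ("DET_SG", "NOUN_PL"): -2.0,
    }

    def pos_score(words):
        """Score a hypothesis by summing tag-transition scores of its POS sequence."""
        tags = [LEXICON[w] for w in words]
        return sum(TAG_BIGRAM_SCORE.get(pair, 0.0) for pair in zip(tags, tags[1:]))

    def rerank(nbest):
        """nbest: list of (acoustic_score, words); add POS score and resort."""
        return sorted(nbest, key=lambda h: h[0] + pos_score(h[1]), reverse=True)

    nbest = [(-1.0, ["le", "maisons"]),    # better acoustic score, agreement error
             (-1.2, ["les", "maisons"])]   # correct agreement
    print(rerank(nbest)[0][1])             # ['les', 'maisons']

The key point is that the tagger only needs to stay reliable on noisy automatic transcriptions for this kind of correction to pay off, which is what the paper verifies quantitatively.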
A Rational Model of Word Skipping in Reading: Ideal Integration of Visual and Linguistic Information
During reading, readers intentionally do not fixate a word when highly confident in its identity. In a rational model of reading, word skipping decisions should be complex functions of the particular word, the linguistic context, and the visual information available. In contrast, a simple heuristic of reading only predicts additive effects of word and context features. Here we test these predictions by implementing a rational model with Bayesian inference, and predicting human skipping with the entropy of this model's posterior distribution. Results showed a significant effect of the entropy in predicting skipping above a strong baseline model including word and context features. This pattern held for entropy measures from rational models with a frequency prior but not from ones with a 5-gram prior. These results suggest complex interactions between visual input and linguistic knowledge, as predicted by the rational model of reading, and a dominant role of frequency in making skipping decisions.
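
A minimal sketch of the rational-model computation described above: combine a word-frequency prior with a noisy per-letter likelihood to obtain a posterior over word identities, and use that posterior's entropy as the skipping signal. The lexicon, prior, and confusion probabilities below are toy placeholders, not the paper's model.

    # Toy Bayesian word identification: posterior over word identities given a
    # noisy preview, with entropy as a (low-)confidence signal for skipping.
    import math

    LEXICON = {"house": 0.7, "horse": 0.2, "mouse": 0.1}   # toy frequency prior
    P_CORRECT = 0.9                                         # assumed per-letter accuracy

    def letter_likelihood(observed, true):
        """Probability of perceiving `observed` given the true letter (toy model)."""
        return P_CORRECT if observed == true else (1 - P_CORRECT) / 25

    def posterior(observed_word):
        """P(word | noisy letters), proportional to prior(word) * product of letter likelihoods."""
        scores = {}
        for word, prior in LEXICON.items():
            lik = 1.0
            for o, t in zip(observed_word, word):
                lik *= letter_likelihood(o, t)
            scores[word] = prior * lik
        z = sum(scores.values())
        return {w: s / z for w, s in scores.items()}

    def entropy(dist):
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)

    print(entropy(posterior("house")))   # clear preview: low entropy, skip the word
    print(entropy(posterior("hovse")))   # degraded preview: higher entropy, fixate instead

In this framing, skipping depends jointly on the prior and on how informative the visual input is, rather than on an additive combination of word and context features.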