78 research outputs found

    Intelligent Techniques to Accelerate Everyday Text Communication

    People with some form of speech or motor impairment usually use a high-tech augmentative and alternative communication (AAC) device to communicate with other people in writing or in face-to-face conversations. Their text entry rate on these devices is slow due to their motor abilities. Making good letter or word predictions can help accelerate the communication of such users. In this dissertation, we investigated several approaches to accelerate input for AAC users. First, considering that an AAC user is participating in a face-to-face conversation, we investigated whether performing speech recognition on the speaking side can improve next-word predictions. We compared the accuracy of three plausible microphone deployment options and the accuracy of two commercial speech recognition engines. We found that despite recognition word error rates of 7-16%, our ensemble of n-gram and recurrent neural network language models made predictions nearly as good as when they used the reference transcripts. In a user study with 160 participants, we also found that increasing the number of prediction slots in a keyboard interface does not necessarily correlate with improved performance. Second, typing every character in a text message may cost an AAC user more time or effort than strictly necessary. Skipping spaces or other characters may speed input and reduce an AAC user's physical input effort. We designed a recognizer optimized for expanding noisy abbreviated input where users often omitted spaces and mid-word vowels. We showed that using neural language models for selecting conversational-style training text and for rescoring the recognizer's n-best sentences improved accuracy. We found accurate abbreviated input was possible even if a third of characters were omitted. In a study where users had to dwell for a second on each key, we found sentence abbreviated input was competitive with a conventional keyboard with word predictions.
Finally, AAC keyboards rely on language modeling to auto-correct noisy typing and to offer word predictions. While today's language models can be trained on huge amounts of text, pre-trained models may fail to capture the unique writing style and vocabulary of individual users. We demonstrated improved performance compared to a unigram cache by adapting to a user's text via language models based on prediction by partial match (PPM) and recurrent neural networks. Our best model ensemble increased keystroke savings by 9.6%.

    Character Recognition

    Character recognition is one of the most widely used pattern recognition technologies in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    Cerebral palsy, online social networks and change

    In 2011, 19.2 million households in the United Kingdom had access to the Internet. Online social networks (OSNs) such as Facebook, Twitter, MySpace, Bebo and YouTube have proved to be the most popular Internet activity (Office of National Statistics, 2011). 49% of these users have updated or created an OSN profile and are making over 24 million visits a month (Dutton, 2009). These websites are often directed at a broad market, i.e. people without disabilities. Unfortunately, people with disabilities, especially those with physical impairments, often have a greater risk of experiencing loneliness than people without a disability as a result of their mobility, access and/or communication impairments. Conventional communication methods such as face-to-face communication, telephone communication and text message communication are often difficult to use and can limit the opportunities for people with disabilities to engage in successful socialisation with family members and friends (Braithwaite et al, 1999). Therefore, people with disabilities can often see online communication, especially OSNs, as an attractive alternative. Previous studies such as Braithwaite et al (1999), Ellis and Kent (2010) and Dobransky and Hargittai (2006) suggest that OSNs are opening a new world to individuals with disabilities. They help these individuals, especially those exhibiting lifelong physical challenges, to carry out social interaction which they would otherwise not be able to do within the analogue world. However, due to inaccessible features of the technology, for example features requiring JavaScript, hard-coded text size and Captcha (AbilityNet, 2008; Cahill and Hollier, 2009 and Asuncion, 2010), access to OSNs is often difficult. The overarching purpose of this PhD research is to understand the experiences and challenges faced when people with the physical disability cerebral palsy (cp) use OSNs.
It is estimated that 1 in 400 children born in the UK is affected by cp (Scope Response, 2007). The disability can present itself in a variety of ways and to varying degrees. There is no cure for cp; however, management to increase social interaction, especially through technological innovations, is often encouraged (United Cerebral Palsy, 2001; Sharan, 2005 and Colledge, 2006). Previous studies such as AbilityNet (2008), Cahill and Hollier (2009), and Boudreau (2011) have explored mainstream OSN use from the perspective of users with disabilities, i.e. blind and visually or cognitively impaired, but have placed great emphasis on investigating inaccessibility of OSNs without involving these users. Other studies such as Manna (2005) and Belchiorb et al (2005) have used statistical methods such as surveys and questionnaires to identify Internet use among people with unspecified disabilities. Conversely, Asuncion (2010) has taken a broader approach involving OSN users, using high-level taxonomies to classify their disabilities, and Marshall et al (2006) focused on a specific disability type, cognitive impairments, without considering the variety of limitations present within the disability. Other studies such as Pell (1999) have taken a broader yet more specific approach and looked at technology use, especially computer and assistive technology, among people with physical disabilities, where only 7 out of 82 surveyed had cp. Braithwaite et al (1999), by contrast, focused on individuals with disabilities, most of whom were classified as having a physical disability. However, the study does not explicitly look at OSNs but rather at online social support within forums for people with disabilities. Studies such as these have not involved the users, defined what constitutes disability, or focused on cp without encompassing other disabilities, making it impossible to identify the requirements of OSN users with cp.
Initially, this PhD research explored the experiences and challenges faced when individuals with cp use OSNs. Fourteen interviews were carried out with participants who had variations of the disability. The study identified the reasons for OSN use and non-use and also discovered key themes together with challenges that affected their experiences. This work was followed by an in-context observational study that examined these individuals' context of use. The study identified the OSNs and assistive technology used, tasks carried out and users' feelings during interaction. As a result of these studies it was determined that OSN changes prevented and/or slowed down these users' ability to communicate online. In previous work within human-computer interaction (HCI) and other disciplines such as software engineering and management science, change is often discussed during software development and is restricted to identifying scenarios and tools that assist change management within information technology (Jarke and Kurkisuonio, 1998). Studies such as these have not considered change deployment or its effect on users, and within HCI such an understanding is limited. Other disciplines, i.e. psychology and social sciences, have looked at change deployment. Theorists such as Lewin (1952), Lippett (1958) and Griffith (2001) attempt to offer solutions. However, no one theory or approach is widely accepted, and contradictions, adaptations and exclusions are continually being made. Conversely, Woodward and Hendry (2004) and By (2007) have attempted to contend with these difficulties, specifically stress as a result of change, believing that if change agents are aware of what an affected individual is thinking during the onset of change it will help to minimise or prevent damage. Studies such as these have focused on software development or organisational change from the perspective of developers or employees; they have not considered OSNs or individuals with cp.
To fill this gap, a longitudinal OSN monitoring and analysis study was carried out. The study identified how OSN changes are introduced, their effect on users, and the factors that encourage change acceptance or non-acceptance. The study was divided into three studies: two studies investigated real-world examples of OSN change by observing the actions of change agents (Twitter.com and Facebook.com) and their users' reactions to the change process, and a third study asked OSN users about their experiences of OSN change. A by-product of these studies was a unique way of displaying OSN change and user acceptance on a large scale using an infographic and an inductive category model that can be used to examine OSN change. The findings from the five studies were then distilled alongside identified change management approaches and theories to develop a five-stage process for OSN change for change agents to follow. The process defined the requirements for OSN change, including the change agent responsibilities before, during and after the change.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al, 2003 Vision Research 43 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest that Gestalt grouping is not used as a strategy in these tasks, and it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.

    Quality of experience in digital mobile multimedia services

    People like to consume multimedia content on mobile devices. Mobile networks can deliver mobile TV services but they require large infrastructural investments and their operators need to make trade-offs to design worthwhile experiences. The approximation of how users experience networked services has shifted from the inadequate packet-level Quality of Service (QoS) to the user-perceived Quality of Experience (QoE) that includes content, user context and their expectations. However, QoE lacks concrete operationalizations for the visual experience of content on small, sub-TV resolution screens displaying transcoded TV content at low bitrates. The contribution of my thesis includes both substantive and methodological results on which factors contribute to the QoE in mobile multimedia services and how. I utilised a mix of methods in both lab and field settings to assess the visual experience of multimedia content on mobile devices. This included qualitative elicitation techniques such as 14 focus groups and 75 hours of debrief interviews in six experimental studies. 343 participants watched 140 hours of realistic TV content and provided feedback through quantitative measures such as acceptability, preferences and eye-tracking. My substantive findings on the effects of size, resolution, text quality and shot types can improve multimedia models. They show that people want to watch mobile TV at a relative size (at least 4cm of screen height) similar to living room TV setups. In order to achieve these sizes at 35cm viewing distance, users require at least QCIF resolution and are willing to scale it to a much lower angular resolution (12ppd) than what video quality research has found to be the best visual quality (35ppd). My methodological findings suggest that future multimedia QoE research should use a mixed methods approach including qualitative feedback and viewing ratios akin to living room setups to meet QoE's ambitious scope.

    Clique: Perceptually Based, Task Oriented Auditory Display for GUI Applications

    Screen reading is the prevalent approach for presenting graphical desktop applications in audio. The primary function of a screen reader is to describe what the user encounters when interacting with a graphical user interface (GUI). This straightforward method allows people with visual impairments to hear exactly what is on the screen, but with significant usability problems in a multitasking environment. Screen reader users must infer the state of on-going tasks spanning multiple graphical windows from a single, serial stream of speech. In this dissertation, I explore a new approach to enabling auditory display of GUI programs. With this method, the display describes concurrent application tasks using a small set of simultaneous speech and sound streams. The user listens to and interacts solely with this display, never with the underlying graphical interfaces. Scripts support this level of adaptation by mapping GUI components to task definitions. Evaluation of this approach shows improvements in user efficiency, satisfaction, and understanding with little development effort. To develop this method, I studied the literature on existing auditory displays, working user behavior, and theories of human auditory perception and processing. I then conducted a user study to observe problems encountered and techniques employed by users interacting with an ideal auditory display: another human being. Based on my findings, I designed and implemented a prototype auditory display, called Clique, along with scripts adapting seven GUI applications. I concluded my work by conducting a variety of evaluations on Clique. The results of these studies show the following benefits of Clique over the state of the art for users with visual impairments (1-5) and mobile sighted users (6): 1. Faster, accurate access to speech utterances through concurrent speech streams. 2. Better awareness of peripheral information via concurrent speech and sound streams. 3. Increased information bandwidth through concurrent streams. 4. More efficient information seeking enabled by ubiquitous tools for browsing and searching. 5. Greater accuracy in describing unfamiliar applications learned using a consistent, task-based user interface. 6. Faster completion of email tasks in a standard GUI after exposure to those tasks in audio.

    Augmentative communication device design, implementation and evaluation

    The ultimate aim of this thesis was to design and implement an advanced software-based Augmentative Communication Device (ACD), or Voice Output Communication Aid (VOCA), for non-vocal Learning Disabled individuals by applying current psychological models, theories, and experimental techniques. By taking account of potential users' cognitive and linguistic abilities, a symbol-based device (Easy Speaker) was produced which outputs naturalistic digitised human speech and sound and makes use of a photorealistic symbol set. In order to increase the size of the available symbol set, a hypermedia-style dynamic screen approach was employed. The relevance of the hypermedia metaphor in relation to models of knowledge representation and language processing was explored. Laboratory-based studies suggested that potential users could learn to productively operate the software and became faster and more efficient over time when performing set conversational tasks. Studies with unimpaired individuals supported the notion that digitised speech was less cognitively demanding to decode, or listen to. With highly portable, touch-based, PC-compatible systems beginning to appear, it is hoped that the otherwise silent will be able to use the software as their primary means of communication with the speaking world. Extensive field trials over a six-month period with a prototype device and in collaboration with users' caregivers strongly suggested this might be the case. Off-device improvements were also noted, suggesting that Easy Speaker, or similar software, has the potential to be used as a communication training tool. Such training would be likely to improve overall communicative effectiveness. To conclude, a model for successful ACD development was proposed.

    Raspberry Pi Technology
