189 research outputs found

    Multimedia search without visual analysis: the value of linguistic and contextual information

    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features

    Journalistic image access : description, categorization and searching

    The quantity of digital imagery continues to grow, creating a pressing need to develop efficient methods for organizing and retrieving images. Knowledge on user behavior in image description and search is required for creating effective and satisfying searching experiences. The nature of visual information and journalistic images creates challenges in representing and matching images with user needs. The goal of this dissertation was to understand the processes in journalistic image access (description, categorization, and searching), and the effects of contextual factors on preferred access points. These were studied using multiple data collection and analysis methods across several studies. Image attributes used to describe journalistic imagery were analyzed based on description tasks and compared to a typology developed through a meta-analysis of literature on image attributes. Journalistic image search processes and query types were analyzed through a field study and multimodal image retrieval experiment. Image categorization was studied via sorting experiments leading to a categorization model. Advances to research methods concerning search tasks and categorization procedures were implemented. Contextual effects on image access were found related to organizational contexts, work, and search tasks, as well as publication context. Image retrieval in a journalistic work context was contextual at the level of image needs and search process. While text queries, together with browsing, remained the key access mode to journalistic imagery, participants also used visual access modes in the experiment, constructing multimodal queries. Assigned search task type and searcher expertise had an effect on query modes utilized. Journalistic images were mostly described and queried for on the semantic level but also syntactic attributes were used. Constraining the description led to more abstract descriptions. 
Image similarity was evaluated mainly based on generic semantics. However, functionally oriented categories were also constructed, especially by domain experts. Availability of page context promoted thematic rather than object-based categorization. The findings increase our understanding of user behavior in image description, categorization, and searching, as well as have implications for future solutions in journalistic image access. The contexts of image production, use, and search merit more interest in research as these could be leveraged for supporting annotation and retrieval. Multiple access points should be created for journalistic images based on image content and function. Support for multimodal query formulation should also be offered. The contributions of this dissertation may be used to create evaluation criteria for journalistic image access systems

    The Accessibility of Mathematical Notation on the Web and Beyond

    This paper serves two purposes. First, it offers an overview of the role of the Mathematical Markup Language (MathML) in representing mathematical notation on the Web, and its significance for accessibility. To orient the discussion, hypotheses are advanced regarding users’ needs in connection with the accessibility of mathematical notation. Second, current developments in the evolution of MathML are reviewed, noting their consequences for accessibility, and commenting on prospects for future improvement in the concrete experiences of users of assistive technologies. Recommendations are advanced for further research and development activities, emphasizing the cognitive aspects of user interface design

    Feeling what you hear: tactile feedback for navigation of audio graphs

    Access to digitally stored numerical data is currently very limited for sight impaired people. Graphs and visualizations are often used to analyze relationships between numerical data, but the current methods of accessing them are highly visually mediated. Representing data using audio feedback is a common method of making data more accessible, but methods of navigating and accessing the data are often serial in nature and laborious. Tactile or haptic displays could be used to provide additional feedback to support a point-and-click type interaction for the visually impaired. A requirements capture conducted with sight impaired computer users produced a review of current accessibility technologies, and guidelines were extracted for using tactile feedback to aid navigation. The results of a qualitative evaluation with a prototype interface are also presented. Providing an absolute position input device and tactile feedback allowed the users to explore the graph using tactile and proprioceptive cues in a manner analogous to point-and-click techniques

    Human-powered smartphone assistance for blind people

    Mobile devices are fundamental tools for inclusion and independence. Yet, there are still many open research issues in smartphone accessibility for blind people (Grussenmeyer and Folmer 2017). Currently, learning how to use a smartphone is non-trivial, especially when we consider that the need to learn new apps and accommodate to updates never ceases. When first transitioning from a basic feature-phone, people have to adapt to new paradigms of interaction. Where feature phones had a finite set of applications and functions, users can extend the possible functions and uses of a smartphone by installing new 3rd party applications. Moreover, the interconnectivity of these applications means that users can explore a seemingly endless set of workflows across applications. To that end, the fragmented nature of development on these devices results in users needing to create different mental models for each application. These characteristics make smartphone adoption a demanding task, as we found from our eight-week longitudinal study on smartphone adoption by blind people. We conducted multiple studies to characterize the smartphone challenges that blind people face, and found people often require synchronous, co-located assistance from family, peers, friends, and even strangers to overcome the different barriers they face. However, help is not always available, especially when we consider the disparity in each barrier, individual support network and current location. In this dissertation we investigated if and how in-context human-powered solutions can be leveraged to improve current smartphone accessibility and ease of use. Building on a comprehensive knowledge of the smartphone challenges faced and coping mechanisms employed by blind people, we explored how human-powered assistive technologies can facilitate use. The thesis of this dissertation is: Human-powered smartphone assistance by non-experts is effective and impacts perceptions of self-efficacy

    User Intent Communication in Robot-Assisted Shopping for the Blind

    The research reported in this chapter describes our work on robot-assisted shopping for the blind. In our previous research, we developed RoboCart, a robotic shopping cart for the visually impaired (Gharpure, 2008; Kulyukin et al., 2008; Kulyukin et al., 2005). RoboCart's operation includes four steps: 1) the blind shopper (henceforth the shopper) selects

    Personalizable edge services for Web accessibility

    The Web Content Accessibility Guidelines by W3C (W3C Recommendation, May 1999. http://www.w3.org/TR/WCAG10/) provide several suggestions for Web designers regarding how to author Web pages in order to make them accessible to everyone. In this context, this paper proposes the use of edge services as an efficient and general solution to promote accessibility and break down the digital barriers that prevent users with disabilities from actively participating in any aspect of society. The idea behind edge services mainly concerns the advantages of personalized navigation, in which contents are tailored according to different issues, such as the capabilities of the client’s devices, communication systems and network conditions and, finally, the preferences and/or abilities of the growing number of users that access the Web. To meet these requirements, Web designers have to efficiently provide content adaptation and personalization mechanisms in order to guarantee universal access to Internet content. The so far dominant paradigm of communication on the WWW, due to its simple request/response model, cannot efficiently address such requirements. Therefore, it must be augmented with new components that attempt to enhance the scalability, the performance and the ubiquity of the Web. Edge servers, acting on the HTTP data flow exchanged between client and server, allow on-the-fly content adaptation as well as other complex functionalities beyond the traditional caching and content replication services. 
These value-added services are called edge services and include personalization and customization, aggregation from multiple sources, geographical personalization of the navigation of pages (with insertion/emphasis of content that can be related to the user’s geographical location), translation services, group navigation and awareness for social navigation, advanced services for bandwidth optimization such as adaptive compression and format transcoding, mobility, and ubiquitous access to Internet content. This paper presents Personalizable Accessible Navigation (Pan), a set of edge services designed to improve Web page accessibility, developed and deployed on top of a programmable intermediary framework. The characteristics and the location of the services, i.e., provided by intermediaries, as well as the personalization and the opportunity to select multiple profiles, make Pan a platform that is especially suitable for accessing the Web seamlessly from mobile terminals as well

    Rotate-and-Press: A Non-Visual Alternative to Point-and-Click

    Most computer applications manifest visually rich and dense graphical user interfaces (GUIs) that are primarily tailored for an easy-and-efficient sighted interaction using a combination of two default input modalities, namely the keyboard and the mouse/touchpad. However, blind screen-reader users predominantly rely only on keyboard, and therefore struggle to interact with these applications, since it is both arduous and tedious to perform the visual 'point-and-click' tasks such as accessing the various application commands/features using just keyboard shortcuts supported by screen readers. In this paper, we investigate the suitability of a 'rotate-and-press' input modality as an effective non-visual substitute for the visual mouse to easily interact with computer applications, with specific focus on word processing applications serving as the representative case study. In this regard, we designed and developed bTunes, an add-on for Microsoft Word that customizes an off-the-shelf Dial input device such that it serves as a surrogate mouse for blind screen-reader users to quickly access various application commands and features using a set of simple rotate and press gestures supported by the Dial. Therefore, with bTunes, blind users too can now enjoy the benefits of two input modalities, as their sighted counterparts. A user study with 15 blind participants revealed that bTunes significantly reduced both the time and number of user actions for doing representative tasks in a word processing application, by as much as 65.1% and 36.09% respectively. The participants also stated that they did not face any issues switching between keyboard and Dial, and furthermore gave a high usability rating (84.66 avg. SUS score) for bTunes

    Using Sonic Enhancement to Augment Non-Visual Tabular Navigation

    More information is now readily available to computer users than at any time in human history; however, much of this information is often inaccessible to people with blindness or low-vision, for whom information must be presented non-visually. Currently, screen readers are able to verbalize on-screen text using text-to-speech (TTS) synthesis; however, much of this vocalization is inadequate for browsing the Internet. An auditory interface that incorporates auditory-spatial orientation was created and tested. For information that can be structured as a two-dimensional table, links can be semantically grouped as cells in a row within an auditory table, which provides a consistent structure for auditory navigation. An auditory display prototype was tested. Sixteen legally blind subjects participated in this research study. Results demonstrated that stereo panning was an effective technique for audio-spatially orienting non-visual navigation in a five-row, six-column HTML table as compared to a centered, stationary synthesized voice. These results were based on measuring the time-to-target (TTT), or the amount of time elapsed from the first prompting to the selection of each tabular link. Preliminary analysis of the TTT values recorded during the experiment showed that the populations did not conform to the ANOVA requirements of normality and equality of variances. Therefore, the data were transformed using the natural logarithm. The repeated-measures two-factor ANOVA results show that the logarithmically transformed TTTs were significantly affected by the tonal variation method, F(1,15) = 6.194, p = 0.025. Similarly, the results show that the logarithmically transformed TTTs were marginally affected by the stereo spatialization method, F(1,15) = 4.240, p = 0.057. The results show that the logarithmically transformed TTTs were not significantly affected by the interaction of both methods, F(1,15) = 1.381, p = 0.258. 
These results suggest that some confusion may be caused in the subject when employing both of these methods simultaneously. The significant effect of tonal variation indicates that the effect is actually increasing the average TTT. In other words, the presence of preceding tones increases task completion time on average. The marginally-significant effect of stereo spatialization decreases the average log(TTT) from 2.405 to 2.264
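
To make the reported log(TTT) shift easier to read, the following is a minimal sketch of how a mean of natural-log values can be back-transformed. It is not taken from the paper: the function names are hypothetical, and the assignment of 2.405 to the condition without stereo spatialization and 2.264 to the condition with it is an assumption based on the wording "decreases ... from 2.405 to 2.264". The abstract does not state the time unit, so the back-transformed values are in whatever unit TTT was recorded in. Note that exponentiating a mean of logs yields a geometric mean, not the arithmetic mean of the raw times.

```python
import math

# Mean log(TTT) values quoted in the abstract (natural logarithm).
# Which condition each value belongs to is an assumption (see lead-in).
LOG_TTT_WITHOUT_PANNING = 2.405
LOG_TTT_WITH_PANNING = 2.264


def geometric_mean_from_log_mean(log_mean: float) -> float:
    """Back-transform a mean of natural-log values into a geometric mean."""
    return math.exp(log_mean)


def relative_change(log_before: float, log_after: float) -> float:
    """Fractional change in the geometric mean implied by a shift in mean logs."""
    return math.exp(log_after - log_before) - 1.0


if __name__ == "__main__":
    before = geometric_mean_from_log_mean(LOG_TTT_WITHOUT_PANNING)
    after = geometric_mean_from_log_mean(LOG_TTT_WITH_PANNING)
    change = relative_change(LOG_TTT_WITHOUT_PANNING, LOG_TTT_WITH_PANNING)
    print(f"geometric mean TTT without panning: {before:.2f} (unit not stated)")
    print(f"geometric mean TTT with panning:    {after:.2f} (unit not stated)")
    print(f"relative change: {change:.1%}")
```

Under these assumptions, the difference of 0.141 in mean log(TTT) corresponds to a reduction of roughly 13% in the geometric mean time-to-target.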