33 research outputs found

    A Saliency-Driven Video Magnifier For People With Low Vision

    Consuming video content poses significant challenges for many users of screen magnifiers, the “go-to” assistive technology for people with low vision. While screen-magnifier software can achieve a zoom factor that makes video content visible to low-vision users, navigating through videos at high magnification is often a major challenge for these users. Towards making videos more accessible for low-vision users, we have developed the SViM video magnifier system [6]. Specifically, SViM consists of three different magnifier interfaces with easy-to-use means of interaction. All three interfaces are driven by visual saliency as a guiding signal, which provides a quantification of interestingness at the pixel level. The saliency information, provided as a heatmap, is then processed to obtain distinct regions of interest. These regions of interest are tracked over time and displayed using an easy-to-use interface. We present a description of our overall design and interfaces.
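
    The abstract does not spell out how the saliency heatmap is turned into distinct regions of interest, but the step can be sketched with standard image-processing primitives. Below is a minimal sketch assuming a heatmap normalized to [0, 1]; the threshold, minimum-area cutoff, and use of connected components are illustrative assumptions, not SViM's actual implementation.

        import numpy as np
        from scipy import ndimage

        def saliency_to_rois(heatmap: np.ndarray, thresh: float = 0.6, min_area: int = 100):
            """Group high-saliency pixels into distinct regions of interest.

            Pixels above `thresh` are clustered via connected components;
            tiny components are discarded as noise. Returns one
            (center_x, center_y, area) tuple per region, largest first.
            """
            labels, n = ndimage.label(heatmap >= thresh)  # connected components
            rois = []
            for lbl in range(1, n + 1):
                ys, xs = np.nonzero(labels == lbl)
                if xs.size < min_area:                    # drop speckle regions
                    continue
                rois.append((xs.mean(), ys.mean(), xs.size))
            return sorted(rois, key=lambda r: -r[2])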

    Towards Making Videos Accessible for Low Vision Screen Magnifier Users

    People with low vision who use screen magnifiers to interact with computing devices find it very challenging to interact with dynamically changing digital content such as videos, since they do not have the luxury of time to manually move (i.e., pan) the magnifier lens to different regions of interest (ROIs) or zoom into these ROIs before the content changes across frames. In this paper, we present SViM, a first-of-its-kind screen-magnifier interface for such users that leverages advances in computer vision, particularly video saliency models, to identify salient ROIs in videos. SViM's interface allows users to zoom in and out of any point of interest and switch between ROIs via mouse clicks, and it provides assistive panning with the added flexibility of letting the user explore regions of the video beyond the ROIs identified by SViM. Subjective and objective evaluation in a user study with 13 low-vision screen-magnifier users revealed that, overall, the participants had a better user experience with SViM than with extant screen magnifiers, indicative of its promise and potential for making videos accessible to low-vision screen-magnifier users.
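
    The assistive panning described above has to follow moving content without the jarring jumps that make magnified video hard to track. One plausible realization, shown purely as an illustration (the smoothing scheme and gain are assumptions, not SViM's published design), is to ease the lens center toward the tracked ROI centroid on every frame:

        def assistive_pan(lens_center, roi_centroid, gain=0.15):
            """One frame of assistive panning: move the magnifier lens a
            fraction `gain` of the way toward the tracked ROI centroid,
            so the lens follows content smoothly rather than jumping."""
            cx, cy = lens_center
            tx, ty = roi_centroid
            return (cx + gain * (tx - cx), cy + gain * (ty - cy))

    A small gain yields smooth, slightly lagging pursuit; a gain of 1.0 degenerates to hard jumps between ROIs.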

    Eyewear Computing – Augmenting the Human with Head-Mounted Wearable Assistants

    The seminar was composed of workshops and tutorials on head-mounted eye tracking, egocentric vision, optics, and head-mounted displays. The seminar welcomed 30 academic and industry researchers from Europe, the US, and Asia with diverse backgrounds, including wearable and ubiquitous computing, computer vision, developmental psychology, optics, and human-computer interaction. In contrast to several previous Dagstuhl seminars, we used an ignite-talk format to reduce the time of talks to one half-day and to leave the rest of the week for hands-on sessions, group work, general discussions, and socialising. The key results of this seminar are 1) the identification of key research challenges and summaries of breakout groups on multimodal eyewear computing, egocentric vision, security and privacy issues, skill augmentation and task guidance, eyewear computing for gaming, as well as prototyping of VR applications, 2) a list of datasets and research tools for eyewear computing, 3) three small-scale datasets recorded during the seminar, 4) an article in ACM Interactions entitled “Eyewear Computers for Human-Computer Interaction”, as well as 5) two follow-up workshops on “Egocentric Perception, Interaction, and Computing” at the European Conference on Computer Vision (ECCV) and on “Eyewear Computing” at the ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp).

    AutoDesc: Facilitating Convenient Perusal of Web Data Items for Blind Users

    Web data items such as shopping products, classifieds, and job listings are indispensable components of most e-commerce websites. The information on these data items is typically distributed over two or more webpages, e.g., a ‘Query-Results’ page showing summaries of the items and ‘Details’ pages containing full information about each item. While this organization of data mitigates information overload and visual clutter for sighted users, it increases the interaction overhead and effort for blind users, as back-and-forth navigation between webpages using screen-reader assistive technology is tedious and cumbersome. Existing usability-enhancing solutions are unable to provide adequate support in this regard, as they predominantly focus on enabling efficient content access within a single webpage and as such are not tailored for content distributed across multiple webpages. As an initial step towards addressing this issue, we developed AutoDesc, a browser extension that leverages a custom extraction model to automatically detect and pull out additional item descriptions from the ‘Details’ pages, and then proactively injects the extracted information into the ‘Query-Results’ page, thereby reducing the amount of back-and-forth screen-reader navigation between the two webpages. In a study with 16 blind users, we observed that, within the same time duration, the participants were able to peruse significantly more data items on average with AutoDesc than with their preferred screen readers or with a state-of-the-art solution.
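
    AutoDesc itself is a browser extension built around a learned extraction model, but the extract-and-inject flow it describes can be illustrated with fixed selectors. The sketch below is a simplified stand-in: the CSS selectors are hypothetical placeholders, and a real deployment would substitute the extension's extraction model for them.

        import requests
        from bs4 import BeautifulSoup

        def fetch_description(details_url: str) -> str:
            """Pull the item description out of a 'Details' page. The
            selector is a hypothetical placeholder for AutoDesc's
            learned extraction model."""
            html = requests.get(details_url, timeout=10).text
            node = BeautifulSoup(html, "html.parser").select_one(".item-description")
            return node.get_text(strip=True) if node else ""

        def inject_descriptions(results_html: str) -> str:
            """Append each item's extracted description to its summary on
            the 'Query-Results' page, so a screen reader encounters the
            details in place instead of via back-and-forth navigation."""
            soup = BeautifulSoup(results_html, "html.parser")
            for item in soup.select(".result-item"):  # hypothetical selector
                link = item.select_one("a[href]")     # assumes absolute URLs
                if link:
                    desc = soup.new_tag("p")
                    desc.string = fetch_description(link["href"])
                    item.append(desc)
            return str(soup)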

    Exploring new representations and applications for motion analysis

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 153-164). By Ce Liu.
    The focus of motion analysis has been on estimating a flow vector for every pixel by matching intensities. In my thesis, I will explore motion representations beyond the pixel level and the new applications to which these representations lead. I first focus on analyzing motion from video sequences. Traditional motion analysis suffers from inappropriate modeling of the grouping relationships of pixels and from a lack of ground-truth data. Using layers as the interface for humans to interact with videos, we build a human-assisted motion annotation system to obtain ground-truth motion, missing in the literature, for natural video sequences. Furthermore, we show that with the layer representation we can detect and magnify small motions to make them visible to human eyes. We then move to a contour representation to analyze the motion of textureless objects under occlusion, and demonstrate that simultaneous boundary grouping and motion analysis can handle challenging data where traditional pixel-wise motion analysis fails. In the second part of my thesis, I will show the benefits of matching local image structures instead of intensity values. We propose SIFT flow, which establishes dense, semantically meaningful correspondence between two images across scenes by matching pixel-wise SIFT features. Using SIFT flow, we develop a new framework for image parsing by transferring metadata such as annotations, motion, and depth from the images in a large database to an unknown query image. We demonstrate this framework with new applications such as predicting motion from a single image and motion synthesis via object transfer. Based on SIFT flow, we introduce a nonparametric scene parsing system using label transfer, with very promising experimental results suggesting that our system outperforms state-of-the-art techniques based on training classifiers.
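
    For context, SIFT flow casts dense correspondence as the minimization of a discrete energy over a flow field w(p) = (u(p), v(p)) defined on the per-pixel SIFT descriptor images s_1 and s_2. The objective below follows the published SIFT Flow formulation (a truncated L1 data term, a small-displacement prior, and truncated L1 smoothness); notation may differ slightly from the thesis:

        E(\mathbf{w}) = \sum_{\mathbf{p}} \min\bigl( \lVert s_1(\mathbf{p}) - s_2(\mathbf{p} + \mathbf{w}(\mathbf{p})) \rVert_1,\ t \bigr)
                      + \sum_{\mathbf{p}} \eta \bigl( \lvert u(\mathbf{p}) \rvert + \lvert v(\mathbf{p}) \rvert \bigr)
                      + \sum_{(\mathbf{p},\mathbf{q}) \in \varepsilon} \min\bigl( \alpha \lvert u(\mathbf{p}) - u(\mathbf{q}) \rvert,\ d \bigr) + \min\bigl( \alpha \lvert v(\mathbf{p}) - v(\mathbf{q}) \rvert,\ d \bigr)

    where ε is the four-connected pixel grid and t, d truncate the data and smoothness penalties; the energy is optimized coarse-to-fine with belief propagation.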

    Use Of Experts' Gaze By Novice Usability Practitioners To Perform A Better Heuristic Evaluation

    Proficiency in conducting heuristic evaluations does not come easily; it is an acquired skill that takes years to master, and an effective evaluation strategy is difficult to convey verbally. While communicating verbally, people may point to where they focus their attention, but this rarely transfers well. Through an eye-tracking study, the relationship between an expert's gaze while performing a task and a novice's learning to better perform a heuristic evaluation will be explored. Novices concentrate on basic but irrelevant parts of a task while processing complex stimuli, whereas experts process stimuli more quickly while focusing on relevant aspects. Finding a way to convey this to novices would make their approach quicker and more efficient than before. It has already been shown in several domains that watching an expert's gaze helps novices perform certain tasks. Through this study, it will be shown that this method of knowledge transfer can be extended to the heuristic evaluation process.

    The Efficacy of Visuomotor Compensatory Training for Individuals with Visual Field Defects

    Several approaches have been developed to help patients with partial visual field defects cope with their visual loss; the most effective are those that encourage the person to move their eyes more efficiently. This thesis sought to examine the efficacy of a multiplatform compensatory training programme called Durham Reading and Exploration (DREX) in the rehabilitation of these individuals. The thesis has two primary aims: establishing whether the DREX training app, completed on either a computer or a touchscreen tablet, can be an effective treatment for homonymous visual field defects (HVFDs) caused by brain injury, and validating the assessment tasks that have been incorporated into the app. The results from Studies 1 to 3 show that DREX training is clinically effective for HVFD rehabilitation and that the training effect in patients trained on a touchscreen tablet is equivalent to that in patients trained on a computer, with a meaningful improvement in quality of life that remains stable over a period of three months. In Studies 4 to 6, the built-in assessment tasks are found to be reliable and valid and can be used confidently to monitor training progression and outcomes. Study 7 explores the novel observation that DREX training also benefits patients with other types of partial visual field defects, such as tunnel vision and central visual field loss, demonstrating that the training could potentially be offered to a wider low-vision population. Finally, Studies 8 and 9 explore whether blurring of vision, a common comorbid visual impairment in patients with visual field defects, affects visual exploration performance and the outcomes of visual exploration training. The results show that blurring of vision did reduce search efficacy, but searching behaviour could still be improved with training. Taken together, the findings from this suite of studies indicate that DREX is an effective and inexpensive treatment for visual field defects of a variety of etiologies; however, comorbid impairments that could affect rehabilitation should be identified to maximise the efficacy of this treatment.

    Efficient image-based rendering

    Recent advancements in real-time ray tracing and deep learning have significantly enhanced the realism of computer-generated images. However, conventional 3D computer graphics (CG) can still be time-consuming and resource-intensive, particularly when creating photo-realistic simulations of complex or animated scenes. Image-based rendering (IBR) has emerged as an alternative approach that utilizes pre-captured images from the real world to generate realistic images in real time, eliminating the need for extensive modeling. Although IBR has its advantages, it faces challenges in providing the same level of control over scene attributes as traditional CG pipelines and in accurately reproducing complex scenes and objects with different materials, such as transparent objects. This thesis endeavors to address these issues by harnessing the power of deep learning and incorporating the fundamental principles of graphics and physically-based rendering. It offers an efficient solution that enables interactive manipulation of real-world dynamic scenes captured from sparse views, lighting positions, and times, as well as a physically-based approach that facilitates accurate reproduction of the view-dependency effects resulting from the interaction between transparent objects and their surrounding environment. Additionally, this thesis develops a visibility metric that can identify artifacts in reconstructed IBR images without observing the reference image, thereby contributing to the design of an effective IBR acquisition pipeline. Lastly, a perception-driven rendering technique is developed to provide high-fidelity visual content on virtual-reality displays while retaining computational efficiency.
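
    The thesis's methods are learning-based, but the classical IBR operation they build on (synthesizing a novel view by blending nearby captured views) is easy to sketch. The weighting scheme below, based on the angular proximity of viewing directions, is a generic unstructured-lumigraph-style heuristic offered for illustration, not the thesis's actual algorithm.

        import numpy as np

        def blend_weights(novel_dir, source_dirs, k=4, eps=1e-6):
            """Weight each captured source view by how close its viewing
            direction is to the novel view's, keep the k nearest, and
            normalize the weights to sum to 1."""
            novel = novel_dir / np.linalg.norm(novel_dir)
            dirs = source_dirs / np.linalg.norm(source_dirs, axis=1, keepdims=True)
            angles = np.arccos(np.clip(dirs @ novel, -1.0, 1.0))
            nearest = np.argsort(angles)[:k]          # k most aligned views
            w = 1.0 / (angles[nearest] + eps)         # closer -> larger weight
            return nearest, w / w.sum()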