207 research outputs found

    Crowdsourcing design guidance for contextual adaptation of text content in augmented reality

    Funding Information: This work was supported by EPSRC (grants EP/R004471/1 and EP/S027432/1). Supporting data for this publication is available at https://doi.org/10.17863/CAM.62931.
    Augmented Reality (AR) can deliver engaging user experiences that seamlessly meld virtual content with the physical environment. However, building such experiences is challenging due to the developer's inability to assess how uncontrolled deployment contexts may influence the user experience. To address this issue, we demonstrate a method for rapidly conducting AR experiments and real-world data collection in the user's own physical environment using a privacy-conscious mobile web application. The approach leverages the large number of distinct user contexts accessible through crowdsourcing to efficiently source diverse context and perceptual preference data. The insights gathered through this method complement emerging design guidance and sample-limited lab-based studies. The utility of the method is illustrated by re-examining the design challenge of adapting AR text content to the user's environment. Finally, we demonstrate how the gathered design insight can be operationalized to provide adaptive text content functionality in an AR headset.
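
    The adaptive text functionality mentioned at the end of the abstract is not specified here. As a rough, hypothetical sketch of the general idea only (using WCAG-style relative luminance and contrast ratio, an assumed panel-dimming model, and made-up thresholds, not the authors' implementation), one could choose the opacity of a dark backing panel for white text from a sampled background colour:

        # Hypothetical sketch of contrast-driven text adaptation (not the paper's
        # method). Relative luminance and contrast ratio follow the WCAG 2.x
        # definitions; the panel-dimming model and thresholds are assumptions.

        def relative_luminance(rgb):
            """Relative luminance of an sRGB colour with channels in [0, 1]."""
            def linearize(c):
                return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
            r, g, b = (linearize(c) for c in rgb)
            return 0.2126 * r + 0.7152 * g + 0.0722 * b

        def contrast_ratio(l1, l2):
            """WCAG contrast ratio between two relative luminances."""
            lighter, darker = max(l1, l2), min(l1, l2)
            return (lighter + 0.05) / (darker + 0.05)

        def backing_panel_opacity(background_rgb, min_contrast=4.5):
            """Opacity of a dark backing panel needed for white text to reach the
            target contrast, assuming the panel simply dims the real background."""
            bg_lum = relative_luminance(background_rgb)
            if contrast_ratio(1.0, bg_lum) >= min_contrast:
                return 0.0  # background already dark enough for white text
            target_lum = 1.05 / min_contrast - 0.05  # luminance at exactly min_contrast
            alpha = 1.0 - target_lum / bg_lum
            return round(min(max(alpha, 0.0), 0.85), 2)

        print(backing_panel_opacity((0.1, 0.1, 0.1)))   # dark scene  -> 0.0
        print(backing_panel_opacity((0.8, 0.8, 0.75)))  # bright wall -> ~0.69

    In a dark room the panel stays invisible; against a bright wall this sketch backs the white text with a roughly 70% opaque panel.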

    An Examination of Presentation Strategies for Textual Data in Augmented Reality

    Videos with embedded text have been widely used in the past, and the text in these videos usually contains valuable information. However, it can be difficult for people to fully understand the text in videos displayed on smartphones due to obstructions such as color conflicts between letters and the moving background. Adjustments that support the human visual system, such as changes to brightness and color contrast, increased text legibility, and accounting for the phantom illumination (PI) illusion (an optical illusion that increases the perceived brightness of a certain area), should be able to improve people's ability to read text in augmented reality (AR) applications on smartphones. The researcher created a text presentation style implementing the PI illusion, using solid white text on a 50% transparent black billboard with a black-white shading PI illusion at the internal edge. An experiment was conducted to verify whether this text presentation style could improve reading performance. The experiment showed that the PI illusion was unable to improve the legibility of text in AR applications on smartphones. However, the data suggested that, in some cases, certain participants, especially those from specific academic majors, had difficulty reading text when it was presented using the standard presentation style without the PI-illusion enhancement.
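
    The abstract describes the tested style only at a high level: solid white text on a 50%-transparent black billboard with a black-white shading at its internal edge. A hypothetical Pillow sketch of such a billboard, composited over a stand-in camera frame, might look as follows; the dimensions, edge width, and gradient direction are guesses rather than the study's exact stimulus.

        # Hypothetical Pillow sketch of the described presentation style: white text
        # on a 50%-transparent black billboard whose internal edge carries a
        # dark-to-light shading (the PI-illusion cue). All sizes are assumptions.
        from PIL import Image, ImageDraw

        def pi_billboard(size=(400, 120), edge=12, text="Sample AR label"):
            board = Image.new("RGBA", size, (0, 0, 0, 128))  # 50% transparent black
            draw = ImageDraw.Draw(board)
            w, h = size
            # Shade the internal edge from black (outermost ring) to white (innermost).
            for i in range(edge):
                grey = int(255 * i / (edge - 1))
                draw.rectangle([i, i, w - 1 - i, h - 1 - i], outline=(grey, grey, grey, 128))
            draw.text((edge + 8, h // 2 - 8), text, fill=(255, 255, 255, 255))
            return board

        # Composite the billboard over a camera frame (here a plain grey stand-in).
        frame = Image.new("RGBA", (640, 360), (120, 130, 140, 255))
        frame.alpha_composite(pi_billboard(), dest=(120, 120))
        frame.convert("RGB").save("pi_billboard_demo.png")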

    Toward Robust Video Event Detection and Retrieval Under Adversarial Constraints

    The continuous stream of videos uploaded and shared on the Internet has been leveraged by computer vision researchers for a myriad of detection and retrieval tasks, including gesture detection, copy detection, face authentication, etc. However, existing state-of-the-art event detection and retrieval techniques fail to deal with several real-world challenges (e.g., low resolution, low brightness, and noise) under adversarial constraints. This dissertation focuses on these challenges in realistic scenarios and demonstrates practical methods to address the problem of robustness and efficiency within video event detection and retrieval systems in five application settings: CAPTCHA decoding, face liveness detection, reconstructing typed input on mobile devices, video confirmation attacks, and content-based copy detection.
    Specifically, for CAPTCHA decoding, I propose an automated approach that can decode moving-image object recognition (MIOR) CAPTCHAs faster than humans. I show that not only are there inherent weaknesses in current MIOR CAPTCHA designs, but that several obvious countermeasures (e.g., extending the length of the codeword) are not viable. More importantly, my work highlights the fact that the underlying hard problem selected by the designers of a leading commercial solution falls into a solvable subclass of computer vision problems. For face liveness detection, I introduce a novel approach to bypass modern face authentication systems. More specifically, by leveraging a handful of pictures of the target user taken from social media, I show how to create realistic, textured, 3D facial models that undermine the security of widely used face authentication solutions. My framework makes use of virtual reality (VR) systems, incorporating the ability to animate the facial model (e.g., raising an eyebrow or smiling) in order to trick liveness detectors into believing that the 3D model is a real human face. I demonstrate that such VR-based spoofing attacks constitute a fundamentally new class of attacks that point to serious weaknesses in camera-based authentication systems.
    For reconstructing typed input on mobile devices, I propose a method that successfully transcribes the text typed on a keyboard by exploiting video of the user typing, even from significant distances and from repeated reflections. This allows typed input to be reconstructed from the image of a mobile phone's screen on a user's eyeball as reflected through a nearby mirror, extending the privacy threat to include situations where the adversary is located around a corner from the user. To assess the viability of a video confirmation attack, I explore a technique that exploits the emanations of changes in light to reveal the programs being watched. I leverage the key insight that the observable emanations of a display (e.g., a TV or monitor) during presentation of the viewing content induce a distinctive flicker pattern that can be exploited by an adversary. My proposed approach works successfully in a number of practical scenarios, including (but not limited to) observations of light effusions through windows, on the back wall, or off the victim's face. My empirical results show that I can successfully confirm hypotheses while capturing short recordings (typically less than 4 minutes long) of the changes in brightness from the victim's display from a distance of 70 meters.
    Lastly, for content-based copy detection, I take advantage of a new temporal feature to index a reference library in a manner that is robust to the popular spatial and temporal transformations in pirated videos. My technique narrows the detection gap in the important area of temporal transformations applied by would-be pirates. My large-scale evaluation on real-world data shows that I can successfully detect infringing content from movies and sports clips with 90.0% precision at a 71.1% recall rate, and can achieve that accuracy at an average time expense of merely 5.3 seconds, outperforming the state of the art by an order of magnitude.
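
    As a simplified, hypothetical illustration of the video confirmation idea only (not the dissertation's algorithm), a recorded brightness trace can be compared against per-program reference brightness signatures by searching for the alignment with the highest Pearson correlation; the sampling rate, window length, and synthetic signals below are assumptions.

        # Simplified, hypothetical illustration of a video confirmation attack:
        # match an observed brightness trace (e.g., light reflected off a wall)
        # against candidate per-program brightness signatures. Not the
        # dissertation's actual method.
        import numpy as np

        def zscore(x):
            x = np.asarray(x, dtype=float)
            return (x - x.mean()) / (x.std() + 1e-9)

        def best_match(observed, references, max_lag=200):
            """Return (name, score): the reference and alignment with the highest
            Pearson correlation against the observed brightness trace."""
            obs = zscore(observed)
            best_name, best_score = None, -np.inf
            for name, ref in references.items():
                ref = np.asarray(ref, dtype=float)
                for lag in range(max_lag):
                    seg = ref[lag:lag + len(obs)]
                    if len(seg) < len(obs):
                        break
                    score = float(np.dot(obs, zscore(seg)) / len(obs))
                    if score > best_score:
                        best_name, best_score = name, score
            return best_name, best_score

        # Toy usage: two synthetic per-program signatures and a noisy observation
        # that is really a window of program A.
        rng = np.random.default_rng(0)
        sig_a = rng.normal(size=2000).cumsum()
        sig_b = rng.normal(size=2000).cumsum()
        observed = sig_a[100:1000] + rng.normal(scale=0.5, size=900)
        print(best_match(observed, {"program_A": sig_a, "program_B": sig_b}))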

    Understanding, Modeling, and Simulating the Discrepancy Between Intended and Perceived Image Appearance on Optical See-Through Augmented Reality Displays

    Augmented reality (AR) displays are transitioning from being primarily used in research and development settings to being used by the general public. With this transition, these displays will be used by more people, in many different environments, and in many different contexts. As with other displays, the user's perception of virtual imagery is influenced by the characteristics of the user's environment, creating a discrepancy between the intended appearance and the perceived appearance of virtual imagery shown on the display. However, this problem is much more apparent for optical see-through AR displays, such as the HoloLens. For these displays, imagery is superimposed onto the user's view of their environment, which can cause the imagery to become transparent and washed out in appearance from the user's perspective. Any change in the user's environment conditions or in the user's position introduces changes to the perceived appearance of the AR imagery, and current AR displays do not adapt to maintain a consistent perceived appearance of the imagery being displayed. Because of this, in many environments the user may misinterpret or fail to notice information shown on the display. In this dissertation, I investigate the factors that influence user perception of AR imagery and demonstrate examples of how the user's perception is affected for applications involving user interfaces, attention cues, and virtual humans. I establish a mathematical model that relates the user, their environment, their AR display, and AR imagery in terms of luminance or illuminance contrast. I demonstrate how this model can be used to classify the user's viewing conditions and identify problems the user is prone to experience when in these conditions. I demonstrate how the model can be used to simulate changes in the user's viewing conditions and to identify methods to maintain the perceived appearance of the AR imagery in changing conditions.
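
    The abstract does not reproduce the model itself. As a hedged sketch of the underlying optics only (not the dissertation's model), an optical see-through display adds its light to the environment light transmitted through the combiner, so the Weber contrast of virtual imagery against the see-through background falls as the environment gets brighter; the transmissivity and luminance values below are assumed for illustration.

        # Hedged sketch of the basic optics behind the washout problem (not the
        # dissertation's model): display light adds to the transmitted background
        # light, so perceived contrast drops as the background gets brighter.

        def perceived_luminance(display_lum, background_lum, transmissivity=0.7):
            """Approximate perceived luminance (cd/m^2) of a virtual pixel: additive
            blend of display output and the see-through background."""
            return display_lum + transmissivity * background_lum

        def weber_contrast(display_lum, background_lum, transmissivity=0.7):
            """Contrast of the virtual pixel against the surrounding see-through
            background (which the display cannot darken)."""
            target = perceived_luminance(display_lum, background_lum, transmissivity)
            surround = transmissivity * background_lum
            return (target - surround) / surround

        # The same 200 cd/m^2 virtual image in a dim room vs. a bright outdoor scene:
        for bg in (50.0, 5000.0):
            print(f"background {bg:>7.1f} cd/m^2 -> contrast {weber_contrast(200.0, bg):.2f}")

    With the assumed 200 cd/m^2 display, the same imagery has a Weber contrast of roughly 5.7 in a dim room but only about 0.06 against a 5000 cd/m^2 outdoor background, which is the washout effect described above.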