
    Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

    Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices, so automatic ad affect recognition has several useful applications. However, content-based feature representations give no insight into how affect is modulated by aspects such as the ad's scene setting, salient object attributes, and their interactions, nor do such approaches tell us how humans prioritize visual information when understanding ads. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics, and actively attended objects identified via eye gaze. We measure the importance of each of these information channels by systematically incorporating the related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure encode affective information better than individual scene objects or conspicuous background elements.

    Comment: Accepted for publication in the Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US.
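
    To make the channel-wise comparison concrete, here is a minimal sketch of the kind of ablation the abstract describes: each information channel is fed to an affect classifier on its own and then jointly, and cross-validated accuracy is compared. All features and labels below are random placeholders, not the authors' data, models, or code.

    # Hypothetical ablation sketch: how much does each information channel
    # contribute to ad affect prediction? Features and labels are placeholders.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_ads = 200

    # Placeholder per-ad feature channels (stand-ins for real descriptors).
    channels = {
        "scene_structure":  rng.normal(size=(n_ads, 32)),  # coarse scene layout
        "object_stats":     rng.normal(size=(n_ads, 16)),  # detected-object statistics
        "attended_objects": rng.normal(size=(n_ads, 32)),  # gaze-attended objects
    }
    valence = rng.integers(0, 2, size=n_ads)  # placeholder binary affect label

    # Score each channel alone, then all channels together.
    for name, X in channels.items():
        acc = cross_val_score(LogisticRegression(max_iter=1000), X, valence, cv=5).mean()
        print(f"{name:>16s}: {acc:.2f}")

    X_all = np.hstack(list(channels.values()))
    acc = cross_val_score(LogisticRegression(max_iter=1000), X_all, valence, cv=5).mean()
    print(f"{'all channels':>16s}: {acc:.2f}")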

    Speaker-following Video Subtitles

    We propose a new method for improving the presentation of subtitles in video (e.g., TV and movies). With conventional subtitles, the viewer has to constantly look away from the main viewing area to read the subtitles at the bottom of the screen, which disrupts the viewing experience and causes unnecessary eyestrain. Our method places on-screen subtitles next to their respective speakers, allowing the viewer to follow the visual content while simultaneously reading the subtitles. We use novel identification algorithms to detect the speakers based on audio and visual information, and the placement of the subtitles is then determined using global optimization. A comprehensive usability study indicated that our subtitle placement method outperformed both conventional fixed-position subtitling and a previous dynamic subtitling method in terms of enhancing the overall viewing experience and reducing eyestrain.
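
    As a hedged illustration of the placement step only (the paper's actual global optimization is not reproduced here), the sketch below grid-searches candidate subtitle anchors, trading off distance to the speaker against overlap with a region that should stay visible; the cost weights and box format are assumptions.

    # Illustrative subtitle placement (not the authors' algorithm):
    # minimize distance-to-speaker plus an occlusion penalty over a grid.
    from itertools import product

    def overlap(a, b):
        """Overlap area of two boxes given as (x, y, w, h)."""
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        dx = min(ax + aw, bx + bw) - max(ax, bx)
        dy = min(ay + ah, by + bh) - max(ay, by)
        return max(dx, 0) * max(dy, 0)

    def place_subtitle(speaker_xy, sub_wh, keep_clear_box, frame_wh, step=40):
        """Grid-search the frame for the lowest-cost subtitle anchor."""
        fw, fh = frame_wh
        sw, sh = sub_wh
        sx, sy = speaker_xy
        best, best_cost = None, float("inf")
        for x, y in product(range(0, fw - sw, step), range(0, fh - sh, step)):
            cx, cy = x + sw / 2, y + sh / 2
            dist = ((cx - sx) ** 2 + (cy - sy) ** 2) ** 0.5  # stay near the speaker
            occl = overlap((x, y, sw, sh), keep_clear_box)   # avoid covering the face
            cost = dist + 5.0 * occl
            if cost < best_cost:
                best, best_cost = (x, y), cost
        return best

    # Example: 1280x720 frame, speaker centered at (400, 300), face box kept clear.
    print(place_subtitle((400, 300), (300, 60), (350, 220, 120, 160), (1280, 720)))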

    AR Comic Chat

    Live speech transcription and captioning are important for the accessibility of deaf and hard-of-hearing individuals, especially in situations with no visible ASL translators. If live captioning is available at all, it is typically rendered in the style of closed captions on a display such as a phone screen or TV, away from the real conversation. This can divide the viewer's focus and detract from the experience. This paper proposes an investigation into an alternative, augmented-reality-driven approach to displaying these captions, using deep neural networks to compute, track, and associate deep visual and speech descriptors in order to maintain captions as speech bubbles above the speaker.
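
    A minimal sketch of the association step such a system might use, assuming utterances and tracked speakers are already embedded as descriptor vectors: utterances are matched to speakers by minimizing total cosine distance with the Hungarian algorithm. The embeddings below are synthetic placeholders, not outputs of the paper's networks.

    # Hypothetical speech-to-speaker association via descriptor matching.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def associate(speech_emb, speaker_emb):
        """Return (speech, speaker) index pairs minimizing total cosine distance."""
        a = speech_emb / np.linalg.norm(speech_emb, axis=1, keepdims=True)
        b = speaker_emb / np.linalg.norm(speaker_emb, axis=1, keepdims=True)
        cost = 1.0 - a @ b.T  # cosine distance matrix
        rows, cols = linear_sum_assignment(cost)
        return [(int(r), int(c)) for r, c in zip(rows, cols)]

    rng = np.random.default_rng(1)
    speakers = rng.normal(size=(3, 128))                          # tracked speaker descriptors
    speech = speakers[[2, 0]] + 0.05 * rng.normal(size=(2, 128))  # two noisy utterances
    print(associate(speech, speakers))  # -> [(0, 2), (1, 0)]: bubbles follow these speakers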

    Interactive natural user interfaces

    For many years, science fiction entertainment has showcased holographic technology and futuristic user interfaces that have stimulated the world's imagination. Movies such as Star Wars and Minority Report portray characters interacting with free-floating 3D displays and manipulating virtual objects as though they were tangible. While these futuristic concepts are intriguing, it is difficult to find a commercial, interactive holographic video solution in an everyday electronics store. As used in this work, the term holography refers to artificially created, free-floating objects, whereas the traditional term refers to the recording and reconstruction of 3D image data from 2D mediums.

    This research addresses the need for a feasible technological solution that allows users to work with projected, interactive, and touch-sensitive 3D virtual environments. It aims to construct an interactive holographic user interface system by consolidating existing commodity hardware and interaction algorithms, and it studies best design practices for human-centric factors related to 3D user interfaces. The problem of 3D user interfaces has been well researched. When portrayed in science fiction, futuristic user interfaces usually consist of a holographic display, interaction controls, and feedback mechanisms; in reality, holographic displays are usually realized with volumetric or multi-parallax technology. In this work, a novel holographic display is presented which leverages a mini-projector to produce a free-floating image on a fog-like surface. The holographic user interface system consists of a display component, which projects a free-floating image; a tracking component, which allows the user to interact with the 3D display via gestures; and a software component, which drives the complete hardware system. After examining this research, readers will be well informed on how to build an intuitive, eye-catching holographic user interface system for various application arenas.
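
    As a structural illustration only (class names assumed, not the thesis code), the three components described above can be sketched as a display, a tracker, and a driver loop that routes gestures into the projected scene:

    # Skeleton of the display / tracking / software split described above.
    from dataclasses import dataclass

    @dataclass
    class Gesture:
        kind: str  # e.g. "tap" or "swipe"
        x: float
        y: float

    class FogDisplay:
        """Stand-in for the mini-projector + fog-screen display component."""
        def render(self, scene):
            print(f"projecting scene with {len(scene)} objects")

    class GestureTracker:
        """Stand-in for the gesture-tracking component."""
        def poll(self):
            # A real tracker would emit gestures derived from depth-camera data.
            yield Gesture("tap", 0.5, 0.5)

    def run(display, tracker, scene):
        """Software component: route each gesture to the scene, then redraw."""
        for g in tracker.poll():
            if g.kind == "tap":
                scene.append(("marker", g.x, g.y))  # react to the interaction
            display.render(scene)

    run(FogDisplay(), GestureTracker(), scene=[])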

    Characterization of time-varying human operator dynamics

    Human operator performance in a tracking study of pilots, determined by deterministic characterization theory and a mathematical model.

    Toward A Theory of Media Reconciliation: An Exploratory Study of Closed Captioning

    This project is an interdisciplinary empirical study that explores the emotional experiences resulting from the use of closed captioning, an assistive technology. More specifically, this study documents the user experiences of both D/deaf and Hearing multimedia users in an effort to better identify and understand the variables and processes involved in facilitating and supporting connotative and emotional meaning making. Closed captioning studies thus far are marked by a persistent gap: an emphasis on understanding and measuring denotative meaning-making behavior while largely ignoring connotative meaning-making behavior, which is necessarily an equal participant in a user's viewing experience. This study explores connotative and emotional meaning-making behaviors so as to better understand users engaged with captioned multimedia. To that end, a mixed-methods design was developed that uses qualitative methods from the field of User Experience (UX) to explore connotative equivalence between D/deaf and Hearing users, and an augmented version of S. R. Gulliver and G. Ghinea's (2003) quantitative measure Information Assimilation (IA) from the field of Human-Computer Interaction (HCI) to measure denotative equivalence between the two user types. To measure denotative equivalence, a quiz containing open-ended questions measuring IA was used. To measure connotative equivalence, the following measures were used: 1) Likert scales to measure users' confidence in their answers to the open-ended questions; 2) a Likert scale to measure a user's interest in the stimulus; 3) open-ended questions to identify the scenes that elicited the strongest emotional responses from users; 4) four-level response questions with accompanying Likert scales to determine the strength of emotional reaction to three selected excerpts from the stimulus; and 5) an interview consisting of three open-ended questions and one fixed-choice question. This study found no major differences in denotative equivalence between the D/deaf and Hearing groups; however, there were important differences in emotional reactions to the stimulus, indicating that there was not connotative equivalence between the groups in response to the emotional content. More importantly, the strategies used to make both denotative and connotative meaning from the information presented differed between groups and between individuals within groups. To explain the behaviors observed, this work offers a theory of Media Reconciliation based on Wolfgang Iser's (1980) phenomenological theory of the 'virtual text'.
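
    Purely as an illustration of how the denotative-equivalence comparison could be run (the numbers below are synthetic placeholders, not the study's data), one might compare IA quiz scores across the two groups with a nonparametric test:

    # Hypothetical comparison of Information Assimilation (IA) quiz scores.
    import numpy as np
    from scipy.stats import mannwhitneyu

    rng = np.random.default_rng(2)
    ia_deaf = rng.normal(loc=0.72, scale=0.10, size=20).clip(0, 1)     # fraction correct
    ia_hearing = rng.normal(loc=0.74, scale=0.10, size=20).clip(0, 1)  # fraction correct

    stat, p = mannwhitneyu(ia_deaf, ia_hearing, alternative="two-sided")
    print(f"U={stat:.1f}, p={p:.3f}")  # a large p is consistent with no group difference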

    Human Visual Perception, study and applications to understanding Images and Videos

    Ph.D. thesis (Doctor of Philosophy)