Integrating Multiple 3D Views through Frame-of-reference Interaction
Frame-of-reference interaction consists of a unified set of 3D interaction techniques for exploratory navigation of large virtual spaces in nonimmersive environments. It is based on a conceptual framework that considers navigation from a cognitive perspective, as a way of facilitating changes in user attention from one reference frame to another, rather than from the mechanical perspective of moving a camera between different points of interest. All of our techniques link multiple frames of reference in some meaningful way. Some techniques link multiple windows within a zooming environment while others allow seamless changes of user focus between static objects, moving objects, and groups of moving objects. We present our techniques as they are implemented in GeoZui3D, a geographic visualization system for ocean data.
Flow-Guided Feature Aggregation for Video Object Detection
Extending state-of-the-art object detectors from image to video is
challenging. The accuracy of detection suffers from degenerated object
appearances in videos, e.g., motion blur, video defocus, rare poses, etc.
Existing work attempts to exploit temporal information at the box level, but such
methods are not trained end-to-end. We present flow-guided feature aggregation,
an accurate and end-to-end learning framework for video object detection. It
leverages temporal coherence at the feature level instead. It improves the
per-frame features by aggregation of nearby features along the motion paths,
and thus improves the video recognition accuracy. Our method significantly
improves upon strong single-frame baselines in ImageNet VID, especially for
more challenging fast moving objects. Our framework is principled, and on par
with the best engineered systems winning the ImageNet VID challenges 2016,
without additional bells-and-whistles. The proposed method, together with Deep
Feature Flow, powered the winning entry of ImageNet VID challenges 2017. The
code is available at
https://github.com/msracver/Flow-Guided-Feature-Aggregation
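The aggregation idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`warp_nearest`, `cosine`, `aggregate`) are hypothetical, the flow field is assumed to be given rather than estimated by a network, warping uses nearest-neighbor sampling for brevity, and the weighting scheme (a softmax over cosine similarity to the reference features) is an assumption made for illustration that the abstract does not state.

```python
import math

def warp_nearest(feat, flow):
    """Warp a neighbor frame's features onto the reference frame.

    feat: H x W x C nested lists of floats.
    flow: H x W nested lists of (dy, dx), assumed to point from each
          reference position to its location in the neighbor frame.
    Nearest-neighbor sampling with border clamping (a simplification).
    """
    h, w = len(feat), len(feat[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(int(round(y + dy)), 0), h - 1)
            sx = min(max(int(round(x + dx)), 0), w - 1)
            row.append(list(feat[sy][sx]))
        out.append(row)
    return out

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb + 1e-8)

def aggregate(ref_feat, warped_feats):
    """Per-position weighted average of warped feature maps.

    Weights are a softmax, taken across frames, of the cosine similarity
    between each warped feature and the reference feature (assumed scheme).
    """
    h, w, c = len(ref_feat), len(ref_feat[0]), len(ref_feat[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sims = [cosine(f[y][x], ref_feat[y][x]) for f in warped_feats]
            top = max(sims)
            exps = [math.exp(s - top) for s in sims]
            total = sum(exps)
            for f, e in zip(warped_feats, exps):
                weight = e / total
                for k in range(c):
                    out[y][x][k] += weight * f[y][x][k]
    return out
```

In this sketch, the reference frame's own features would normally be included in `warped_feats` (with zero flow), so a neighbor frame that disagrees with the reference after warping receives a smaller weight.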
Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements
Emotion evoked by an advertisement plays a key role in influencing brand
recall and eventual consumer choices. Automatic ad affect recognition has
several useful applications. However, the use of content-based feature
representations does not give insights into how affect is modulated by aspects
such as the ad scene setting, salient object attributes and their interactions.
Neither do such approaches inform us on how humans prioritize visual
information for ad understanding. Our work addresses these lacunae by
decomposing video content into detected objects, coarse scene structure, object
statistics and actively attended objects identified via eye-gaze. We measure
the importance of each of these information channels by systematically
incorporating related information into ad affect prediction models. Contrary to
the popular notion that ad affect hinges on the narrative and the clever use of
linguistic and social cues, we find that actively attended objects and the
coarse scene structure better encode affective information as compared to
individual scene objects or conspicuous background elements.
Comment: Accepted for publication in the Proceedings of the 20th ACM International
Conference on Multimodal Interaction, Boulder, CO, US