62,967 research outputs found
Spott : on-the-spot e-commerce for television using deep learning-based video analysis techniques
Spott is an innovative second screen mobile multimedia application which offers viewers relevant information on objects (e.g., clothing, furniture, food) they see and like on their television screens. The application enables interaction between TV audiences and brands, so producers and advertisers can offer potential consumers tailored promotions, e-shop items, and/or free samples. In line with the current views on innovation management, the technological excellence of the Spott application is coupled with iterative user involvement throughout the entire development process. This article discusses both of these aspects and how they impact each other. First, we focus on the technological building blocks that facilitate the (semi-) automatic interactive tagging process of objects in the video streams. The majority of these building blocks extensively make use of novel and state-of-the-art deep learning concepts and methodologies. We show how these deep learning based video analysis techniques facilitate video summarization, semantic keyframe clustering, and (similar) object retrieval. Secondly, we provide insights in user tests that have been performed to evaluate and optimize the application's user experience. The lessons learned from these open field tests have already been an essential input in the technology development and will further shape the future modifications to the Spott application
Augmenting Sensorimotor Control Using âGoal-Awareâ Vibrotactile Stimulation during Reaching and Manipulation Behaviors
We describe two sets of experiments that examine the ability of vibrotactile encoding of simple position error and combined object states (calculated from an optimal controller) to enhance performance of reaching and manipulation tasks in healthy human adults. The goal of the first experiment (tracking) was to follow a moving target with a cursor on a computer screen. Visual and/or vibrotactile cues were provided in this experiment, and vibrotactile feedback was redundant with visual feedback in that it did not encode any information above and beyond what was already available via vision. After only 10 minutes of practice using vibrotactile feedback to guide performance, subjects tracked the moving target with response latency and movement accuracy values approaching those observed under visually guided reaching. Unlike previous reports on multisensory enhancement, combining vibrotactile and visual feedback of performance errors conferred neither positive nor negative effects on task performance. In the second experiment (balancing), vibrotactile feedback encoded a corrective motor command as a linear combination of object states (derived from a linear-quadratic regulator implementing a trade-off between kinematic and energetic performance) to teach subjects how to balance a simulated inverted pendulum. Here, the tactile feedback signal differed from visual feedback in that it provided information that was not readily available from visual feedback alone. Immediately after applying this novel âgoal-awareâ vibrotactile feedback, time to failure was improved by a factor of three. Additionally, the effect of vibrotactile training persisted after the feedback was removed. These results suggest that vibrotactile encoding of appropriate combinations of state information may be an effective form of augmented sensory feedback that can be applied, among other purposes, to compensate for lost or compromised proprioception as commonly observed, for example, in stroke survivors
Mirror, Mirror, On The Wall: Collaborative Screen-Mirroring for Small Groups
Screen mirroring has been available to consumers for some time, however if every mobile device in the room supports screen mirroring to the main display (e.g. a shared TV), this necessitates a mechanism for managing its use. As such, this paper investigates allowing users in small intimacy groups (friends, family etc.) to self-manage mirrored use of the display, through passing/taking/requesting the display from whomever is currently mirroring to it. We examine the collaborative benefits this scheme could provide for the home, compared to existing multi-device use and existing screen mirroring implementations. Results indicate shared screen mirroring improves perceived collaboration, decreases dominance, preserves independence and has a positive effect on a group's activity awareness
Examining the role of smart TVs and VR HMDs in synchronous at-a-distance media consumption
This article examines synchronous at-a-distance media consumption from two perspectives: How it can be facilitated using existing consumer displays (through TVs combined with smartphones), and imminently available consumer displays (through virtual reality (VR) HMDs combined with RGBD sensing). First, we discuss results from an initial evaluation of a synchronous shared at-a-distance smart TV system, CastAway. Through week-long in-home deployments with five couples, we gain formative insights into the adoption and usage of at-a-distance media consumption and how couples communicated during said consumption. We then examine how the imminent availability and potential adoption of consumer VR HMDs could affect preferences toward how synchronous at-a-distance media consumption is conducted, in a laboratory study of 12 pairs, by enhancing media immersion and supporting embodied telepresence for communication. Finally, we discuss the implications these studies have for the near-future of consumer synchronous at-a-distance media consumption. When combined, these studies begin to explore a design space regarding the varying ways in which at-a-distance media consumption can be supported and experienced (through music, TV content, augmenting existing TV content for immersion, and immersive VR content), what factors might influence usage and adoption and the implications for supporting communication and telepresence during media consumption
Multimodal Content Analysis for Effective Advertisements on YouTube
The rapid advances in e-commerce and Web 2.0 technologies have greatly
increased the impact of commercial advertisements on the general public. As a
key enabling technology, a multitude of recommender systems exists which
analyzes user features and browsing patterns to recommend appealing
advertisements to users. In this work, we seek to study the characteristics or
attributes that characterize an effective advertisement and recommend a useful
set of features to aid the designing and production processes of commercial
advertisements. We analyze the temporal patterns from multimedia content of
advertisement videos including auditory, visual and textual components, and
study their individual roles and synergies in the success of an advertisement.
The objective of this work is then to measure the effectiveness of an
advertisement, and to recommend a useful set of features to advertisement
designers to make it more successful and approachable to users. Our proposed
framework employs the signal processing technique of cross modality feature
learning where data streams from different components are employed to train
separate neural network models and are then fused together to learn a shared
representation. Subsequently, a neural network model trained on this joint
feature embedding representation is utilized as a classifier to predict
advertisement effectiveness. We validate our approach using subjective ratings
from a dedicated user study, the sentiment strength of online viewer comments,
and a viewer opinion metric of the ratio of the Likes and Views received by
each advertisement from an online platform.Comment: 11 pages, 5 figures, ICDM 201
Pervasive and standalone computing: The perceptual effects of variable multimedia quality.
The introduction of multimedia on pervasive and mobile communication devices raises a number of perceptual quality issues, however, limited work has been done examining the 3-way interaction between use of equipment, quality of perception and quality of service. Our work measures levels of informational transfer (objective) and user satisfaction (subjective)when users are presented with multimedia video clips at three different frame rates, using four different display devices, simulating variation in participant mobility. Our results will show that variation in frame-rate does not impact a userâs level of information assimilation, however, does impact a usersâ perception of multimedia video âqualityâ. Additionally, increased visual immersion can be used to increase transfer of video information, but can negatively affect the usersâ perception of âqualityâ. Finally, we illustrate the significant affect of clip-content on the transfer of video, audio and textual information, placing into doubt the use of purely objective quality definitions when considering multimedia
presentations
Evaluating Multimodal Driver Displays of Varying Urgency
Previous studies have evaluated Audio, Visual and Tactile warnings for drivers, highlighting the importance of conveying the appropriate level of urgency through the signals. However, these modalities have never been combined exhaustively with different urgency levels and tested while using a driving simulator. This paper describes two experiments investigating all multimodal combinations of such warnings along three different levels of designed urgency. The warnings were first evaluated in terms of perceived urgency and perceived annoyance in the context of a driving simulator. The results showed that the perceived urgency matched the designed urgency of the warnings. More urgent warnings were also rated as more annoying but the effect of annoyance was lower compared to urgency. The warnings were then tested for recognition time when presented during a simulated driving task. It was found that warnings of high urgency induced quicker and more accurate responses than warnings of medium and of low urgency. In both studies, the number of modalities used in warnings (one, two or three) affected both subjective and objective responses. More modalities led to higher ratings of urgency and annoyance, with annoyance having a lower effect compared to urgency. More modalities also led to quicker responses. These results provide implications for multimodal warning design and reveal how modalities and modality combinations can influence participant responses during a simulated driving task
- âŠ