12,260 research outputs found
Multimodal Content Analysis for Effective Advertisements on YouTube
The rapid advances in e-commerce and Web 2.0 technologies have greatly
increased the impact of commercial advertisements on the general public. As a
key enabling technology, a multitude of recommender systems exist that
analyze user features and browsing patterns to recommend appealing
advertisements to users. In this work, we seek to identify the attributes
that characterize an effective advertisement and to recommend a useful
set of features to aid the design and production of commercial
advertisements. We analyze the temporal patterns from multimedia content of
advertisement videos including auditory, visual and textual components, and
study their individual roles and synergies in the success of an advertisement.
The objective of this work is thus both to measure the effectiveness of an
advertisement and to recommend a useful set of features that help advertisement
designers make it more successful and appealing to users. Our proposed
framework employs cross-modality feature learning, a signal processing
technique in which data streams from different components are used to train
separate neural network models and are then fused together to learn a shared
representation. Subsequently, a neural network model trained on this joint
feature embedding representation is utilized as a classifier to predict
advertisement effectiveness. We validate our approach using subjective ratings
from a dedicated user study, the sentiment strength of online viewer comments,
and a viewer opinion metric: the ratio of Likes to Views received by
each advertisement from an online platform.
Comment: 11 pages, 5 figures, ICDM 201
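The fusion pipeline described above can be sketched in a few lines. This is a toy, pure-Python illustration under assumed names and dimensions: the paper trains separate neural networks per modality and fuses their learned embeddings into a shared representation, whereas here the "encoders" are single linear maps and the classifier is a fixed linear threshold.

```python
# Minimal sketch of cross-modality feature fusion, not the authors'
# implementation. Names, dimensions and weights are illustrative.

def encode(features, weights):
    """Stand-in for a per-modality neural encoder: a single linear map."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def fuse(*embeddings):
    """Fusion step: concatenate per-modality embeddings into a joint
    shared representation."""
    joint = []
    for e in embeddings:
        joint.extend(e)
    return joint

def predict_effectiveness(joint, classifier_weights, bias=0.0):
    """Stand-in classifier: a thresholded linear score on the joint
    embedding (1 = effective, 0 = not)."""
    score = bias + sum(w * x for w, x in zip(classifier_weights, joint))
    return 1 if score > 0 else 0

def likes_views_ratio(likes, views):
    """The viewer-opinion metric mentioned above: Likes / Views."""
    return likes / views if views else 0.0

# Toy usage: one 'advertisement' with audio, visual and text feature vectors.
audio, visual, text = [0.2, 0.9], [0.5, 0.1], [0.7, 0.3]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity 'encoders', purely for illustration
joint = fuse(encode(audio, W), encode(visual, W), encode(text, W))
label = predict_effectiveness(joint, [1.0] * len(joint), bias=-2.0)
```

In the real system, the per-modality encoders and the classifier on the joint embedding would both be trained neural networks, and the Likes/Views ratio serves as one of the validation signals rather than a training target.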
Musical Robots For Children With ASD Using A Client-Server Architecture
Presented at the 22nd International Conference on Auditory Display (ICAD-2016)
People with Autistic Spectrum Disorders (ASD) are known to have difficulty recognizing and expressing emotions, which affects their social integration. Leveraging and integrating recent advances in interactive robots and music therapy, we have designed musical robots that can facilitate the social and emotional interactions of children with ASD. The robots communicate with children with ASD, detect their emotional states and physical activities, and generate real-time sonification based on the interaction data. Since we envision the use of multiple robots with children, we have adopted a client-server architecture: each robot and sensing device acts as a terminal, while the sonification server processes all the data and generates harmonized sonification. After describing our goals for the use of sonification, we detail the system architecture and ongoing research scenarios. We believe the present paper offers a new perspective on sonification applications for assistive technologies.
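The client-server arrangement described above can be sketched as a minimal in-process simulation. The class names, the arousal/activity fields, and the arousal-to-pitch mapping are all assumptions for illustration, not the authors' actual protocol or sonification design.

```python
# Hypothetical sketch: terminals (robots/sensors) forward interaction data
# to a server, which aggregates it into one harmonized sound parameter.

class SonificationServer:
    """Collects interaction data from all terminals and maps it to a sound
    parameter (here: average arousal -> a MIDI-like pitch number)."""
    def __init__(self):
        self.readings = []

    def receive(self, terminal_id, arousal, activity):
        self.readings.append((terminal_id, arousal, activity))

    def sonify(self):
        if not self.readings:
            return None
        mean_arousal = sum(r[1] for r in self.readings) / len(self.readings)
        return int(60 + 12 * mean_arousal)  # base pitch 60, up to one octave

class RobotTerminal:
    """Each robot or sensing device acts as a terminal for the server."""
    def __init__(self, terminal_id, server):
        self.terminal_id = terminal_id
        self.server = server

    def report(self, arousal, activity):
        self.server.receive(self.terminal_id, arousal, activity)

# Toy usage: three terminals report, the server harmonizes one output.
server = SonificationServer()
for i, arousal in enumerate([0.25, 0.75, 0.5]):
    RobotTerminal(f"robot-{i}", server).report(arousal, activity="dancing")
pitch = server.sonify()
```

Centralizing the sonification on the server is what allows readings from multiple robots to be combined into a single harmonized output rather than each robot sounding independently.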
The development of a rich multimedia training environment for crisis management: using emotional affect to enhance learning
PANDORA is an EU FP7-funded project developing a novel training and learning environment for Gold Commanders, individuals who carry executive responsibility for the services and facilities identified as strategically critical (e.g. Police, Fire) in crisis management strategic planning situations. A key part of the work for this project is considering the emotional and behavioural state of the trainees, and the creation of more realistic, and thereby stressful, representations of multimedia information to impact the decision-making of those trainees. Existing training models are predominantly paper-based, table-top exercises, which require an exercise of imagination on the part of the trainees to consider not only the various aspects of a crisis situation but also the impacts of interventions, and remediating actions in the event of the failure of an intervention. Moreover, existing computing models and tools are focused on supporting tactical and operational activities in crisis management, not strategic ones. The PANDORA system will therefore provide a rich multimedia information environment, giving trainees the detailed information they require to develop strategic plans to deal with a crisis scenario; it will then provide information on the impacts of implementing those plans and give trainees the opportunity to revise and remediate them. Since this activity is invariably multi-agency, the training environment must support group-based strategic planning activities, and trainees will occupy specific roles within the crisis scenario. The system will also provide a range of non-playing characters (NPCs) representing domain experts, high-level controllers (e.g. politicians, ministers), low-level controllers (tactical and operational commanders), and missing trainee roles, to ensure a fully populated scenario can be realised in each instantiation.
Within the environment, the emotional and behavioural state of the trainees will be monitored, and interventions, in the form of environmental information controls and mechanisms impacting the stress levels and decision-making capabilities of the trainees, will be used to personalise the training environment. This approach enables a richer and more realistic representation of the crisis scenario to be enacted, leading to better strategic plans and providing trainees with structured feedback on their performance under stress.
Learning Grimaces by Watching TV
Unlike computer vision systems, which require explicit supervision,
humans can learn facial expressions by observing people in their environment.
In this paper, we look at how similar capabilities could be developed in
machine vision. As a starting point, we consider the problem of relating facial
expressions to objectively measurable events occurring in videos. In
particular, we consider a gameshow in which contestants play to win significant
sums of money. We extract events affecting the game and corresponding facial
expressions objectively and automatically from the videos, obtaining large
quantities of labelled data for our study. We also develop, using benchmarks
such as FER and SFEW 2.0, state-of-the-art deep neural networks for facial
expression recognition, showing that pre-training on face verification data can
be highly beneficial for this task. Then, we extend these models to use facial
expressions to predict events in videos and learn nameable expressions from
them. The dataset and emotion recognition models are available at
http://www.robots.ox.ac.uk/~vgg/data/facevalue
Comment: British Machine Vision Conference (BMVC) 201
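The step of relating expressions to events can be illustrated with a toy count-based model of P(event | expression), estimated from labelled (expression, event) pairs. This is purely a sketch: the paper uses deep neural networks over video frames, and the labels and pairs below are invented for the example.

```python
# Toy conditional model: for each expression label, count which game
# events co-occur with it, then predict the most frequent one.
from collections import Counter, defaultdict

def fit(pairs):
    """Count co-occurrences of expression labels and event labels."""
    table = defaultdict(Counter)
    for expression, event in pairs:
        table[expression][event] += 1
    return table

def predict_event(table, expression):
    """Return the event most frequently seen with this expression,
    or None for an unseen expression."""
    if expression not in table:
        return None
    return table[expression].most_common(1)[0][0]

# Invented training pairs, standing in for automatically extracted
# (expression, event) labels from gameshow footage.
pairs = [("smile", "win"), ("smile", "win"), ("smile", "lose"),
         ("grimace", "lose"), ("grimace", "lose")]
model = fit(pairs)
```

The paper's contribution is precisely that both sides of such pairs can be extracted objectively and automatically from video at scale, so no manual emotion annotation is needed.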
Affective feedback: an investigation into the role of emotions in the information seeking process
User feedback is considered a critical element in the information seeking process, especially in relation to relevance assessment. Current feedback techniques determine content relevance with respect to the cognitive and situational levels of interaction between the user and the retrieval system. However, beyond real-life problems and information objects, users interact with intentions, motivations and feelings, which are critical aspects of cognition and decision-making. The study presented in this paper serves as a starting point for exploring the role of emotions in the information seeking process. Results show that emotions not only interweave with different physiological, psychological and cognitive processes, but also form distinctive patterns according to the specific task and the specific user.