Multimodal Content Analysis for Effective Advertisements on YouTube
The rapid advances in e-commerce and Web 2.0 technologies have greatly
increased the impact of commercial advertisements on the general public. As a
key enabling technology, a multitude of recommender systems exist that analyze
user features and browsing patterns to recommend appealing advertisements to
users. In this work, we seek to identify the attributes that characterize an
effective advertisement and to recommend a useful set of features to aid the
design and production of commercial advertisements. We analyze the temporal
patterns in the multimedia content of advertisement videos, including the
auditory, visual and textual components, and study their individual roles and
synergies in an advertisement's success.
The objective of this work is thus to measure the effectiveness of an
advertisement and to recommend a useful set of features that help advertisement
designers make it more successful and appealing to viewers. Our proposed
framework employs the signal-processing technique of cross-modality feature
learning, in which data streams from the different components are used to train
separate neural network models that are then fused to learn a shared
representation. A neural network trained on this joint feature embedding is
subsequently used as a classifier to predict advertisement effectiveness. We
validate our approach using subjective ratings from a dedicated user study, the
sentiment strength of online viewer comments, and a viewer opinion metric given
by the ratio of Likes to Views received by each advertisement on an online
platform. Comment: 11 pages, 5 figures, ICDM 201
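As a rough illustration of the fused-representation approach described above (and not the authors' implementation), the sketch below wires separate per-modality encoders into a shared embedding followed by an effectiveness classifier; the class name, layer sizes, and two-class output are assumptions made for the example.

    import torch
    import torch.nn as nn

    class CrossModalEffectivenessModel(nn.Module):
        """Separate per-modality encoders fused into a joint embedding, then classified."""
        def __init__(self, audio_dim, visual_dim, text_dim, joint_dim=128):
            super().__init__()
            # One encoder per modality; input dimensions are placeholders
            self.audio_enc = nn.Sequential(nn.Linear(audio_dim, 256), nn.ReLU(), nn.Linear(256, joint_dim))
            self.visual_enc = nn.Sequential(nn.Linear(visual_dim, 256), nn.ReLU(), nn.Linear(256, joint_dim))
            self.text_enc = nn.Sequential(nn.Linear(text_dim, 256), nn.ReLU(), nn.Linear(256, joint_dim))
            # Fuse the concatenated modality embeddings into a shared representation
            self.fusion = nn.Sequential(nn.Linear(3 * joint_dim, joint_dim), nn.ReLU())
            # Classifier predicting effective vs. ineffective from the joint embedding
            self.classifier = nn.Linear(joint_dim, 2)

        def forward(self, audio, visual, text):
            parts = [self.audio_enc(audio), self.visual_enc(visual), self.text_enc(text)]
            joint = self.fusion(torch.cat(parts, dim=-1))
            return self.classifier(joint)

In the spirit of cross-modality feature learning, each encoder could first be trained on its own data stream before the classifier is trained on the joint embedding.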
Affect Recognition in Ads with Application to Computational Advertising
Advertisements (ads) often include strongly emotional content to leave a
lasting impression on the viewer. This work (i) compiles an affective ad
dataset capable of evoking coherent emotions across users, as determined from
the affective opinions of five experts and 14 annotators; (ii) explores the
efficacy of convolutional neural network (CNN) features for encoding emotions,
and observes that CNN features outperform low-level audio-visual emotion
descriptors upon extensive experimentation; and (iii) demonstrates how enhanced
affect prediction facilitates computational advertising and, based on a study
involving 17 users, leads to a better viewing experience when watching an
online video stream with embedded ads. We model ad emotions based on subjective
human opinions as well as objective multimodal features, and show how
effectively modeling ad emotions can positively impact a real-life application.
Comment: Accepted at the ACM International Conference on Multimedia (ACM MM)
201
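As a hedged sketch of the general CNN-feature idea (not the paper's pipeline), one could extract frame-level features with a pretrained image CNN and average them into an ad-level descriptor that a downstream classifier maps to emotion labels; the choice of ResNet-18, the preprocessing values, and the helper name ad_descriptor are illustrative assumptions.

    import torch
    from torchvision import models, transforms

    # Pretrained image CNN used as a frame-level feature extractor (ResNet-18 is an assumption)
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()  # drop the classification head, keep the 512-d features
    backbone.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    @torch.no_grad()
    def ad_descriptor(frames):
        """Average pretrained-CNN features over sampled frames (a list of PIL images) of one ad."""
        batch = torch.stack([preprocess(f) for f in frames])
        return backbone(batch).mean(dim=0)  # one 512-d vector per advertisement

Such a descriptor could then be fed to any conventional classifier or regressor trained on the annotated emotion labels.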
Text-based Sentiment Analysis and Music Emotion Recognition
Nowadays, with the expansion of social media, large amounts of user-generated
texts like tweets, blog posts or product reviews are shared online. Sentiment polarity
analysis of such texts has become highly attractive and is utilized in recommender
systems, market predictions, business intelligence and more. Deep learning
techniques have also become top performers on these types of tasks. There are,
however, several problems that need to be solved for the efficient use of deep
neural networks in text mining and text polarity analysis.
First of all, deep neural networks are data-hungry: they need datasets that are
large, cleaned and preprocessed, and properly labeled.
Second, the modern natural language processing concept of word embeddings as a
dense and distributed text feature representation solves sparsity and dimensionality
problems of the traditional bag-of-words model. Still, there are various uncertainties
regarding the use of word vectors: should they be generated from the same dataset
that is used to train the model, or is it better to source them from big and popular
collections that serve as generic text feature representations? Third, it is not easy for
practitioners to find a simple and highly effective deep learning setup for various
document lengths and types. Recurrent neural networks struggle with longer texts,
and optimal convolution-pooling combinations are not easily conceived. It is thus
convenient to have generic neural network architectures that are effective, adapt to
various texts, and encapsulate much of the design complexity.
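To make the word-vector question above concrete, the following minimal sketch contrasts the two options using gensim; the toy corpus, vector size, and the particular pretrained collection are illustrative assumptions.

    import gensim.downloader as api
    from gensim.models import Word2Vec

    corpus = [["great", "movie"], ["boring", "plot"]]  # tokenized task texts (toy example)

    # Option 1: word vectors trained on the (small) task dataset itself
    own_vectors = Word2Vec(sentences=corpus, vector_size=100, min_count=1).wv

    # Option 2: word vectors sourced from a big, popular pretrained collection
    generic_vectors = api.load("glove-wiki-gigaword-100")  # 100-dimensional GloVe vectors

    print(own_vectors["movie"].shape, generic_vectors["movie"].shape)  # both (100,)

The findings discussed below suggest that, for small task datasets, the second option tends to yield better features.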
This thesis addresses the above problems to provide methodological and practical
insights for utilizing neural networks in sentiment analysis of texts and achieving
state-of-the-art results. Regarding the first problem, the effectiveness of various
crowdsourcing alternatives is explored, and two medium-sized, emotion-labeled
song datasets are created using social tags. One of the research interests of Telecom
Italia was the relationship between emotional stimulation by music and driving
style. Consequently, a context-aware music recommender system that aims to
enhance driving comfort and safety was also designed. To address the second
problem, a series of experiments with large text collections of various contents and
domains was conducted. Word embeddings with different parameters were tested,
and the results revealed that their quality is influenced (mostly, but not only) by the
size of the texts they were created from. When working with small text datasets, it is
thus important to source word features from popular and generic word-embedding
collections. Regarding the third problem, a series of experiments involving convolutional
and max-pooling neural layers was conducted. Various patterns relating
text properties and network parameters to optimal classification accuracy were
observed. Combining convolutions of words, bigrams, and trigrams with regional
max-pooling layers in a couple of stacks produced the best results. The derived
architecture achieves competitive performance on sentiment polarity analysis of
movie, business and product reviews.
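As an approximate illustration of such an architecture (and not the thesis code), the sketch below combines word, bigram, and trigram convolutions over an embedding layer that can optionally be initialized from a pretrained collection; for brevity it uses a single stack with global max-pooling rather than the regional, stacked pooling described above, and all names and sizes are assumptions.

    import torch
    import torch.nn as nn

    class TextCNN(nn.Module):
        """Parallel word, bigram, and trigram convolutions followed by max-pooling."""
        def __init__(self, vocab_size, embed_dim=300, num_filters=100, num_classes=2,
                     pretrained_vectors=None):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            if pretrained_vectors is not None:
                # Initialize from a generic pretrained collection (e.g. GloVe),
                # which the experiments above favour for small datasets
                self.embedding.weight.data.copy_(pretrained_vectors)
            # Filter widths 1, 2 and 3 correspond to word, bigram and trigram convolutions
            self.convs = nn.ModuleList(
                nn.Conv1d(embed_dim, num_filters, kernel_size=k) for k in (1, 2, 3))
            self.fc = nn.Linear(3 * num_filters, num_classes)

        def forward(self, token_ids):                      # (batch, seq_len)
            x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
            pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
            return self.fc(torch.cat(pooled, dim=1))

A sentiment polarity classifier of this form can then be trained with a standard cross-entropy loss on review datasets such as those mentioned above.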
Given that labeled data are becoming the bottleneck of current deep learning
systems, a future research direction could be the exploration of various data
programming possibilities for constructing even bigger labeled datasets. Investigation
of feature-level or decision-level ensemble techniques in the context of deep neural
networks could also be fruitful. Different feature types usually represent complementary
characteristics of the data. Combining word-embedding and traditional text
features, or utilizing recurrent networks on document splits and then aggregating the
predictions, could further increase the prediction accuracy of such models.
Investigating the Role of Emotion Perception in the Adaptive Functioning of Individuals on the Autism Spectrum
Cognitive functioning has historically been used to predict adaptive outcomes of individuals with autism spectrum disorders (ASDs); however, research shows that it does not adequately predict these outcomes. Therefore, the current study explored the role of emotion perception in the adaptive functioning of individuals with ASDs. Emotion perception was assessed using the DANVA-2, which uses audio and static face stimuli, and the DAVE, which uses dynamic, audio-visual emotion movies. Adaptive functioning was assessed using the Vineland-II Socialization, Communication, and Daily Living domains. Results indicated that individuals with ASDs demonstrated significant impairments in both adaptive functioning and emotion perception compared to typical individuals. Findings did not demonstrate a relationship between emotion perception and adaptive functioning after controlling for IQ. Future research on possible mechanisms of change for adaptive outcomes should broaden its approach to include exploration of social perception more broadly, of which emotion perception is one component, and its relationship with adaptive outcomes.
Spontaneous blink rate as an index of attention and emotion during film clips viewing
Spontaneous blinking is a non-invasive indicator known to reflect dopaminergic influence over the frontal cortex and attention allocation in perceptual tasks. Thirty-eight participants watched eighteen short film clips (2 min each), designed to elicit specific affective states and arranged in six different emotional categories, while their eye movements were recorded from the vertical electrooculogram. The largest blink rate inhibition, reflecting greater attention allocation to the movie, was observed during the presentation of Erotic clips, excerpts on wilderness depicting beautiful landscapes (Scenery), as well as clips showing crying characters (Compassion). In contrast, the minimum blink rate inhibition was found for Fear clips, which induced a defensive response with stimulus rejection. Blink rate across time showed that Compassion clips elicited early inhibition while Sadness clips induced a slower, later inhibition. Correlation analyses also revealed a negative correlation (r < -0.40) between the total blink rate recorded during Erotic and Compassion clips and self-reported interest. Overall, the main variable explaining blink rate was emotional Valence. Results suggest that blink modulation is related to the motivational relevance and biological significance of the stimuli, tracking their differential recruitment of attentional resources. Furthermore, they provide a solid background for studying emotion-attention patterns and their deficits also in clinical samples (e.g., neurological and psychiatric patients), using spontaneous blinking as a non-interfering psychophysiological measure.