Some like it hot - visual guidance for preference prediction
For people, first impressions of someone are of decisive importance; they
are hard to alter through further information. This raises the question of
whether a computer can reach the same judgement. Earlier research has shown
that age, gender, and average attractiveness can be estimated with reasonable
precision. We not only improve the state of the art, but also predict - based on
someone's known preferences - how much that particular person is attracted to a
novel face. Our computational pipeline comprises a face detector, convolutional
neural networks for the extraction of deep features, standard support vector
regression for gender, age and facial beauty, and - as the main novelties -
visual regularized collaborative filtering to infer inter-person preferences as
well as a novel regression technique for handling visual queries without rating
history. We validate the method using a very large dataset from a dating site
as well as images of celebrities. Our experiments yield convincing results,
i.e. we predict 76% of the ratings correctly solely based on an image, and
reveal some sociologically relevant conclusions. We also validate our
collaborative filtering solution on the standard MovieLens rating dataset,
augmented with movie posters, to predict an individual's movie rating. We
demonstrate our algorithms on howhot.io, which went viral around the Internet
with more than 50 million pictures evaluated in the first month.
Comment: accepted for publication at CVPR 201
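One way to read the "visually regularized collaborative filtering" step is as matrix factorization in which each face's latent factors are pulled toward a linear projection of its CNN features, so that a new face without rating history can still be scored from its image alone. The following minimal Python sketch follows that reading; the alternating-least-squares update, variable names, and hyperparameters are assumptions for illustration, not the authors' released implementation.

import numpy as np

def als_item_step(R, U, V, X, W, lam=0.1, gamma=0.05):
    """One ALS update of the item (face) factors with a visual regularizer.

    R : (n_users, n_items) rating matrix, NaN where unrated
    U : (n_users, k) user factors
    V : (n_items, k) item (face) factors, updated in place
    X : (n_items, d) CNN features of each face
    W : (d, k) projection tying item factors to visual features
    gamma : strength of the pull of V[i] toward X[i] @ W
    """
    k = U.shape[1]
    for i in range(V.shape[0]):
        rated = ~np.isnan(R[:, i])               # users who rated face i
        Ui = U[rated]
        A = Ui.T @ Ui + (lam + gamma) * np.eye(k)
        b = Ui.T @ R[rated, i] + gamma * (X[i] @ W)
        V[i] = np.linalg.solve(A, b)             # closed-form ridge-style solve
    return V

# For a face with no rating history, a user's preference can be estimated
# directly from the image: predicted_rating = U[user] @ (x_new @ W)

Under this formulation, the MovieLens-with-posters experiment would use the same machinery, with movies and poster features in place of faces and face features.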
Brain-mediated Transfer Learning of Convolutional Neural Networks
The human brain can effectively learn a new task from a small number of
samples, which indicates that the brain can transfer its prior knowledge to
solve tasks in different domains. This function is analogous to transfer
learning (TL) in the field of machine learning. TL uses a well-trained feature
space in a specific task domain to improve performance in new tasks with
insufficient training data. TL with rich feature representations, such as
features of convolutional neural networks (CNNs), shows high generalization
ability across different task domains. However, such TL still falls short of
giving machine learning generalization ability comparable to that of the human
brain. To examine whether the internal representation of the brain could be
used to achieve more efficient TL, we introduce a method for TL mediated by
human brains. Our method transforms the feature representations of audiovisual
inputs in CNNs into activation patterns of individual brains, via an
association learned in advance from measured brain responses. Then, to estimate
labels reflecting human cognition and behavior induced by the audiovisual
inputs, the transformed representations are used for TL. We demonstrate that
our brain-mediated TL (BTL) shows higher performance in the label estimation
than the standard TL. In addition, we show that the estimations mediated by
different brains vary from brain to brain, and that this variability reflects
individual differences in perception. Thus, our BTL provides a framework to
improve the generalization ability of machine-learning feature representations
and enable machine learning to estimate human-like cognition and behavior,
including individual variability.
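Read at face value, brain-mediated TL is a two-stage regression: first learn an encoding from CNN features to measured brain responses, then train the label estimator on CNN features that have been mapped into that brain space. The sketch below follows that reading and uses ridge regression for both stages; the regression choice and all names are assumptions for illustration, not the authors' exact pipeline.

from sklearn.linear_model import Ridge

def fit_btl(cnn_feats_enc, brain_resp_enc, cnn_feats_task, labels_task):
    """Two-stage brain-mediated transfer learning (BTL) sketch.

    Stage 1 learns the CNN-feature -> brain-response association from
    stimuli with measured responses; stage 2 fits the label estimator on
    task stimuli after mapping their CNN features into the brain space.
    """
    encoder = Ridge(alpha=1.0).fit(cnn_feats_enc, brain_resp_enc)
    brain_like = encoder.predict(cnn_feats_task)
    decoder = Ridge(alpha=1.0).fit(brain_like, labels_task)
    return encoder, decoder

def predict_btl(encoder, decoder, cnn_feats_new):
    """Estimate labels for new audiovisual inputs via the brain space."""
    return decoder.predict(encoder.predict(cnn_feats_new))

Fitting the encoder per subject would make the downstream estimates subject-specific, which is one way the brain-to-brain variability described above could arise.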
Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements
Emotion evoked by an advertisement plays a key role in influencing brand
recall and eventual consumer choices. Automatic ad affect recognition has
several useful applications. However, the use of content-based feature
representations does not give insights into how affect is modulated by aspects
such as the ad scene setting, salient object attributes and their interactions.
Neither do such approaches inform us about how humans prioritize visual
information for ad understanding. Our work addresses these lacunae by
decomposing video content into detected objects, coarse scene structure, object
statistics and actively attended objects identified via eye-gaze. We measure
the importance of each of these information channels by systematically
incorporating related information into ad affect prediction models. Contrary to
the popular notion that ad affect hinges on the narrative and the clever use of
linguistic and social cues, we find that actively attended objects and the
coarse scene structure better encode affective information as compared to
individual scene objects or conspicuous background elements.
Comment: Accepted for publication in the Proceedings of the 20th ACM
International Conference on Multimodal Interaction, Boulder, CO, US
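The channel-importance analysis described above can be approximated by training an affect predictor on each information channel (detected objects, coarse scene structure, object statistics, gaze-attended objects) separately and on their fusion, then comparing held-out scores. A rough sketch, assuming per-clip feature matrices and an SVM classifier (both assumptions rather than the paper's exact models):

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def channel_importance(channels, y, cv=5):
    """channels: dict mapping channel name -> (n_clips, d) feature matrix
    y: (n_clips,) binary affect labels (e.g., high vs. low valence)."""
    scores = {}
    for name, X in channels.items():
        scores[name] = cross_val_score(SVC(kernel="rbf"), X, y, cv=cv).mean()
    # Fuse all channels by concatenation as a reference point.
    fused = np.hstack([channels[name] for name in sorted(channels)])
    scores["all_channels"] = cross_val_score(SVC(kernel="rbf"), fused, y, cv=cv).mean()
    return scores

Comparing the per-channel scores against the fused score indicates which channels (e.g., gaze-attended objects or scene structure) carry most of the affective signal.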
Current Challenges and Visions in Music Recommender Systems Research
Music recommender systems (MRS) have experienced a boom in recent years,
thanks to the emergence and success of online streaming services, which
nowadays make almost all of the world's music available at the user's fingertips.
While today's MRS considerably help users to find interesting music in these
huge catalogs, MRS research is still facing substantial challenges. In
particular, when it comes to building, incorporating, and evaluating
recommendation strategies that integrate information beyond simple user-item
interactions or content-based descriptors and instead dig deep into the very
essence of listener needs, preferences, and intentions, MRS research becomes a
major endeavor and related publications remain quite sparse.
The purpose of this trends and survey article is twofold. We first identify
and shed light on what we believe are the most pressing challenges MRS research
is facing, from both academic and industry perspectives. We review the state of
the art towards solving these challenges and discuss its limitations. Second,
we detail possible future directions and visions we contemplate for the further
evolution of the field. The article should therefore serve two purposes: giving
the interested reader an overview of current challenges in MRS research and
providing guidance for young researchers by identifying interesting, yet
under-researched, directions in the field.