Evaluating Content-centric vs User-centric Ad Affect Recognition
Despite the fact that advertisements (ads) often include strongly emotional
content, very little work has been devoted to affect recognition (AR) from ads.
This work explicitly compares content-centric and user-centric ad AR
methodologies, and evaluates the impact of enhanced AR on computational
advertising via a user study. Specifically, we (1) compile an affective ad
dataset capable of evoking coherent emotions across users; (2) explore the
efficacy of content-centric convolutional neural network (CNN) features for
encoding emotions, and show that CNN features outperform low-level emotion
descriptors; (3) examine user-centered ad AR by analyzing Electroencephalogram
(EEG) responses acquired from eleven viewers, and find that EEG signals encode
emotional information better than content descriptors; (4) investigate the
relationship between objective AR and subjective viewer experience while
watching an ad-embedded online video stream based on a study involving 12
users. To our knowledge, this is the first work to (a) expressly compare user
vs content-centered AR for ads, and (b) study the relationship between modeling
of ad emotions and its impact on a real-life advertising application.
Comment: Accepted at the ACM International Conference on Multimodal Interaction
(ICMI) 201
Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Designing discriminative and powerful texture features that are robust to realistic
imaging conditions is a challenging computer vision problem with many
applications, including material recognition and analysis of satellite or
aerial imagery. In the past, most texture description approaches were based on
dense orderless statistical distribution of local features. However, most
recent approaches to texture recognition and remote sensing scene
classification are based on Convolutional Neural Networks (CNNs). The de facto
practice when learning these CNN models is to use RGB patches as input with
training performed on large amounts of labeled data (ImageNet). In this paper,
we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained
using mapped coded images with explicit texture information provide
complementary information to the standard RGB deep models. Additionally, two
deep architectures, namely early and late fusion, are investigated to combine
the texture and color information. To the best of our knowledge, we are the
first to investigate Binary Patterns encoded CNNs and different deep network
fusion architectures for texture recognition and remote sensing scene
classification. We perform comprehensive experiments on four texture
recognition datasets and four remote sensing scene classification benchmarks:
UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with
7 categories and the recently introduced large scale aerial image dataset (AID)
with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary
information to the standard RGB deep model of the same network architecture. Our
late fusion TEX-Net architecture always improves the overall performance
compared to the standard RGB network on both recognition problems. Our final
combination outperforms the state-of-the-art without employing fine-tuning or
ensemble of RGB network architectures.
Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensin
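The abstract above describes training on "mapped coded images with explicit texture information" but does not spell out the encoding. As a rough illustration only, the sketch below assumes "binary patterns" refers to the classic local binary pattern operator on a 3x3 neighbourhood; the function names `lbp_code` and `lbp_image`, the neighbour ordering, and the `>=` threshold are assumptions, and the mapping of codes into CNN input images used by TEX-Nets is not reproduced here.

```python
def lbp_code(patch):
    """Return the 8-bit local binary pattern code for a 3x3 patch (3 rows of 3 values)."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left corner (an assumed ordering).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(gray):
    """Map a 2-D grayscale image (list of rows) to LBP codes for its interior pixels."""
    height, width = len(gray), len(gray[0])
    coded = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            patch = [gray[y + dy][x - 1:x + 2] for dy in (-1, 0, 1)]
            row.append(lbp_code(patch))
        coded.append(row)
    return coded
```

Each output value depends only on whether neighbours are brighter or darker than the centre pixel, so the coded image carries local texture structure rather than colour, which is one way to read the abstract's claim that the texture stream complements the RGB stream.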
Comparison of Natural Feature Descriptors for Rigid-Object Tracking for Real-Time Augmented Reality
This paper presents a comparison of natural feature descriptors for rigid object tracking for augmented reality (AR) applications. AR relies on object tracking to identify a physical object and to superimpose a virtual object on it. Natural feature tracking (NFT) is one approach to computer vision-based object tracking. NFT utilizes interest points of a physical object, represents them as descriptors, and matches the descriptors against reference descriptors in order to identify a physical object to track. In this research, we investigate four different natural feature descriptors (SIFT, SURF, FREAK, ORB) and their capability to track rigid objects. Rigid objects need robust descriptors since they need to describe the objects in a 3D space. AR applications are also real-time applications; thus, fast feature matching is mandatory. FREAK and ORB are binary descriptors, which promise higher performance in comparison to SIFT and SURF. We deployed a test in which we match feature descriptors to artificial rigid objects. The results indicate that the SIFT descriptor is the most promising solution in our addressed domain, AR-based assembly training.
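The matching step described above (query descriptors compared against reference descriptors) can be sketched for binary descriptors such as FREAK and ORB, which are typically compared with the Hamming distance. This is a minimal brute-force illustration, not the paper's pipeline; the function names, the byte-string descriptor representation, and the `max_distance` threshold are assumptions made for the example.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(query, reference, max_distance):
    """Brute-force nearest-neighbour matching of binary descriptors.

    Returns (query_index, reference_index, distance) for every query
    descriptor whose closest reference lies within max_distance bits.
    """
    matches = []
    for qi, q in enumerate(query):
        best_idx, best_dist = None, None
        for ri, r in enumerate(reference):
            d = hamming(q, r)
            if best_dist is None or d < best_dist:
                best_idx, best_dist = ri, d
        # Reject weak matches (hypothetical threshold, tuned per application).
        if best_dist is not None and best_dist <= max_distance:
            matches.append((qi, best_idx, best_dist))
    return matches
```

Because the distance reduces to an XOR followed by a bit count, comparing binary descriptors is cheap relative to the floating-point distances used for SIFT and SURF vectors, which is the usual motivation for considering them in real-time settings.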