3,098 research outputs found

    Multimodal Content Analysis for Effective Advertisements on YouTube

    Full text link
    The rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public. As a key enabling technology, a multitude of recommender systems exists which analyzes user features and browsing patterns to recommend appealing advertisements to users. In this work, we seek to study the characteristics or attributes that characterize an effective advertisement and recommend a useful set of features to aid the designing and production processes of commercial advertisements. We analyze the temporal patterns from multimedia content of advertisement videos including auditory, visual and textual components, and study their individual roles and synergies in the success of an advertisement. The objective of this work is then to measure the effectiveness of an advertisement, and to recommend a useful set of features to advertisement designers to make it more successful and approachable to users. Our proposed framework employs the signal processing technique of cross modality feature learning where data streams from different components are employed to train separate neural network models and are then fused together to learn a shared representation. Subsequently, a neural network model trained on this joint feature embedding representation is utilized as a classifier to predict advertisement effectiveness. We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer opinion metric of the ratio of the Likes and Views received by each advertisement from an online platform.Comment: 11 pages, 5 figures, ICDM 201

    Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

    Full text link
    Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices. Automatic ad affect recognition has several useful applications. However, the use of content-based feature representations does not give insights into how affect is modulated by aspects such as the ad scene setting, salient object attributes and their interactions. Neither do such approaches inform us on how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics and actively attended objects identified via eye-gaze. We measure the importance of each of these information channels by systematically incorporating related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure better encode affective information as compared to individual scene objects or conspicuous background elements.Comment: Accepted for publication in the Proceedings of 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US

    Affect Recognition in Ads with Application to Computational Advertising

    Get PDF
    Advertisements (ads) often include strongly emotional content to leave a lasting impression on the viewer. This work (i) compiles an affective ad dataset capable of evoking coherent emotions across users, as determined from the affective opinions of five experts and 14 annotators; (ii) explores the efficacy of convolutional neural network (CNN) features for encoding emotions, and observes that CNN features outperform low-level audio-visual emotion descriptors upon extensive experimentation; and (iii) demonstrates how enhanced affect prediction facilitates computational advertising, and leads to better viewing experience while watching an online video stream embedded with ads based on a study involving 17 users. We model ad emotions based on subjective human opinions as well as objective multimodal features, and show how effectively modeling ad emotions can positively impact a real-life application.Comment: Accepted at the ACM International Conference on Multimedia (ACM MM) 201

    Short-Video Marketing in E-commerce: Analyzing and Predicting Consumer Response

    Get PDF
    This study analyzes and predicts consumer viewing response to e-commerce short-videos (ESVs). We first construct a large-scale ESV dataset that contains 23,001 ESVs across 40 product categories. The dataset consists of the consumer response label in terms of average viewing durations and human-annotated ESV content attributes. Using the constructed dataset and mixed-effects model, we find that product description, product demonstration, pleasure, and aesthetics are four key determinants of ESV viewing duration. Furthermore, we design a content-based multimodal-multitask framework to predict consumer viewing response to ESVs. We propose the information distillation module to extract the shared, special, and conflicted information from ESV multimodal features. Additionally, we employ a hierarchical multitask classification module to capture feature-level and label-level dependencies. We conduct extensive experiments to evaluate the prediction performance of our proposed framework. Taken together, our paper provides theoretical and methodological contributions to the IS and relevant literature

    Visual and Verbal Means to Attract our Clicks: Multimodality in Youtube Thumbnails

    Get PDF
    Pictures, color, tones, and motions have all been identified as modalities that help to create the meaning-making process. A multimodal message is made up as two or more modes work together to give meaning for the overall discourse. This article is to describe how the visual and verbal signs work together in constructing meaning in video thumbnails. This study is used as a descriptive research method. The data are thumbnails of the most-viewed videos in Close the Door podcast.  They were analyzed by employing Kress& van Leeuwen's Visual Grammar and Halliday’s Functional Grammar of language, especially ideational meaning of clause, transitivity. These are to explain the relationships between the images and the texts and elucidates the functions of images in meaning interpretation. Based on the analysis, all the thumbnails evoke a clear message video; they have no ambiguity and are in line with content of the video. Amongst 4 data, there are four types used; relational, mental, material, verbal with descriptive sentences by three data and one imperative.  In terms of MDA Visual, the videos have Lead, Display and Emblems as the relationships between the images and the functions of images in meaning interpretation

    Inefficiencies in Digital Advertising Markets

    Get PDF
    Digital advertising markets are growing and attracting increased scrutiny. This article explores four market inefficiencies that remain poorly understood: ad effect measurement, frictions between and within advertising channel members, ad blocking, and ad fraud. Although these topics are not unique to digital advertising, each manifests in unique ways in markets for digital ads. The authors identify relevant findings in the academic literature, recent developments in practice, and promising topics for future research
    corecore