    Viewability prediction for display advertising

    As a massive industry, display advertising delivers advertisers’ marketing messages to attract customers through graphic banners on webpages. Display advertising is also the most essential revenue source of online publishers. Currently, advertisers are charged by user response or by ad serving. However, recent studies show that users rarely click on or convert from display ads. Moreover, about half of the ads are never actually seen by users. As a result, advertisers cannot enhance their brand awareness or increase return on investment, and publishers lose much revenue. Therefore, ad pricing standards are shifting to a new model: ad impressions are paid for only if they are viewable, not merely responded to or served. The Media Rating Council’s standard for a viewable display impression is a minimum of 50% of pixels in view for a minimum of one second. To implement viewable impressions as a pricing currency, ad viewability must be predicted accurately. Ad viewability prediction can improve the performance of guaranteed ad delivery, real-time bidding, and recommender systems. This research is the first to address this important problem of ad viewability prediction. Inspired by the standard definition of viewability, this study proposes to solve the problem from two angles: 1) scrolling behavior and 2) dwell time. In the first phase, ad viewability is predicted by estimating the probability that a user will scroll to the page depth where an ad is located in a specific page view. Two novel probabilistic latent class (PLC) models are proposed: the first computes constant user and page memberships offline, while the second computes dynamic memberships in real time. In the second phase, ad viewability is predicted by estimating the probability that the page depth will be in view for a certain number of seconds. Machine learning models based on Factorization Machines (FM) and a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) are proposed to predict the viewability of any given page depth in a specific page view. The experiments show that the proposed algorithms significantly outperform the comparison systems.
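
    The abstract only names the model families involved; as a rough, hypothetical illustration of the second phase (sequence-based dwell-time prediction), the sketch below sets up a small LSTM over per-second page-view features. The architecture, feature names, and dimensions are assumptions for illustration, not the dissertation's actual implementation.

```python
# Minimal sketch (not the dissertation's exact architecture): an LSTM that
# takes a per-second sequence of page-view features and predicts whether a
# given page depth will be in view long enough to count as viewable.
# All feature names and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class ViewabilityLSTM(nn.Module):
    def __init__(self, n_features=8, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # probability the depth is viewable

    def forward(self, x):                  # x: (batch, seconds, n_features)
        _, (h, _) = self.lstm(x)
        return torch.sigmoid(self.head(h[-1])).squeeze(-1)

# Toy usage: 16 page views, 30 one-second steps, 8 features each
# (e.g. current scroll depth, target ad depth, device type, ...)
model = ViewabilityLSTM()
x = torch.randn(16, 30, 8)
y = torch.randint(0, 2, (16,)).float()     # 1 = ad depth was viewable
loss = nn.BCELoss()(model(x), y)
loss.backward()
```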

    Going Beyond Relevance: Role of effort in Information Retrieval

    The primary focus of Information Retrieval (IR) systems has been to optimize for relevance. Existing approaches to ranking documents or evaluating IR systems do not account for “user effort”. Currently, judges only determine whether the information provided in a given document would satisfy the underlying information need of a query. The current mechanism for obtaining relevance judgments does not account for the time and effort that an end user must put forth to consume a document’s content. While a judge may spend a lot of time assessing a document, an impatient user may not devote the same amount of time and effort to consuming its content. This problem is exacerbated on smaller devices such as mobile phones and tablets, where limited interaction means users may not put much effort into finding information. This thesis characterizes and incorporates effort in Information Retrieval. A comparison of explicit and implicit relevance judgments across several datasets reveals that certain documents are marked relevant by judges but are of low utility to an end user. Experiments indicate that document-level effort features can reliably predict the mismatch between dwell time and judging time of documents. Explicit and preference-based judgments were collected to determine which factors associated with effort agreed the most with user satisfaction. The ability to locate relevant information, or findability, was found to be in highest agreement with preference judgments. Findability judgments were also gathered to study the association of different annotator, query, or document related properties with effort judgments. We also investigate how existing systems can be optimized for both relevance and effort. Finally, we investigate the role of effort on smaller devices with the help of cost-benefit models.
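
    As a hypothetical illustration of the prediction task described above (document-level effort features predicting the dwell-time/judging-time mismatch), the sketch below trains a simple classifier on made-up effort features; the features, labels, and model choice are assumptions, not the thesis's exact setup.

```python
# Illustrative sketch only: a classifier over document-level "effort"
# features (length, readability, layout cues) predicting whether an
# impatient user's dwell time will fall short of the judge's judging time.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(200, 8000, n),     # document length in words
    rng.uniform(5, 20, n),          # average sentence length
    rng.integers(0, 30, n),         # number of images / tables
    rng.uniform(0, 1, n),           # fraction of content above the fold
])
y = rng.integers(0, 2, n)           # 1 = dwell time << judging time (mismatch)

clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```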

    Addressing the new generation of spam (Spam 2.0) through Web usage models

    New Internet collaborative media introduce new ways of communicating that are not immune to abuse. A fake eye-catching profile on a social networking website, a promotional review, a response to a thread in an online forum with unsolicited content, or a manipulated Wiki page are examples of the new generation of spam on the web, referred to as Web 2.0 Spam or Spam 2.0. Spam 2.0 is defined as the propagation of unsolicited, anonymous, mass content to infiltrate legitimate Web 2.0 applications. The current literature does not address Spam 2.0 in depth, and the outcome of efforts to date is inadequate. The aim of this research is to formalise a definition of Spam 2.0 and provide Spam 2.0 filtering solutions. Early detection, extendibility, robustness and adaptability are key factors in the design of the proposed method. This dissertation provides a comprehensive survey of state-of-the-art web spam and Spam 2.0 filtering methods to highlight the unresolved issues and open problems, while at the same time effectively capturing the knowledge in the domain of spam filtering. This dissertation proposes three solutions in the area of Spam 2.0 filtering: (1) characterising and profiling Spam 2.0, (2) an Early-Detection based Spam 2.0 Filtering (EDSF) approach, and (3) an On-the-Fly Spam 2.0 Filtering (OFSF) approach. All the proposed solutions are tested against real-world datasets and their performance is compared with that of existing Spam 2.0 filtering methods. This work has coined the term ‘Spam 2.0’, provided insight into the nature of Spam 2.0, and proposed filtering mechanisms to address this new and rapidly evolving problem.
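
    The abstract describes usage-based filtering only at a high level; the sketch below is a hypothetical illustration of that general idea, classifying a submission session as automated or human from simple navigation-behaviour features. It is not the EDSF or OFSF algorithm, and all feature names and the classifier choice are assumptions.

```python
# A minimal sketch of the general idea behind usage-based Spam 2.0 filtering
# (not the EDSF/OFSF algorithms themselves): classify a submission session as
# automated or human from simple navigation-behaviour features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 1000
X = np.column_stack([
    rng.uniform(0.1, 120, n),    # seconds from page load to form submission
    rng.integers(0, 50, n),      # number of mouse-movement events observed
    rng.integers(1, 20, n),      # pages visited before posting
    rng.integers(0, 2, n),       # whether JavaScript/CSS assets were fetched
])
y = rng.integers(0, 2, n)        # 1 = known spam submission, 0 = legitimate

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict_proba(X[:5])[:, 1])   # spam probability for new sessions
```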

    Estimating attention flow in online video networks

    © 2019 Association for Computing Machinery. Online video has driven a tremendous increase in Internet traffic. Most video hosting sites implement recommender systems, which connect the videos into a directed network and conceptually act as a source of pathways for users to navigate. At present, little is known about how human attention is allocated over such large-scale networks, or about the impact of the recommender systems. In this paper, we first construct the Vevo network, a YouTube video network with 60,740 music videos interconnected by recommendation links, and we collect their associated viewing dynamics. This results in a total of 310 million views per day over a period of 9 weeks. Next, we present large-scale measurements that connect the structure of the recommendation network with the video attention dynamics. We use the bow-tie structure to characterize the Vevo network and find that its core component (23.1% of the videos), which occupies most of the attention (82.6% of the views), consists of videos that are mainly recommended among themselves. This is indicative of the link between video recommendation and the inequality of attention allocation. Finally, we address the task of estimating the attention flow in the video recommendation network. We propose a model that accounts for network effects when predicting video popularity, and we show that it consistently outperforms the baselines. The model also identifies a group of artists who gain attention because of the recommendation network. Altogether, our observations and models provide a new set of tools to better understand the impact of recommender systems on collective social attention.
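
    A minimal sketch of the core measurement mentioned above, assuming the bow-tie core is taken as the largest strongly connected component of the recommendation graph; the graph and view counts below are synthetic stand-ins, not the Vevo data.

```python
# Rough sketch: treat recommendation links as a directed graph, take the
# largest strongly connected component as the bow-tie core, and compute its
# share of videos and of views.
import networkx as nx

G = nx.fast_gnp_random_graph(1000, 0.005, seed=42, directed=True)
views = {v: (v + 1) * 100 for v in G.nodes}    # stand-in daily view counts

core = max(nx.strongly_connected_components(G), key=len)
video_share = len(core) / G.number_of_nodes()
view_share = sum(views[v] for v in core) / sum(views.values())
print(f"core holds {video_share:.1%} of videos and {view_share:.1%} of views")
```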

    An interactive ImageJ plugin for semi-automated image denoising in electron microscopy

    The recent advent of 3D electron microscopy (EM) has allowed for the detection of nanometer-resolution structures. This has caused an explosion in dataset size, necessitating the development of automated workflows. Moreover, large 3D EM datasets typically require hours to days to be acquired, and accelerated imaging typically results in noisy data. Advanced denoising techniques can alleviate this, but they tend to be less accessible to the community due to low-level programming environments, complex parameter tuning, or computational bottlenecks. We present DenoisEM: an interactive and GPU-accelerated denoising plugin for ImageJ that ensures fast parameter tuning and processing through parallel computing. Experimental results show that DenoisEM is an order of magnitude faster than related software and can accelerate data acquisition by a factor of 4 without significantly affecting data quality. Lastly, we show that image denoising benefits visualization and (semi-)automated segmentation and analysis of ultrastructure in various volume EM datasets.
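
    DenoisEM itself is an interactive, GPU-accelerated ImageJ plugin; purely as a stand-in illustration of the kind of denoising step it applies to noisy EM slices, the snippet below runs non-local means denoising with scikit-image on a synthetic noisy image.

```python
# Stand-in illustration only (not the DenoisEM plugin): denoise a simulated
# noisy slice with non-local means and report the residual noise estimate.
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_nl_means, estimate_sigma
from skimage.util import random_noise

slice_clean = img_as_float(data.camera())          # stand-in for an EM slice
slice_noisy = random_noise(slice_clean, var=0.01)  # simulate fast acquisition

sigma = np.mean(estimate_sigma(slice_noisy))
denoised = denoise_nl_means(slice_noisy, h=1.15 * sigma, fast_mode=True,
                            patch_size=5, patch_distance=6)
print("residual noise estimate:", np.mean(estimate_sigma(denoised)))
```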

    Reserve price optimization in display advertising

    Display advertising is the main type of online advertising, and it comes in the form of banner ads and rich media on publishers' websites. Publishers sell ad impressions, where an impression is one display of an ad in a web page. A common way to sell ad impressions is through real-time bidding (RTB). In 2019, advertisers in the United States spent nearly 60 billion U.S. dollars on programmatic digital display advertising; by 2022, expenditures are expected to increase to nearly 95 billion U.S. dollars. The remaining impressions are generally sold directly by the publishers. The only way for publishers to control the price of the impressions they sell through RTB is by setting a reserve price, which has to be beaten by the winning bids. The two main types of RTB auctions are 1) first-price auctions, in which the winning advertiser pays the highest bid, and 2) second-price auctions, in which the winning advertiser pays the maximum of the second-highest bid and the reserve price (the minimum price that a publisher can accept for an impression). In both types of auctions, bids lower than the reserve price are automatically rejected. Since both auction types are influenced by the reserve price, setting a good reserve price is an important but challenging task for publishers. A high reserve price may lead to very few winning bids and thus can decrease revenue substantially. A low reserve price may devalue the impressions and hurt revenue because advertisers do not need to bid high to beat the reserve. A reduction in ad revenue may affect the quality of free content and publishers' business sustainability. Therefore, in an ideal situation, publishers would like to set the reserve price as high as possible while ensuring that there is a winning bid. This dissertation proposes machine learning techniques to determine the optimal reserve prices for individual impressions in real time, with the goal of maximizing publishers' ad revenue. The proposed techniques are practical because they use only data available to publishers. They are also general because they can be applied to most online publishers. The novelty of the research comes from both the problem, which was not studied before, and the proposed techniques, which are adapted to the online publishing domain. For second-price auctions, a survival-analysis-based model is first proposed to predict the failure rates of reserve prices for specific impressions. It uses factorization machines (FM) to capture feature interactions and header bidding information to improve prediction performance. The experiments, using data from a large media company, show that the proposed model for failure rate prediction outperforms the comparative systems. The survival-analysis-based model is then augmented with a deep neural network (DNN) to capture feature interactions, and the experiments show that the DNN-based model further improves performance over the FM-based one. For first-price auctions, a multi-task learning framework is proposed to predict lower bounds of the highest bids with a coverage probability, guaranteeing that for at least a certain percentage of impressions the highest bids exceed the corresponding predicted lower bounds. By setting the final reserve prices to these lower bounds, the model can guarantee a certain percentage of outbid impressions in real-time bidding. The experiments show that the proposed method significantly outperforms the comparison systems.
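
    As a simplified, hypothetical sketch of the first-price idea (not the dissertation's multi-task framework), the snippet below fits a lower-quantile regressor so that roughly a target fraction of impressions still outbid the predicted reserve; the features and bids are synthetic placeholders.

```python
# Sketch: predict a lower quantile of the highest bid per impression and use
# it as the reserve price, so that about `coverage` of impressions still have
# a winning bid above the reserve.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
n = 5000
X = rng.normal(size=(n, 6))                     # impression/user/page features
highest_bid = np.exp(0.5 * X[:, 0] + rng.normal(scale=0.3, size=n))

coverage = 0.9                                  # want ~90% of impressions outbid
model = GradientBoostingRegressor(loss="quantile", alpha=1 - coverage)
model.fit(X[:4000], highest_bid[:4000])

reserve = model.predict(X[4000:])               # reserve = predicted lower bound
print("empirical coverage:", np.mean(highest_bid[4000:] > reserve))
```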

    Towards a Feature-Rich Data Set for Personalized Access to Long-Tail Content

    Personalized data access has become one of the core challenges for intelligent information access, especially for non-mainstream, long-tail content such as that found in digital libraries. One of the main reasons that personalization remains a difficult task is the lack of standardized test corpora. In this paper we provide a comprehensive analysis of feature requirements for personalization, together with a data collection tool for generating user models and collecting data for the personalization of search and the optimization of recommender systems in the long tail. Based on the feature analysis, we provide a feature-rich, publicly available data set covering web content consumption and creation tasks. Our data set contains user models for eight users, including performed tasks, relevant topics for each task, relevance ratings, and relations between focus text and search queries. Altogether, the data set consists of 217 tasks, 4,562 queries, and over 15,000 ratings. On this data we perform automatic query prediction from web page content, achieving an accuracy of 89% using term identity, capitalization, and part-of-speech tags as features. The results of the feature analysis can serve as a guideline for feature collection for long-tail content personalization, and the provided data set as a gold standard for learning and evaluating user models, as well as for optimizing recommender or search engines for long-tail domains.
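
    As a toy illustration of the query-prediction setup reported above, the sketch below classifies page tokens as query terms or not from term identity, capitalization, and part-of-speech features; the tokens, tags, labels, and model are made up for illustration and do not reproduce the 89% result.

```python
# Toy sketch: per-token query-term classification from term identity,
# capitalization and POS-tag features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# (token, POS tag, 1 if the token appeared in the user's query) -- invented
tokens = [("The", "DT", 0), ("Semantic", "JJ", 1), ("Web", "NN", 1),
          ("was", "VBD", 0), ("proposed", "VBN", 0), ("by", "IN", 0),
          ("Berners-Lee", "NNP", 1), ("in", "IN", 0), ("2001", "CD", 0)]

def features(tok, pos):
    return {"term": tok.lower(), "pos": pos,
            "capitalized": tok[:1].isupper(), "len": len(tok)}

X = [features(t, p) for t, p, _ in tokens]
y = [label for _, _, label in tokens]

clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X, y)
print(clf.predict(X))
```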

    Enhancing the influence of pop-up advertisements on advertising effects from the perspective of personalization and placement

    This study examined the influence of personalized pop-up advertising and ad placement on advertising effects. Moreover, the moderating effect of product involvement on the influence of personalized pop-up ads was investigated. A 2 (ad type: personalized vs. non-personalized pop-up ad) × 2 (ad placement: initial webpage vs. middle webpage) experiment was conducted to examine how personalized pop-up advertising affects ad attitude and ad recall, and how it interacts with different degrees of product involvement. Valid experimental data from 296 participants showed that (1) personalized pop-up ads performed better than non-personalized pop-up ads in terms of ad attitude and ad recall; (2) there was no significant difference in ad attitude or ad recall between personalized and non-personalized pop-up ads on the initial versus the middle webpage; however, the influence of personalized pop-up ads on ad attitude, but not on ad recall, was significant for different types of webpage involvement; and (3) contrary to the hypothesis, the personalized ad had a significant effect on ad attitude when individuals had high rather than low product involvement, while there was no significant difference in ad recall in either the low or the high product involvement condition.
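
    Purely as a hypothetical sketch of how a 2 × 2 design like this might be analyzed, the snippet below runs a two-way ANOVA (ad type × placement) on a simulated attitude score; it uses invented data, not the study's.

```python
# Hypothetical re-analysis sketch: two-way ANOVA on an attitude score for a
# 2 (ad type) x 2 (placement) between-subjects design with simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "ad_type": rng.choice(["personalized", "non_personalized"], 296),
    "placement": rng.choice(["initial", "middle"], 296),
    "attitude": rng.normal(4.0, 1.0, 296),     # e.g. mean of 1-7 Likert items
})

model = ols("attitude ~ C(ad_type) * C(placement)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))         # main effects + interaction
```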