Online Model Evaluation in a Large-Scale Computational Advertising Platform
Online media provides opportunities for marketers through which they can
deliver effective brand messages to a wide range of audiences. Advertising
technology platforms enable advertisers to reach their target audience by
delivering ad impressions to online users in real time. In order to identify
the best marketing message for a user and to purchase impressions at the right
price, we rely heavily on bid prediction and optimization models. Even though
the bid prediction models are well studied in the literature, the equally
important subject of model evaluation is usually overlooked. Effective and
reliable evaluation of an online bidding model is crucial for making faster
model improvements as well as for utilizing the marketing budgets more
efficiently. In this paper, we present an experimentation framework for bid
prediction models where our focus is on the practical aspects of model
evaluation. Specifically, we outline the unique challenges we encounter in our
platform due to a variety of factors such as heterogeneous goal definitions,
varying budget requirements across different campaigns, high seasonality and
the auction-based environment for inventory purchasing. Then, we introduce
return on investment (ROI) as a unified model performance (i.e., success)
metric and explain its merits over more traditional metrics such as
click-through rate (CTR) or conversion rate (CVR). Most importantly, we discuss
commonly used evaluation and metric summarization approaches in detail and
propose a more accurate method for online evaluation of new experimental models
against the baseline. Our meta-analysis-based approach addresses various
shortcomings of other methods and yields statistically robust conclusions that
allow us to conclude experiments more quickly in a reliable manner. We
demonstrate the effectiveness of our evaluation strategy on real campaign data
through some experiments. Comment: Accepted to ICDM201
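The abstract does not spell out the meta-analysis procedure, so the following is only a minimal sketch of one standard technique, fixed-effect inverse-variance pooling, applied to per-campaign effect estimates. The function name `pooled_effect`, the use of log ROI ratios as the effect measure, and the sample numbers are all illustrative assumptions, not the authors' method or data.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance pooling of per-campaign effect estimates.

    effects   -- one effect estimate per campaign (assumed here: log ROI ratio,
                 experimental model vs. baseline)
    variances -- sampling variance of each estimate
    Returns the pooled estimate and its standard error.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need equally sized, non-empty inputs")
    # Campaigns with more precise estimates (smaller variance) get more weight.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    std_err = math.sqrt(1.0 / total)
    return estimate, std_err


# Hypothetical numbers for three campaigns, for illustration only.
log_roi_ratios = [0.10, 0.30, 0.20]
variances = [0.04, 0.04, 0.02]

estimate, std_err = pooled_effect(log_roi_ratios, variances)
# Approximate 95% confidence interval under a normal approximation.
low, high = estimate - 1.96 * std_err, estimate + 1.96 * std_err
print(f"pooled log ROI ratio: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Pooling on a ratio scale lets campaigns with very different budgets and goal definitions contribute comparable effect sizes. A random-effects model would be the usual alternative when campaign-level effects are expected to differ.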
Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology
The rise of internet-based services and products in the late 1990s brought
about an unprecedented opportunity for online businesses to engage in large
scale data-driven decision making. Over the past two decades, organizations
such as Airbnb, Alibaba, Amazon, Baidu, Booking, Alphabet's Google, LinkedIn,
Lyft, Meta's Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have
invested tremendous resources in online controlled experiments (OCEs) to assess
the impact of innovation on their customers and businesses. Running OCEs at
scale has presented a host of challenges requiring solutions from many domains.
In this paper we review challenges that require new statistical methodologies
to address them. In particular, we discuss the practice and culture of online
experimentation, as well as its statistics literature, placing the current
methodologies within their relevant statistical lineages and providing
illustrative examples of OCE applications. Our goal is to raise academic
statisticians' awareness of these new research opportunities to increase
collaboration between academia and the online industry.
The Use of Clustering Methods in Memory-Based Collaborative Filtering for Ranking-Based Recommendation Systems
This research explores the application of clustering techniques and frequency normalization in collaborative filtering to enhance the performance of ranking-based recommendation systems. Collaborative filtering is a popular approach in recommendation systems that relies on user-item interaction data. In ranking-based recommendation systems, the goal is to provide users with a personalized list of items, sorted by their predicted relevance. In this study, we propose a novel approach that combines clustering and frequency normalization techniques. Clustering groups together users or items that share similar characteristics or features. This method proves beneficial in enhancing recommendation accuracy by uncovering hidden patterns within the data. Additionally, frequency normalization is utilized to mitigate potential biases in user-item interaction data, ensuring fair and unbiased recommendations. The research methodology involves data preprocessing, clustering algorithm selection, frequency normalization techniques, and evaluation metrics. Experimental results demonstrate that the proposed method outperforms traditional collaborative filtering approaches in terms of ranking accuracy and recommendation quality. This approach has the potential to enhance recommendation systems across various domains, including e-commerce, content recommendation, and personalized advertising.
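The abstract names the ingredients (clustering, frequency normalization, ranking) but not the concrete algorithm, so the sketch below is only one plausible reading. It assumes cluster labels are already computed, and it assumes "frequency normalization" means dividing an item's count among same-cluster users by its overall popularity. The function `rank_items` and the toy data are hypothetical.

```python
from collections import Counter


def rank_items(target_user, interactions, clusters, top_n=3):
    """Rank unseen items for a user from the behaviour of same-cluster users.

    interactions -- dict mapping user -> set of items the user interacted with
    clusters     -- dict mapping user -> cluster label (assumed precomputed)
    """
    label = clusters[target_user]
    peers = [u for u, c in clusters.items() if c == label and u != target_user]

    # How often each item occurs overall and among the user's cluster peers.
    global_counts = Counter(i for items in interactions.values() for i in items)
    peer_counts = Counter(i for u in peers for i in interactions[u])

    # Assumed normalization: the cluster-level count divided by the global
    # count, which damps items that are simply popular with everyone.
    seen = interactions[target_user]
    scores = {
        item: count / global_counts[item]
        for item, count in peer_counts.items()
        if item not in seen
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Toy data, for illustration only.
interactions = {
    "u1": {"a", "b"},
    "u2": {"a", "c"},
    "u3": {"a", "c", "d"},
    "u4": {"d", "e"},
    "u5": {"d"},
}
clusters = {"u1": 0, "u2": 0, "u3": 0, "u4": 1, "u5": 1}

recommended = rank_items("u1", interactions, clusters)
print(recommended)
```

In this toy data, item "c" is used only inside the target user's cluster and item "d" is popular across clusters, so the normalization places "c" ahead of "d".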
Engineering for a science-centric experimentation platform
Netflix is an internet entertainment service that routinely employs experimentation to guide strategy around product innovations. As Netflix grew, it had the opportunity to explore increasingly specialized improvements to its service, which generated demand for deeper analyses supported by richer metrics and powered by more diverse statistical methodologies. To facilitate this, and more fully harness the skill sets of both engineering and data science, Netflix engineers created a science-centric experimentation platform that leverages the expertise of scientists from a wide range of backgrounds working on data science tasks by allowing them to make direct code contributions in the languages they use (Python and R). Moreover, the same code that runs in production can be run locally, making it straightforward to explore and graduate both metrics and causal inference methodologies directly into production services. In this paper, we provide two main contributions. Firstly, we report on the architecture of this platform, with a special emphasis on its novel aspects: how it supports science-centric end-to-end workflows without compromising engineering requirements. Secondly, we describe its approach to causal inference, which leverages the potential outcomes conceptual framework to provide a unified abstraction layer for arbitrary statistical models and methodologies.
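As a small illustration of the potential outcomes framing mentioned in this abstract, the sketch below estimates an average treatment effect as a difference in group means, which is a valid estimator when treatment is randomly assigned. It is not the platform's API. The function name `average_treatment_effect` and the sample values are assumptions for illustration.

```python
import math
from statistics import mean, variance


def average_treatment_effect(treated, control):
    """Difference-in-means estimate of the average treatment effect.

    Each unit has two potential outcomes, but only one is observed. Under
    random assignment, the observed group means are unbiased for the mean
    potential outcome under treatment and under control respectively.
    """
    effect = mean(treated) - mean(control)
    # Standard error from the two sample variances (unequal-variance form).
    std_err = math.sqrt(
        variance(treated) / len(treated) + variance(control) / len(control)
    )
    return effect, std_err


# Hypothetical metric values per user, for illustration only.
treated_outcomes = [3.0, 5.0, 4.0]
control_outcomes = [2.0, 3.0, 1.0]

effect, std_err = average_treatment_effect(treated_outcomes, control_outcomes)
print(f"estimated effect: {effect:.2f}, standard error: {std_err:.3f}")
```

Any other model (for example a regression with covariates) can be slotted behind the same interface as long as it returns an estimate of the same contrast between potential outcomes.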