False Discovery Rate Controlled Heterogeneous Treatment Effect Detection for Online Controlled Experiments
Online controlled experiments (a.k.a. A/B testing) have become the standard
for data-driven decisions about feature changes and product launches
at many Internet companies. However, it is still a great challenge to
systematically measure how every code or feature change impacts millions of
users with great heterogeneity (e.g. countries, ages, devices). The most
commonly used A/B testing framework in many companies is based on Average
Treatment Effect (ATE), which cannot detect the heterogeneity of treatment
effect on users with different characteristics. In this paper, we propose
statistical methods that can systematically and accurately identify the
Heterogeneous Treatment Effect (HTE) of any user cohort of interest (e.g.
mobile device type, country), and determine which factors (e.g. age, gender) of
users contribute to the heterogeneity of the treatment effect in an A/B test.
By applying these methods to both simulation data and real-world
experimentation data, we show that they work robustly with a controlled, low
False Discovery Rate (FDR) and, at the same time, provide useful insights
about the heterogeneity of identified user groups. We have deployed a toolkit
based on these methods, and have used it to measure the Heterogeneous Treatment
Effect of many A/B tests at Snap.
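
To make the idea of FDR-controlled cohort screening concrete, here is a minimal sketch in Python. The abstract does not say which FDR procedure the authors use, so this assumes the Benjamini-Hochberg step-up procedure as one standard choice. It also assumes each cohort has already been reduced to an estimated treatment effect and its standard error, and that a normal approximation is adequate. The function names (two_sided_p, benjamini_hochberg) and the cohort numbers are hypothetical and are not taken from the paper or its toolkit.

```python
import math


def two_sided_p(effect, std_err):
    """Two-sided p-value for H0: effect == 0 under a normal approximation."""
    z = effect / std_err
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k such that the k-th smallest
    p-value is <= (k / m) * alpha, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical per-cohort summaries: (name, estimated effect, std. error).
    cohorts = [
        ("iOS", 0.040, 0.010),
        ("Android", 0.004, 0.010),
        ("age 18-24", 0.030, 0.012),
        ("age 25-34", -0.002, 0.011),
    ]
    p_vals = [two_sided_p(effect, se) for _, effect, se in cohorts]
    flags = benjamini_hochberg(p_vals, alpha=0.05)
    for (name, _, _), p, flag in zip(cohorts, p_vals, flags):
        print(f"{name}: p={p:.4f} flagged={flag}")
```

Testing many cohorts one at a time at a fixed significance level would inflate the number of false discoveries. The step-up rule instead bounds the expected share of flagged cohorts that are false positives at alpha, under the usual independence (or positive dependence) assumptions.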