The power of A/B testing under interference
In this paper, we address the fundamental statistical question: how can you
assess the power of an A/B test when the units in the study are exposed to
interference? This question is germane to many scientific and industrial
practitioners that rely on A/B testing in environments where control over
interference is limited. We begin by proving that interference has a measurable
effect on the test's sensitivity, or power. We quantify the power of an A/B test of
equality of means as a function of the number of exposed individuals under any
interference mechanism. We further derive a central limit theorem for the
number of exposed individuals under a simple Bernoulli switching interference
mechanism. Based on these results, we develop a strategy to estimate the power
of an A/B test when actors experience interference according to an observed
network model. We demonstrate how to leverage this theory to estimate the power
of an A/B test on units sharing any network relationship, and highlight the
utility of our method on two applications - a Facebook friendship network as
well as a large Twitter follower network. These results yield, for the first
time, the capacity to understand how to design an A/B test to detect, with a
specified confidence, a fixed measurable treatment effect when the A/B test is
conducted under interference driven by networks.
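To make the abstract's central quantity concrete, here is a standard normal-approximation power function for a two-sided test of equality of means, expressed as a function of the number of exposed units per arm. This is the generic textbook calculation, not the paper's interference-adjusted estimator; the function name and parameters are illustrative.

```python
import math
from statistics import NormalDist  # Python 3.8+ standard library

_STD = NormalDist()  # standard normal, for CDF and quantiles

def power_equal_means(n_per_arm, effect_size, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test of equality
    of means with n_per_arm exposed units in each arm.  effect_size
    is the standardized mean difference (delta / sigma)."""
    z_crit = _STD.inv_cdf(1 - alpha / 2)          # two-sided critical value
    ncp = effect_size * math.sqrt(n_per_arm / 2)  # noncentrality parameter
    # Probability of rejecting under the alternative (normal approximation).
    return _STD.cdf(ncp - z_crit) + _STD.cdf(-ncp - z_crit)
```

Under interference, the effective number of exposed units differs from the nominal sample size, which is why the paper's results on the distribution of exposed individuals feed directly into a formula of this shape.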
Continuous Monitoring of A/B Tests without Pain: Optional Stopping in Bayesian Testing
A/B testing is one of the most successful applications of statistical theory
in the modern Internet age. One problem with Null Hypothesis Statistical
Testing (NHST), the backbone of A/B testing methodology, is that experimenters
are not allowed to continuously monitor results and make decisions in real time. Many
people see this restriction as a setback against the trend in the technology
toward real time data analytics. Recently, Bayesian Hypothesis Testing, which
intuitively is more suitable for real time decision making, attracted growing
interest as an alternative to NHST. While corrections of NHST for the
continuous-monitoring setting are well established in the literature and
known to the A/B testing community, whether continuous monitoring is a
proper practice in Bayesian testing remains debated among both academic
researchers and general practitioners. In this paper, we formally
prove the validity of Bayesian testing with continuous monitoring when proper
stopping rules are used, and illustrate the theoretical results with concrete
simulations. We point out common bad practices where stopping rules are not
proper, and compare our methodology to NHST corrections. General guidelines
for researchers and practitioners are also provided.
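A minimal simulation of the idea the abstract describes: monitor Beta-Binomial posteriors as data arrive and stop once the posterior probability that either arm is better crosses a threshold. This is a generic illustration of a posterior-based stopping rule, not the paper's formal construction; all names, priors, and parameters here are our own choices.

```python
import random

def prob_b_beats_a(a1, b1, a2, b2, draws=1000):
    """Monte Carlo estimate of P(theta_B > theta_A) for independent
    Beta(a1, b1) and Beta(a2, b2) posteriors."""
    rng = random.Random(1)  # fixed seed: deterministic estimate
    wins = sum(rng.betavariate(a2, b2) > rng.betavariate(a1, b1)
               for _ in range(draws))
    return wins / draws

def monitor_ab(p_a, p_b, threshold=0.95, max_n=2000, check_every=50, seed=0):
    """Simulate two Bernoulli arms with true rates p_a, p_b under
    Beta(1, 1) priors.  Every `check_every` observations per arm,
    stop if the posterior probability that either arm is better
    exceeds `threshold`.  Returns (sample size per arm at stopping,
    posterior P(B beats A))."""
    rng = random.Random(seed)
    sa = sb = 0  # successes in each arm
    for n in range(1, max_n + 1):
        sa += rng.random() < p_a
        sb += rng.random() < p_b
        if n % check_every == 0:
            pr = prob_b_beats_a(sa + 1, n - sa + 1, sb + 1, n - sb + 1)
            if pr > threshold or pr < 1 - threshold:
                return n, pr
    return max_n, prob_b_beats_a(sa + 1, max_n - sa + 1,
                                 sb + 1, max_n - sb + 1)
```

Stopping on a posterior quantity like this is the kind of "proper" rule at issue; the bad practice the paper warns against is stopping on rules that depend on the data in ways the posterior does not account for.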
Beyond A/B Testing: Sequential Randomization for Developing Interventions in Scaled Digital Learning Environments
Randomized experiments ensure robust causal inference that is critical to
effective learning analytics research and practice. However, traditional
randomized experiments, like A/B tests, are limited in large-scale digital
learning environments. While traditional experiments can accurately compare two
treatment options, they are less able to inform how to adapt interventions to
continually meet learners' diverse needs. In this work, we introduce a trial
design for developing adaptive interventions in scaled digital learning
environments -- the sequential randomized trial (SRT). With the goal of
improving learner experience and developing interventions that benefit all
learners at all times, SRTs inform how to sequence, time, and personalize
interventions. In this paper, we provide an overview of SRTs, and we illustrate
the advantages they hold compared to traditional experiments. We describe a
novel SRT run in a large-scale data science MOOC. The trial results
contextualize how learner engagement can be addressed through inclusive
culturally targeted reminder emails. We also provide practical advice for
researchers who aim to run their own SRTs to develop adaptive interventions in
scaled digital learning environments.
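The core mechanic distinguishing an SRT from an A/B test, re-randomizing each learner at every decision point rather than once up front, can be sketched as follows. The two-variant email setup and all names are illustrative, not the MOOC trial's actual design.

```python
import random

def sequential_randomize(learners, variants, n_periods, seed=0):
    """Assign each learner an independently re-randomized variant at
    every period.  Unlike a one-shot A/B assignment, this produces
    data on sequences of interventions, which is what lets an SRT
    inform how to sequence, time, and personalize them."""
    rng = random.Random(seed)
    return {learner: [rng.choice(variants) for _ in range(n_periods)]
            for learner in learners}

# Example: three learners re-randomized weekly over four weeks
# between two hypothetical reminder-email styles.
plan = sequential_randomize(["u1", "u2", "u3"],
                            ["plain_email", "cultural_email"], 4)
```

Comparing outcomes across the resulting sequences (rather than across two fixed arms) is what supports adaptive interventions of the kind the abstract describes.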
- …