
    The power of A/B testing under interference

    In this paper, we address a fundamental statistical question: how can you assess the power of an A/B test when the units in the study are exposed to interference? This question is germane to the many scientific and industrial practitioners who rely on A/B testing in environments where control over interference is limited. We begin by proving that interference has a measurable effect on a test's sensitivity, or power. We quantify the power of an A/B test of equality of means as a function of the number of exposed individuals under any interference mechanism, and we derive a central limit theorem for the number of exposed individuals under a simple Bernoulli switching interference mechanism. Based on these results, we develop a strategy for estimating the power of an A/B test when units experience interference according to an observed network model. We demonstrate how to leverage this theory to estimate the power of an A/B test on units sharing any network relationship, and highlight the utility of our method in two applications: a Facebook friendship network and a large Twitter follower network. These results yield, for the first time, the capacity to design an A/B test that detects, with a specified confidence, a fixed measurable treatment effect when the test is conducted under network-driven interference.
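    To make the power calculation concrete, here is a minimal Python sketch, not the paper's actual method: it assumes a hypothetical, simplified interference mechanism in which a unit's exposure is "clean" only when it and all of its network neighbours receive the same Bernoulli assignment, estimates the expected number of exposed units by Monte Carlo, and plugs those counts into the standard two-sample z-test power formula.

```python
import numpy as np
from scipy import stats

def two_sample_power(n_t, n_c, delta, sigma, alpha=0.05):
    """Power of a two-sided z-test of equal means, given n_t treated
    and n_c control units with clean exposures."""
    se = sigma * np.sqrt(1.0 / n_t + 1.0 / n_c)
    z_crit = stats.norm.ppf(1.0 - alpha / 2.0)
    shift = delta / se
    return stats.norm.cdf(shift - z_crit) + stats.norm.cdf(-shift - z_crit)

def expected_exposed(adj, p=0.5, n_sims=2000, seed=0):
    """Monte-Carlo estimate of the expected number of treated and
    control units whose whole neighbourhood shares their assignment,
    under independent Bernoulli(p) treatment on adjacency matrix adj."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    exposed_t = exposed_c = 0.0
    for _ in range(n_sims):
        z = rng.random(n) < p
        # unit i is cleanly exposed if every neighbour matches z[i]
        clean = np.array([np.all(z[adj[i] > 0] == z[i]) for i in range(n)])
        exposed_t += np.sum(clean & z)
        exposed_c += np.sum(clean & ~z)
    return exposed_t / n_sims, exposed_c / n_sims

# Toy stand-in for an observed social network: a sparse random graph
rng = np.random.default_rng(1)
A = np.triu(rng.random((500, 500)) < 0.01, 1)
A = (A | A.T).astype(int)
n_t, n_c = expected_exposed(A)
print(f"power: {two_sample_power(n_t, n_c, delta=0.2, sigma=1.0):.3f}")
```

    The exposure criterion and network here are placeholders; the paper works with general interference mechanisms and real Facebook and Twitter graphs, but the pipeline shape (exposure counts in, power out) is the same.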

    Continuous Monitoring of A/B Tests without Pain: Optional Stopping in Bayesian Testing

    A/B testing is one of the most successful applications of statistical theory in the modern Internet age. One problem with Null Hypothesis Statistical Testing (NHST), the backbone of A/B testing methodology, is that experimenters are not allowed to continuously monitor results and make decisions in real time. Many see this restriction as at odds with the technology industry's trend toward real-time data analytics. Recently, Bayesian hypothesis testing, which is intuitively better suited to real-time decision making, has attracted growing interest as an alternative to NHST. While corrections of NHST for the continuous-monitoring setting are well established in the literature and known in the A/B testing community, whether continuous monitoring is a proper practice in Bayesian testing remains debated among both academic researchers and practitioners. In this paper, we formally prove the validity of Bayesian testing with continuous monitoring when proper stopping rules are used, and illustrate the theoretical results with concrete simulations. We point out common bad practices in which stopping rules are not proper, and we compare our methodology to NHST corrections. General guidelines for researchers and practitioners are also provided.
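    To make the optional-stopping setting concrete, here is a minimal Python simulation, an illustration rather than the paper's construction: a Beta-Bernoulli A/B test is checked after every batch and stopped once the posterior probability that one arm beats the other crosses a threshold. The arm probabilities, batch size, and threshold are all hypothetical.

```python
import numpy as np

def bayes_ab_monitor(p_a, p_b, max_n=20000, batch=100,
                     threshold=0.95, n_mc=4000, seed=0):
    """Continuously monitored Beta-Bernoulli A/B test (illustrative).

    After each batch of observations per arm, draw from the two Beta
    posteriors and stop once one arm beats the other with posterior
    probability above `threshold`."""
    rng = np.random.default_rng(seed)
    truth = np.array([p_a, p_b])          # hypothetical true rates
    succ = np.zeros(2)
    n = np.zeros(2)
    while n.sum() < max_n:
        for arm in (0, 1):                # collect one batch per arm
            succ[arm] += (rng.random(batch) < truth[arm]).sum()
            n[arm] += batch
        # Posterior draws under flat Beta(1, 1) priors
        post_a = rng.beta(1 + succ[0], 1 + n[0] - succ[0], n_mc)
        post_b = rng.beta(1 + succ[1], 1 + n[1] - succ[1], n_mc)
        prob_b = (post_b > post_a).mean()
        if prob_b > threshold:
            return "B", int(n.sum())
        if prob_b < 1 - threshold:
            return "A", int(n.sum())
    return "no decision", int(n.sum())

print(bayes_ab_monitor(0.10, 0.12))
```

    Whether a stopping rule like this is "proper" in the paper's sense depends on conditions established there; the sketch only shows the mechanics of monitoring a posterior as data accumulate.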

    Beyond A/B Testing: Sequential Randomization for Developing Interventions in Scaled Digital Learning Environments

    Randomized experiments ensure the robust causal inference that is critical to effective learning analytics research and practice. However, traditional randomized experiments, like A/B tests, are limited in large-scale digital learning environments. While traditional experiments can accurately compare two treatment options, they are less able to inform how to adapt interventions to continually meet learners' diverse needs. In this work, we introduce a trial design for developing adaptive interventions in scaled digital learning environments -- the sequential randomized trial (SRT). With the goal of improving the learner experience and developing interventions that benefit all learners at all times, SRTs inform how to sequence, time, and personalize interventions. In this paper, we provide an overview of SRTs and illustrate the advantages they hold over traditional experiments. We describe a novel SRT run in a large-scale data science MOOC. The trial results contextualize how learner engagement can be addressed through inclusive, culturally targeted reminder emails. We also provide practical advice for researchers who aim to run their own SRTs to develop adaptive interventions in scaled digital learning environments.
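    As a rough picture of sequential randomization, the sketch below simulates a hypothetical two-stage SRT in Python: learners are first randomized between two reminder-email variants, and non-engagers are re-randomized to a follow-up intervention. All response rates are invented for illustration; this is not the trial described in the paper.

```python
import numpy as np

def run_srt(n=2000, seed=0):
    """Two-stage sequential randomized trial (hypothetical sketch).

    Stage 1: randomize learners between two reminder-email variants.
    Stage 2: re-randomize learners who did not engage to one of two
    follow-up interventions, so the design informs how to sequence
    and adapt interventions, not just which single variant wins."""
    rng = np.random.default_rng(seed)
    stage1 = rng.integers(0, 2, n)                    # email variant 0 or 1
    engage_p = np.where(stage1 == 1, 0.35, 0.30)      # invented response rates
    engaged = rng.random(n) < engage_p
    stage2 = np.where(engaged, -1, rng.integers(0, 2, n))  # -1: no 2nd stage
    rescue_p = np.where(stage2 == 1, 0.20, 0.10)      # invented rescue rates
    rescued = ~engaged & (rng.random(n) < rescue_p)
    outcome = engaged | rescued
    # Compare the four embedded adaptive strategies (variant, follow-up):
    # a learner is consistent with (v, f) if they got variant v and
    # either engaged immediately or received follow-up f afterwards.
    for v in (0, 1):
        for f in (0, 1):
            mask = (stage1 == v) & (engaged | (stage2 == f))
            print(f"strategy (variant={v}, follow-up={f}): "
                  f"engagement {outcome[mask].mean():.3f}")

run_srt()
```

    The point of the design, as the abstract notes, is that a single trial lets you compare such embedded adaptive strategies rather than only the two first-stage arms.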