682 research outputs found

    Addressing Hidden Imperfections in Online Experimentation

    Full text link
    Technology companies are increasingly using randomized controlled trials (RCTs) as part of their development process. Despite having fine control over engineering systems and data instrumentation, these RCTs can still be imperfectly executed. In fact, online experimentation suffers from many of the same biases seen in biomedical RCTs including opt-in and user activity bias, selection bias, non-compliance with the treatment, and more generally, challenges in the ability to test the question of interest. The result of these imperfections can lead to a bias in the estimated causal effect, a loss in statistical power, an attenuation of the effect, or even a need to reframe the question that can be answered. This paper aims to make practitioners of experimentation more aware of imperfections in technology-industry RCTs, which can be hidden throughout the engineering stack or in the design process.Comment: Presented at CODE@MIT 202

    Online experimentation in automotive software engineering

    Get PDF
    Context: Online experimentation has long been the gold standard for evaluating software towards the actual needs and preferences of customers. In the Software-as-a-Service domain, various online experimentation techniques are applied and proven successful. As software is becoming the main differentiator for automotive products, the automotive sector has started to express an interest in adopting online experimentation to strengthen their software development process. Objective: In this research, we aim to systematically address the challenges in adopting online experimentation in the automotive domain.Method: We apply a multidisciplinary approach to this research. To understand the state-of-practise in online experimentation in the industry, we conduct case studies with three manufacturers. We introduce our experimental design and evaluation methods to real vehicles driven by customers at scale. Moreover, we run experiments to quantitatively evaluate experiment design and causal inference models. Results: Four main research outcomes are presented in this thesis. First, we propose an architecture for continuous online experimentation given the limitations experienced in the automotive domain. Second, after identifying an inherent limitation of sample sizes in the automotive domain, we apply and evaluate an experimentation design method. The method allows us to utilise pre-experimental data for generating balanced groups even when sample sizes are limited. Third, we present an alternative approach to randomised experiments and demonstrate the application of Bayesian causal inference in online software evaluation. With the models, we enable software online evaluation without the need for a fully randomised experiment. Finally, we relate the formal assumption in the Bayesian causal models to the implications in practise, and we demonstrate the inference models with cases from the automotive domain. Outlook: In our future work, we plan to explore causal structural and graphical models applied in software engineering, and demonstrate the application of causal discovery in machine learning-based autonomous drive software

    Adobe Flash as a medium for online experimentation: a test of reaction time measurement capabilities

    Get PDF
    Adobe Flash can be used to run complex psychological experiments over the Web. We examined the reliability of using Flash to measure reaction times (RTs) using a simple binary-choice task implemented both in Flash and in a Linux-based system known to record RTs with millisecond accuracy. Twenty-four participants were tested in the laboratory using both implementations; they also completed the Flash version on computers of their own choice outside the lab. RTs from the trials run on Flash outside the lab were approximately 20 msec slower than those from trials run on Flash in the lab, which in turn were approximately 10 msec slower than RTs from the trials run on the Linux-based system (baseline condition). RT SDs were similar in all conditions, suggesting that although Flash may overestimate RTs slightly, it does not appear to add significant noise to the data recorded

    TSEC: a framework for online experimentation under experimental constraints

    Full text link
    Thompson sampling is a popular algorithm for solving multi-armed bandit problems, and has been applied in a wide range of applications, from website design to portfolio optimization. In such applications, however, the number of choices (or arms) NN can be large, and the data needed to make adaptive decisions require expensive experimentation. One is then faced with the constraint of experimenting on only a small subset of KNK \ll N arms within each time period, which poses a problem for traditional Thompson sampling. We propose a new Thompson Sampling under Experimental Constraints (TSEC) method, which addresses this so-called "arm budget constraint". TSEC makes use of a Bayesian interaction model with effect hierarchy priors, to model correlations between rewards on different arms. This fitted model is then integrated within Thompson sampling, to jointly identify a good subset of arms for experimentation and to allocate resources over these arms. We demonstrate the effectiveness of TSEC in two problems with arm budget constraints. The first is a simulated website optimization study, where TSEC shows noticeable improvements over industry benchmarks. The second is a portfolio optimization application on industry-based exchange-traded funds, where TSEC provides more consistent and greater wealth accumulation over standard investment strategies

    Bots as Virtual Confederates: Design and Ethics

    Full text link
    The use of bots as virtual confederates in online field experiments holds extreme promise as a new methodological tool in computational social science. However, this potential tool comes with inherent ethical challenges. Informed consent can be difficult to obtain in many cases, and the use of confederates necessarily implies the use of deception. In this work we outline a design space for bots as virtual confederates, and we propose a set of guidelines for meeting the status quo for ethical experimentation. We draw upon examples from prior work in the CSCW community and the broader social science literature for illustration. While a handful of prior researchers have used bots in online experimentation, our work is meant to inspire future work in this area and raise awareness of the associated ethical issues.Comment: Forthcoming in CSCW 201

    Online experimentation and interactive learning resources for teaching network engineering

    Get PDF
    This paper presents a case study on teaching network engineering in conjunction with interactive learning resources. This case study has been developed in collaboration with the Cisco Networking Academy in the context of the FORGE project, which promotes online learning and experimentation by offering access to virtual and remote labs. The main goal of this work is allowing learners and educators to perform network simulations within a web browser or an interactive eBook by using any type of mobile, tablet or desktop device. Learning Analytics are employed in order to monitor learning behaviour for further analysis of the learning experience offered to students
    corecore