682 research outputs found
Addressing Hidden Imperfections in Online Experimentation
Technology companies are increasingly using randomized controlled trials
(RCTs) as part of their development process. Despite having fine control over
engineering systems and data instrumentation, these RCTs can still be
imperfectly executed. In fact, online experimentation suffers from many of the
same biases seen in biomedical RCTs including opt-in and user activity bias,
selection bias, non-compliance with the treatment, and more generally,
challenges in the ability to test the question of interest. These
imperfections can lead to bias in the estimated causal effect, a loss of
statistical power, an attenuation of the effect, or even a need to reframe the
question that can be answered. This paper aims to make practitioners of
experimentation more aware of imperfections in technology-industry RCTs, which
can be hidden throughout the engineering stack or in the design process. Comment: Presented at CODE@MIT 202
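The attenuation that non-compliance can cause is easy to see in a small simulation. The sketch below is illustrative only and is not taken from the paper: the function name `simulate_itt`, the compliance rate, the effect size, and the outcome distribution are all assumptions. It assumes one-sided non-compliance (users assigned to treatment may fail to receive it, and control users never receive it), in which case the intention-to-treat estimate shrinks towards the compliance rate times the effect among compliers.

```python
import random


def simulate_itt(n_users=20000, compliance_rate=0.6, true_effect=2.0, seed=0):
    """Return the intention-to-treat (ITT) estimate for a simulated experiment.

    Hypothetical setup: each user is assigned to treatment with probability
    0.5, but only a fraction `compliance_rate` of assigned users actually
    receives the treatment and gains `true_effect`.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_users):
        baseline = rng.gauss(10.0, 1.0)  # assumed outcome without treatment
        if rng.random() < 0.5:
            complies = rng.random() < compliance_rate
            treated.append(baseline + (true_effect if complies else 0.0))
        else:
            control.append(baseline)
    # Difference in means by *assignment*, not by treatment actually received.
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    print(f"ITT estimate with 60% compliance:  {simulate_itt():.2f}")
    print(f"ITT estimate with full compliance: {simulate_itt(compliance_rate=1.0):.2f}")
```

With these assumed settings the first estimate should land near 0.6 × 2.0 rather than near 2.0, which is the attenuation the abstract refers to.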
Online experimentation in automotive software engineering
Context: Online experimentation has long been the gold standard for evaluating software against the actual needs and preferences of customers. In the Software-as-a-Service domain, various online experimentation techniques have been applied and proven successful. As software is becoming the main differentiator for automotive products, the automotive sector has started to express an interest in adopting online experimentation to strengthen its software development process. Objective: In this research, we aim to systematically address the challenges in adopting online experimentation in the automotive domain. Method: We apply a multidisciplinary approach to this research. To understand the state of practice in online experimentation in the industry, we conduct case studies with three manufacturers. We introduce our experimental design and evaluation methods to real vehicles driven by customers at scale. Moreover, we run experiments to quantitatively evaluate experiment design and causal inference models. Results: Four main research outcomes are presented in this thesis. First, we propose an architecture for continuous online experimentation given the limitations experienced in the automotive domain. Second, after identifying an inherent limitation of sample sizes in the automotive domain, we apply and evaluate an experimentation design method. The method allows us to utilise pre-experimental data for generating balanced groups even when sample sizes are limited. Third, we present an alternative approach to randomised experiments and demonstrate the application of Bayesian causal inference in online software evaluation. With the models, we enable online software evaluation without the need for a fully randomised experiment. Finally, we relate the formal assumption in the Bayesian causal models to the implications in practice, and we demonstrate the inference models with cases from the automotive domain.
Outlook: In our future work, we plan to explore causal structural and graphical models applied in software engineering, and demonstrate the application of causal discovery in machine learning-based autonomous drive software
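One common way to use pre-experimental data for balanced groups with small samples is rerandomization: draw many random splits and keep the one whose groups are closest on a pre-experiment metric. The abstract does not say which design method the thesis applies, so the sketch below is only an assumed stand-in; `balanced_split`, `pre_metric`, and the example values are hypothetical.

```python
import random
from statistics import fmean


def balanced_split(pre_metric, n_draws=1000, seed=0):
    """Return ((group_a, group_b), gap) for the most balanced random split.

    `pre_metric[i]` is a pre-experiment measurement for unit i (needs at
    least two units). Among `n_draws` random half/half splits, keep the one
    with the smallest absolute difference in group means of that measurement.
    """
    rng = random.Random(seed)
    ids = list(range(len(pre_metric)))
    half = len(ids) // 2
    best_gap, best_split = float("inf"), None
    for _ in range(n_draws):
        rng.shuffle(ids)
        a, b = ids[:half], ids[half:]
        gap = abs(fmean([pre_metric[i] for i in a]) - fmean([pre_metric[i] for i in b]))
        if gap < best_gap:
            best_gap, best_split = gap, (sorted(a), sorted(b))
    return best_split, best_gap


if __name__ == "__main__":
    # Made-up pre-experiment values for 12 units (for example, weekly usage hours).
    usage = [3.1, 8.4, 5.0, 2.2, 9.7, 4.4, 6.3, 7.8, 1.9, 5.5, 6.1, 3.8]
    (group_a, group_b), gap = balanced_split(usage)
    print("group A:", group_a)
    print("group B:", group_b)
    print(f"difference in pre-experiment means: {gap:.3f}")
```

Because the accepted split is no longer a single uniform random draw, the later analysis generally needs to account for the selection rule, for example with a randomization test that applies the same rule.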
Adobe Flash as a medium for online experimentation: a test of reaction time measurement capabilities
Adobe Flash can be used to run complex psychological experiments over the Web. We examined the reliability of using Flash to measure reaction times (RTs) using a simple binary-choice task implemented both in Flash and in a Linux-based system known to record RTs with millisecond accuracy. Twenty-four participants were tested in the laboratory using both implementations; they also completed the Flash version on computers of their own choice outside the lab. RTs from the trials run on Flash outside the lab were approximately 20 msec slower than those from trials run on Flash in the lab, which in turn were approximately 10 msec slower than RTs from the trials run on the Linux-based system (baseline condition). RT SDs were similar in all conditions, suggesting that although Flash may overestimate RTs slightly, it does not appear to add significant noise to the data recorded
TSEC: a framework for online experimentation under experimental constraints
Thompson sampling is a popular algorithm for solving multi-armed bandit
problems, and has been applied in a wide range of applications, from website
design to portfolio optimization. In such applications, however, the number of
choices (or arms) can be large, and the data needed to make adaptive
decisions require expensive experimentation. One is then faced with the
constraint of experimenting on only a small subset of arms within
each time period, which poses a problem for traditional Thompson sampling. We
propose a new Thompson Sampling under Experimental Constraints (TSEC) method,
which addresses this so-called "arm budget constraint". TSEC makes use of a
Bayesian interaction model with effect hierarchy priors, to model correlations
between rewards on different arms. This fitted model is then integrated within
Thompson sampling, to jointly identify a good subset of arms for
experimentation and to allocate resources over these arms. We demonstrate the
effectiveness of TSEC in two problems with arm budget constraints. The first is
a simulated website optimization study, where TSEC shows noticeable
improvements over industry benchmarks. The second is a portfolio optimization
application on industry-based exchange-traded funds, where TSEC provides more
consistent and greater wealth accumulation over standard investment strategies
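To make the arm budget constraint concrete, the sketch below runs plain Beta-Bernoulli Thompson sampling in which only the `budget` arms with the highest posterior draws are tried in each period. This is a simple baseline under assumed settings, not the TSEC method: it treats arms as independent and omits the Bayesian interaction model with effect hierarchy priors described in the abstract. The function name, reward rates, and parameters are hypothetical.

```python
import random


def thompson_with_arm_budget(true_rates, budget=2, rounds=200,
                             pulls_per_arm=10, seed=0):
    """Return the number of pulls each arm received.

    Each arm has a Bernoulli reward with (unknown to the algorithm) success
    probability `true_rates[i]`. Per round, one value is sampled from every
    arm's Beta posterior and only the top `budget` arms are experimented on.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    wins = [1] * k    # Beta(1, 1) prior: alpha parameters
    losses = [1] * k  # Beta(1, 1) prior: beta parameters
    for _ in range(rounds):
        draws = [rng.betavariate(wins[i], losses[i]) for i in range(k)]
        chosen = sorted(range(k), key=lambda i: draws[i], reverse=True)[:budget]
        for i in chosen:
            for _ in range(pulls_per_arm):
                if rng.random() < true_rates[i]:
                    wins[i] += 1
                else:
                    losses[i] += 1
    # Subtract the two prior pseudo-counts to recover actual pulls per arm.
    return [wins[i] + losses[i] - 2 for i in range(k)]


if __name__ == "__main__":
    rates = [0.05, 0.10, 0.50, 0.20, 0.15]  # made-up conversion rates
    print("pulls per arm:", thompson_with_arm_budget(rates))
```

Treating arms as independent means nothing learned about one arm transfers to another, which is the gap a model of correlations between arm rewards is meant to close when the number of arms is large.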
Bots as Virtual Confederates: Design and Ethics
The use of bots as virtual confederates in online field experiments holds
extreme promise as a new methodological tool in computational social science.
However, this potential tool comes with inherent ethical challenges. Informed
consent can be difficult to obtain in many cases, and the use of confederates
necessarily implies the use of deception. In this work we outline a design
space for bots as virtual confederates, and we propose a set of guidelines for
meeting the status quo for ethical experimentation. We draw upon examples from
prior work in the CSCW community and the broader social science literature for
illustration. While a handful of prior researchers have used bots in online
experimentation, our work is meant to inspire future work in this area and
raise awareness of the associated ethical issues. Comment: Forthcoming in CSCW 201
Online experimentation and interactive learning resources for teaching network engineering
This paper presents a case study on teaching network engineering in conjunction with interactive learning resources. This case study has been developed in collaboration with the Cisco Networking Academy in the context of the FORGE project, which promotes online learning and experimentation by offering access to virtual and remote labs. The main goal of this work is allowing learners and educators to perform network simulations within a web browser or an interactive eBook by using any type of mobile, tablet or desktop device. Learning Analytics are employed in order to monitor learning behaviour for further analysis of the learning experience offered to students