33 research outputs found
On Experimentation in Software-Intensive Systems
Context: Delivering software that has value to customers is a primary concern of every software company. Prevalent in web-facing companies, controlled experiments are used to validate and deliver value in incremental deployments. At the same time that web-facing companies are aiming to automate and reduce the cost of each experiment iteration, embedded systems companies are starting to adopt experimentation practices and to leverage the automation developments made in the online domain. Objective: This thesis has two main objectives. The first objective is to analyze how software companies can run and optimize their systems through automated experiments. This objective is investigated from the perspectives of the software architecture, the algorithms for the experiment execution, and the experimentation process. The second objective is to analyze how non-web-facing companies can adopt experimentation as part of their development process to validate and deliver value to their customers continuously. This objective is investigated from the perspective of the software development process and focuses on the experimentation aspects that are distinct from those of web-facing companies. Method: To achieve these objectives, we conducted research in close collaboration with industry and used a combination of different empirical research methods: case studies, literature reviews, simulations, and empirical evaluations. Results: This thesis provides six main results. First, it proposes an architecture framework for automated experimentation that can be used with different types of experimental designs in both embedded systems and web-facing systems. Second, it proposes a new experimentation process that captures the details of a trustworthy experimentation process and can be used as the basis for an automated experimentation process. Third, it identifies the restrictions and pitfalls of different multi-armed bandit algorithms for automating experiments in industry.
This thesis also proposes a set of guidelines to help practitioners select a technique that minimizes the occurrence of these pitfalls. Fourth, it proposes statistical models to analyze optimization algorithms that can be used in automated experimentation. Fifth, it identifies the key challenges faced by embedded systems companies when adopting controlled experimentation and proposes a set of strategies to address these challenges. Sixth, it identifies experimentation techniques and proposes a new continuous experimentation model for mission-critical and business-to-business systems. Conclusion: The results presented in this thesis indicate that the trustworthiness of the experimentation process and the selection of algorithms still need to be addressed before automated experimentation can be used at scale in industry. The embedded systems industry faces challenges in adopting experimentation as part of its development process. In part, this is due to the low number of users and devices that can be used in experiments and the diversity of the experimental designs required for each new situation. This limitation increases both the complexity of the experimentation process and the number of techniques needed to address this constraint.
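As a minimal illustration of the multi-armed bandit techniques the thesis evaluates, the sketch below runs Thompson sampling on a two-variant experiment with binary rewards. The variant count, conversion rates, and iteration budget are hypothetical and not taken from the thesis; they only show the mechanism by which a bandit shifts traffic towards the better-performing variant.

```python
import random

random.seed(42)

# Hypothetical true conversion rates, unknown to the algorithm.
true_rates = [0.05, 0.08]
successes = [0, 0]
failures = [0, 0]

for _ in range(5000):
    # Sample a plausible rate for each variant from its Beta posterior
    # (uniform Beta(1, 1) prior).
    draws = [random.betavariate(1 + s, 1 + f)
             for s, f in zip(successes, failures)]
    # Play the variant whose sampled rate is highest.
    arm = draws.index(max(draws))
    # Simulate a binary reward and update that variant's counts.
    if random.random() < true_rates[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

pulls = [successes[i] + failures[i] for i in range(2)]
```

Over the run, the posterior for the stronger variant concentrates and receives the bulk of the traffic; the pitfalls the thesis identifies (e.g., biased estimates from adaptive allocation) are exactly what such a sketch glosses over.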
Towards Automated Experiments in Software Intensive Systems
Context: Delivering software that has value to customers is a primary concern of every software company. One of the techniques to continuously validate and deliver value in online software systems is the use of controlled experiments. The time cost of each experiment iteration, the growth of the development organization running experiments, and the need for a more automated and systematic approach are leading companies to look for different techniques to automate the experimentation process. Objective: The overall objective of this thesis is to analyze how to automate different types of experiments and how companies can support and optimize their systems through automated experiments. This thesis explores the topic of automated online experiments from the perspectives of the software architecture, the algorithms for the experiment execution, and the experimentation process, and focuses on two main application domains: the online and the embedded systems domain. Method: To achieve the objective, we conducted this research in close collaboration with industry, using a combination of different empirical research methods: case studies, literature reviews, simulations, and empirical evaluations. Results and conclusions: This thesis provides five main results. First, we propose an architecture framework for automated experimentation that can be used with different types of experimental designs in both embedded systems and web-facing systems. Second, we identify the key challenges faced by embedded systems companies when adopting controlled experimentation and propose a set of strategies to address these challenges. Third, we develop a new algorithm for online experiments. Fourth, we identify restrictions and pitfalls of different algorithms for automating experiments in industry and propose a set of guidelines to help practitioners select a technique that minimizes the occurrence of these pitfalls.
Fifth, we propose a new experimentation process that captures the details of a trustworthy experimentation process and can be used as the basis for an automated experimentation process. Future work: In future work, we plan to investigate how embedded systems can incorporate experiments in their development process without compromising existing real-time and safety requirements. We also plan to analyze the impact and costs of automating the different parts of the experimentation process.
Bayesian paired comparison with the bpcs package
This article introduces the bpcs R package (Bayesian Paired Comparison in Stan) and the statistical models implemented in the package. The package aims to facilitate the use of Bayesian models for paired comparison data in behavioral research. Bayesian analysis of paired comparison data allows parameter estimation even in conditions where the maximum likelihood estimate does not exist, allows easy extension of paired comparison models, provides straightforward interpretation of the results with credible intervals, has better control of type I error, provides more robust evidence towards the null hypothesis, allows propagation of uncertainties, incorporates prior information, and performs well when handling models with many parameters and latent variables. The bpcs package provides a consistent interface for R users and several functions to evaluate the posterior distribution of all parameters, to estimate the posterior distribution of any contest between items, and to obtain the posterior distribution of the ranks. Three reanalyses of recent studies that used the frequentist Bradley–Terry model are presented. These reanalyses are conducted with the Bayesian models of the bpcs package, and all the code used to fit the models and to generate the figures and tables is available in the online appendix.
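The paired-comparison model underlying the package is the Bradley–Terry model, in which each item i has a latent ability λ_i and P(i beats j) = exp(λ_i) / (exp(λ_i) + exp(λ_j)). The sketch below illustrates just that formula in Python; the bpcs package itself is R and Stan, and the item names and ability values here are hypothetical, not the package's API.

```python
import math

def p_beats(lambda_i, lambda_j):
    """Bradley-Terry model: probability that item i beats item j,
    given latent ability parameters lambda_i and lambda_j."""
    return math.exp(lambda_i) / (math.exp(lambda_i) + math.exp(lambda_j))

# Hypothetical ability estimates for three items.
abilities = {"A": 0.9, "B": 0.2, "C": -0.4}

# Rank items from strongest to weakest by estimated ability.
ranking = sorted(abilities, key=abilities.get, reverse=True)
```

In the Bayesian setting the package addresses, each λ_i has a posterior distribution, so quantities like P(A beats B) and the ranking become distributions rather than the point values computed here.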
Statistical Models for the Analysis of Optimization Algorithms with Benchmark Functions
Frequentist statistical methods, such as hypothesis testing, are standard practice in papers that provide benchmark comparisons. Unfortunately, these methods have often been misused, e.g., without verifying the assumptions of the statistical tests or without controlling for family-wise errors in multiple group comparisons, among several other problems. Bayesian Data Analysis (BDA) addresses many of these shortcomings, but its use is not yet widespread in the analysis of empirical data in the evolutionary computing community. This paper provides three main contributions. First, we motivate the need for Bayesian data analysis and provide an overview of the topic. Second, we discuss the practical aspects of BDA needed to ensure that our models are valid and the results transparent. Finally, we provide five statistical models that can be used to answer multiple research questions. The online appendix provides a step-by-step guide on how to perform the analysis with the models discussed in this paper, including the code for the statistical models, the data transformations, and the discussed tables and figures.
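As a toy illustration of the Bayesian approach advocated here (the success counts below are made up, and the paper's own models are richer hierarchical models fitted in Stan): given the number of benchmark runs in which each optimizer reached a target value, a Beta-Binomial posterior yields the probability that one algorithm's success rate exceeds the other's, directly answering the comparison question without a p-value.

```python
import random

random.seed(0)

# Hypothetical benchmark outcomes: runs in which each optimizer hit the target.
succ_a, runs_a = 18, 20
succ_b, runs_b = 11, 20

def posterior_samples(successes, runs, n_samples=10000):
    """Draw from the Beta posterior of the success rate,
    using a uniform Beta(1, 1) prior."""
    return [random.betavariate(1 + successes, 1 + runs - successes)
            for _ in range(n_samples)]

samples_a = posterior_samples(succ_a, runs_a)
samples_b = posterior_samples(succ_b, runs_b)

# Posterior probability that algorithm A's success rate exceeds B's.
p_a_better = sum(a > b for a, b in zip(samples_a, samples_b)) / len(samples_a)
```

The output is a direct probability statement about the comparison, and the same machinery extends to the multi-group settings where frequentist family-wise error corrections become delicate.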
From Ad-Hoc Data Analytics to DataOps
The collection of high-quality data provides a key competitive advantage to companies in their decision-making process. It helps to understand customer behavior and enables the usage and deployment of new technologies based on machine learning. However, the process from collecting the data to cleaning and processing it for use by data scientists and applications is often manual, non-optimized, and error-prone. This increases the time that the data takes to deliver value for the business. To reduce this time, companies are looking into automation and validation of the data processes. Data processes are the operational side of the data analytics workflow. DataOps, a term recently coined by data scientists, data analysts, and data engineers, refers to a general process aimed at shortening the end-to-end data analytics life-cycle by introducing automation into the data collection, validation, and verification process. Despite its increasing popularity among practitioners, research on this topic has been limited and does not provide a clear definition of the term or of how a data analytics process evolves from ad-hoc data collection to the fully automated data analytics envisioned by DataOps. This research provides three main contributions. First, utilizing multi-vocal literature, we provide a definition and a scope for the general process referred to as DataOps. Second, based on a case study with a large mobile telecommunication organization, we analyze how multiple data analytics teams evolve their infrastructure and processes towards DataOps. We also provide a stairway showing the different stages of the evolution process. With this evolution model, companies can identify the stage to which they belong and can work towards the next stage by overcoming the challenges they encounter in their current stage.
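The automated validation step that DataOps introduces can be sketched as a simple batch gate (illustrative only; the field names and thresholds are hypothetical, and real pipelines would typically rely on a dedicated data-quality tool rather than hand-rolled checks):

```python
def validate_batch(records, required_fields, max_null_rate=0.05):
    """Accept a batch of records only if, for every required field,
    the fraction of missing/null values stays below a threshold."""
    if not records:
        return False, "empty batch"
    for field in required_fields:
        missing = sum(1 for r in records if r.get(field) is None)
        if missing / len(records) > max_null_rate:
            return False, f"too many nulls in '{field}'"
    return True, "ok"

# Hypothetical event records arriving from data collection.
batch = [{"user_id": 1, "event": "play"},
         {"user_id": 2, "event": "pause"},
         {"user_id": 3, "event": None}]
ok, reason = validate_batch(batch, ["user_id", "event"], max_null_rate=0.5)
```

Running such a gate automatically on every incoming batch, instead of cleaning data by hand, is the kind of step that moves a team up the stairway described in the paper.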
Engineering for a Science-Centric Experimentation Platform
Netflix is an internet entertainment service that routinely employs experimentation to guide strategy around product innovations. As Netflix grew, it had the opportunity to explore increasingly specialized improvements to its service, which generated demand for deeper analyses supported by richer metrics and powered by more diverse statistical methodologies. To facilitate this, and to more fully harness the skill sets of both engineering and data science, Netflix engineers created a science-centric experimentation platform that leverages the expertise of data scientists from a wide range of backgrounds by allowing them to make direct code contributions in the languages used by scientists (Python and R). Moreover, the same code that runs in production can also be run locally, making it straightforward to explore and graduate both metrics and causal inference methodologies directly into production services. In this paper, we utilize a case-study research method to provide two main contributions. Firstly, we report on the architecture of this platform, with a special emphasis on its novel aspects: how it supports science-centric end-to-end workflows without compromising engineering requirements. Secondly, we describe its approach to causal inference, which leverages the potential outcomes conceptual framework to provide a unified abstraction layer for arbitrary statistical models and methodologies.
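Under the potential outcomes framework the abstract refers to, the simplest estimand in a randomized experiment is the average treatment effect (ATE), estimated by the difference in mean outcomes between the treatment and control groups. The sketch below uses simulated data and plain difference-in-means with a normal-approximation interval; it is a generic illustration of the framework, not code from Netflix's platform.

```python
import math
import random

random.seed(1)

# Simulated randomized experiment: treatment shifts the outcome by +0.3.
control = [random.gauss(1.0, 0.5) for _ in range(500)]
treatment = [random.gauss(1.3, 0.5) for _ in range(500)]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Difference-in-means estimate of the average treatment effect.
ate = mean(treatment) - mean(control)

# Normal-approximation 95% confidence interval for the ATE.
se = math.sqrt(sample_var(treatment) / len(treatment)
               + sample_var(control) / len(control))
ci = (ate - 1.96 * se, ate + 1.96 * se)
```

The appeal of the potential outcomes abstraction is that this estimand stays fixed while the estimator is swapped out, so richer models plug into the same interface as the difference-in-means above.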
Engineering for a science-centric experimentation platform
Netflix is an internet entertainment service that routinely employs experimentation to guide strategy around product innovations. As Netflix grew, it had the opportunity to explore increasingly specialized improvements to its service, which generated demand for deeper analyses supported by richer metrics and powered by more diverse statistical methodologies. To facilitate this, and to more fully harness the skill sets of both engineering and data science, Netflix engineers created a science-centric experimentation platform that leverages the expertise of data scientists from a wide range of backgrounds by allowing them to make direct code contributions in the languages they use (Python and R). Moreover, the same code that runs in production can also be run locally, making it straightforward to explore and graduate both metrics and causal inference methodologies directly into production services. In this paper, we provide two main contributions. Firstly, we report on the architecture of this platform, with a special emphasis on its novel aspects: how it supports science-centric end-to-end workflows without compromising engineering requirements. Secondly, we describe its approach to causal inference, which leverages the potential outcomes conceptual framework to provide a unified abstraction layer for arbitrary statistical models and methodologies.
Cookie Experiment 2020 (DAT246/DIT278)
This is an example of an experiment to find the best cookie recipe at the fictional company AAA.
Success Factors when Transitioning to Continuous Deployment in Software-Intensive Embedded Systems
Continuous Deployment is the practice of deploying software to customers more frequently and learning from their usage. The aim is to introduce new functionality and features to customers in an additive way, as soon as possible. While Continuous Deployment is becoming popular among web and cloud-based software development organizations, its adoption within the software-intensive embedded systems industry is still limited. In this paper, we conducted a case study at a multinational telecommunications company focusing on the Third Generation Radio Access Network (3G RAN) embedded software. The organization has transitioned to Continuous Deployment, where the software's deployment cycle has been reduced from 24 weeks to 4 weeks. The objective of this paper is to identify what success means when transitioning to Continuous Deployment and the success factors that companies need to attend to during such a transition in large-scale embedded software.