8 research outputs found

    Software Microbenchmarking in the Cloud. How Bad is it Really?

    Rigorous performance engineering traditionally assumes measuring on bare-metal environments to control for as many confounding factors as possible. Unfortunately, some researchers and practitioners might not have the access, knowledge, or funds to operate dedicated performance-testing hardware, making public clouds an attractive alternative. However, shared public cloud environments are inherently unpredictable in terms of the system performance they provide. In this study, we explore the effects of cloud environments on the variability of performance test results and to what extent slowdowns can still be reliably detected even in a public cloud. We focus on software microbenchmarks as an example of performance tests and execute extensive experiments on three well-known public cloud services (AWS, GCE, and Azure), using three cloud instance types per service. We also compare the results to a hosted bare-metal offering from IBM Bluemix. In total, we gathered more than 4.5 million unique microbenchmarking data points from benchmarks written in Java and Go. We find that the variability of results differs substantially between benchmarks and instance types (with a coefficient of variation ranging from 0.03% to over 100%). However, executing test and control experiments on the same instances (in randomized order) allows us to detect slowdowns of 10% or less with high confidence, using state-of-the-art statistical tests (i.e., the Wilcoxon rank-sum test and overlapping bootstrapped confidence intervals). Finally, our results indicate that the Wilcoxon rank-sum test manages to detect smaller slowdowns in cloud environments.
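    The detection setup described above pairs test and control measurements taken on the same instances and applies a Wilcoxon rank-sum test. The sketch below, using SciPy on synthetic latency samples (not the paper's data or tooling), illustrates the idea; the ~10% slowdown, sample sizes, and significance threshold are illustrative assumptions.

```python
# Illustrative sketch only: synthetic samples stand in for real
# microbenchmark measurements collected on the same cloud instance.
import numpy as np
from scipy.stats import ranksums  # Wilcoxon rank-sum test

rng = np.random.default_rng(42)
control = rng.normal(loc=100.0, scale=5.0, size=200)  # baseline latencies (ms)
test = rng.normal(loc=110.0, scale=5.0, size=200)     # candidate with ~10% slowdown

# One-sided test: are the candidate's measurements stochastically larger?
stat, p_value = ranksums(test, control, alternative="greater")

cv = np.std(control) / np.mean(control)  # coefficient of variation of the baseline
print(f"CV(control) = {cv:.2%}, rank-sum p-value = {p_value:.4g}")
if p_value < 0.01:  # illustrative significance threshold
    print("Slowdown detected")
```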

    Applying test case prioritization to software microbenchmarks

    Regression testing comprises techniques which are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, is hardly investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions sooner in new versions. This may be especially beneficial for microbenchmark suites, because they take considerably longer to execute than unit test suites. However, it is unclear whether traditional unit testing TCP techniques work equally well for software microbenchmarks. In this paper, we empirically study coverage-based TCP techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, leading to 54 unique technique instantiations. We find that TCP techniques have a mean APFD-P (average percentage of fault-detection on performance) effectiveness between 0.54 and 0.71 and are able to capture the three largest performance changes after executing 29% to 66% of the whole microbenchmark suite. Our efficiency analysis reveals that the runtime overhead of TCP varies considerably depending on the exact parameterization. The most effective technique has an overhead of 11% of the total microbenchmark suite execution time, making TCP a viable option for performance regression testing. The results demonstrate that the total strategy is superior to the additional strategy. Finally, dynamic-coverage techniques should be favored over static-coverage techniques due to their acceptable analysis overhead; however, in settings where the time for prioritization is limited, static-coverage techniques provide an attractive alternative.
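    The total and additional greedy strategies named above are standard coverage-based prioritization heuristics. The sketch below uses hypothetical benchmark names and coverage sets (it does not reproduce the paper's instrumentation or the APFD-P computation) to show how the two orderings differ.

```python
# Hypothetical coverage data: which methods each microbenchmark exercises.
coverage = {
    "benchParse": {"Parser.parse", "Lexer.next"},
    "benchIndex": {"Index.put", "Index.get"},
    "benchQuery": {"Index.get", "Parser.parse", "Planner.plan"},
}

def total_greedy(cov):
    # Rank benchmarks by how many methods each covers, counted independently.
    return sorted(cov, key=lambda b: len(cov[b]), reverse=True)

def additional_greedy(cov):
    # Repeatedly pick the benchmark covering the most not-yet-covered methods.
    remaining, covered, order = dict(cov), set(), []
    while remaining:
        best = max(remaining, key=lambda b: len(remaining[b] - covered))
        covered |= remaining.pop(best)
        order.append(best)
    return order

print("total:     ", total_greedy(coverage))
print("additional:", additional_greedy(coverage))
```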

    Towards the detection and analysis of performance regression introducing code changes

    In contemporary software development, developers commonly conduct regression testing to ensure that code changes do not affect software quality. Performance regression testing is an emerging research area within the regression testing domain in software engineering; it aims to maintain the system's performance. Conducting performance regression testing is known to be expensive. It is also complex, considering the growing volume of committed code and the number of team members developing simultaneously. Many automated regression testing techniques have been proposed in prior research. However, challenges in locating and resolving performance regressions in practice still exist. Directing regression testing to the commit level helps locate the root cause, yet it hinders the development process. This thesis outlines motivations and solutions for locating the root causes of performance regressions. First, we challenge a deterministic state-of-the-art approach by expanding the testing data to find areas for improvement. The deterministic approach was found to be limited in searching for the best regression-locating rule. Thus, we present two stochastic approaches to develop models that can learn from historical commits. The goal of the first stochastic approach is to view the research problem as a search-based optimization problem seeking to reach the highest detection rate. We apply different multi-objective evolutionary algorithms and conduct a comparison between them. This thesis also investigates whether simplifying the search space by combining objectives would achieve comparable results. The second stochastic approach addresses the severe class imbalance any system could have, since code changes introducing regressions are rare but costly. We formulate the identification of problematic commits that introduce performance regressions as a binary classification problem that handles class imbalance. Further, the thesis provides an exploratory study on the challenges developers face in resolving performance regressions. The study is based on questions posted on a technical forum, focused on performance regression. We collected around 2k questions discussing regressions of software execution time, and all were manually analyzed. The study resulted in a categorization of the challenges. We also discuss the difficulty level of performance regression issues within the development community. This study provides insights that help developers avoid regression causes during software design and implementation.
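    The second stochastic approach treats regression-introducing commits as a rare positive class. A minimal sketch of such an imbalance-aware binary classifier, assuming scikit-learn and synthetic per-commit features (the thesis' actual features, data, and models are not reproduced here), is shown below.

```python
# Illustrative only: synthetic commit features and labels with ~5% positives.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))              # per-commit metrics (churn, #files, ...)
y = (rng.random(2000) < 0.05).astype(int)   # 1 = regression-introducing commit

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), zero_division=0))
```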

    Performance Test Selection Using Machine Learning and a Study of Binning Effect in Memory Allocators

    Performance testing is an essential part of the development life cycle that must be done in a timely fashion. However, checking for performance regressions in software can be time-consuming, especially for complex systems containing multiple lengthy test cases. The first part of this thesis presents a technique for performance test selection using machine learning. In our approach, we build features using information extracted from previous software versions to train classifiers that assist developers in deciding whether or not to execute a performance test on a new version. Our results show that the classifiers can be used as a mechanism that aids test selection and consequently avoids unnecessary testing. The second part of this work investigates the binning effect in user-space memory allocators. First, we examine how binning events can be a source of performance outliers in the Redis and CPython object allocators. Second, we implement a Pintool to detect the occurrence of binning in Python programs. The tool performs dynamic binary instrumentation on the interpreter and outputs information that helps developers perform code optimizations. Finally, we use our tool to investigate the presence of binning in various widely used Python libraries.
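    The binning effect discussed above arises because allocators round small requests up to fixed size classes, so a small growth in object size can push every allocation into the next bin. The sketch below illustrates the idea with 8-byte size classes up to a 512-byte small-object threshold (an assumption modelled loosely on CPython's pymalloc layout; it does not reproduce the thesis' Pintool).

```python
def size_class(request: int, alignment: int = 8, threshold: int = 512):
    """Return the bin (rounded-up size) a small allocation falls into,
    or None if the request is served by the general-purpose allocator."""
    if request == 0 or request > threshold:
        return None
    return -(-request // alignment) * alignment  # round up to the next class

# A one-byte growth (56 -> 57 bytes) moves allocations into a larger bin.
for size in (56, 57, 64, 65, 512, 513):
    print(f"request {size:>3} B -> bin {size_class(size)}")
```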

    An industrial case study of automatically identifying performance regression-causes


    Software Tracing Comparison Using Data Mining Techniques

    Performance has become a crucial matter in software development, testing, and maintenance. To address this concern, developers and testers use several tools to improve performance or track performance-related bugs. The use of comparative methodologies such as Flame Graphs provides a formal way to verify the causes of regressions and performance issues. The comparison tool provides information for analysis that can be used to drive improvements through a deep profiling mechanism, usually comparing normal with abnormal profiling data. On the other hand, tracing is a popular mechanism that records events in the system while keeping the overhead of its use low. The recorded information can be used to supply developers with data for performance analysis. However, the amount of data provided, and the knowledge required to understand it, may present a challenge for current analysis methods and tools. Combining both methodologies, a comparative profiling mechanism and a low-overhead tracing system, enables easier evaluation of issues and their underlying causes while also meeting stringent performance requirements. The next step is to use this data to develop methods for root cause analysis and bottleneck identification. The objective of this research project is to automate the process of trace analysis and to automatically identify differences among groups of executions. The presented solution highlights differences between the groups and presents a possible cause for each difference; the user can then act on this finding to improve the executions. We present a series of automated techniques that can be used to find the root causes of performance variations while requiring little or no human intervention. The main approach is capable of identifying the cause of a performance difference using a comparative grouping methodology on the executions, and it was applied to real use cases. The proposed solution was implemented in an analysis framework to help developers with similar problems, together with a differential flame graph tool. To our knowledge, this is the first attempt to correlate automatic grouping mechanisms with root cause analysis using tracing data. In this project, most of the data used for evaluations and experiments came from the Linux operating system and was collected using the Linux Trace Toolkit Next Generation (LTTng), which is a very flexible tool with low overhead.
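    The comparative grouping approach described above can be pictured as clustering executions by their trace characteristics and then inspecting which characteristics separate the groups. The sketch below uses hypothetical event counts and scikit-learn's KMeans as a stand-in; it is not the thesis' framework and does not parse LTTng traces.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical per-execution event counts extracted from traces.
events = ["syscall_read", "syscall_write", "sched_switch", "page_fault"]
traces = np.array([
    [120,  40, 300,  5],
    [118,  42, 310,  6],
    [121,  39, 295,  4],
    [119, 400, 305, 90],   # two anomalous runs with many writes and page faults
    [122, 395, 298, 85],
])

# Group the executions, then look at which event differs most between groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(traces)
group_a = traces[labels == labels[0]]
group_b = traces[labels != labels[0]]
delta = np.abs(group_a.mean(axis=0) - group_b.mean(axis=0))
print("labels:", labels, "| most differing event:", events[int(np.argmax(delta))])
```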