Report from GI-Dagstuhl Seminar 16394: Software Performance Engineering in the DevOps World
This report documents the program and the outcomes of GI-Dagstuhl Seminar
16394 "Software Performance Engineering in the DevOps World".
The seminar addressed the problem of performance-aware DevOps. Both DevOps
and performance engineering have been growing trends over the past one to two
years, in no small part due to the rise in importance of identifying
performance anomalies in the operations (Ops) of cloud and big data systems and
feeding these back to the development (Dev). However, so far, the research
community has treated software engineering, performance engineering, and cloud
computing mostly as individual research areas. We aimed to identify
opportunities for cross-community collaboration, and to set the path for
long-lasting collaborations towards performance-aware DevOps.
The main goal of the seminar was to bring together young researchers (PhD
students in a later stage of their PhD, as well as PostDocs or Junior
Professors) in the areas of (i) software engineering, (ii) performance
engineering, and (iii) cloud computing and big data to present their current
research projects, to exchange experience and expertise, to discuss research
challenges, and to develop ideas for future collaborations.
Is It Safe to Uplift This Patch? An Empirical Study on Mozilla Firefox
In rapid release development processes, patches that fix critical issues or
implement high-value features are often promoted directly from the development
channel to a stabilization channel, potentially skipping one or more
stabilization channels. This practice is called patch uplift. Patch uplift is
risky, because patches that are rushed through the stabilization phase can end
up introducing regressions in the code. This paper examines patch uplift
operations at Mozilla, with the aim of identifying the characteristics of uplifted
patches that introduce regressions. Through statistical and manual analyses, we
quantitatively and qualitatively investigate the reasons behind patch uplift
decisions and the characteristics of uplifted patches that introduced
regressions. Additionally, we interviewed three Mozilla release managers to
understand organizational factors that affect patch uplift decisions and
outcomes. Results show that most patches are uplifted because of incorrect
functionality or a crash. Uplifted patches that lead to faults tend to have
larger patch size, and most of the faults are due to semantic or memory errors
in the patches. Also, release managers are more inclined to accept patch uplift
requests that concern certain specific components and/or that are submitted by
certain specific developers.
Comment: In proceedings of the 33rd International Conference on Software
Maintenance and Evolution (ICSME 2017).
High performance computation of landscape genomic models integrating local indices of spatial association
Since its introduction, landscape genomics has developed quickly with the
increasing availability of both molecular and topo-climatic data. The current
challenges of the field mainly involve processing large numbers of models and
disentangling selection from demography. Several methods address the latter,
either by estimating a neutral model from population structure or by inferring
simultaneously environmental and demographic effects. Here we present
Sambada, an integrated approach for studying signatures of local adaptation,
providing rapid processing of whole genome data and enabling assessment of
spatial association using molecular markers. Specifically, candidate loci for
adaptation are identified by automatically assessing genome-environment
associations. In complement, measuring the Local Indicators of Spatial
Association (LISA) for these candidate loci makes it possible to detect whether
similar genotypes tend to gather in space, which constitutes a useful indication
of a possible kinship relationship between individuals. In this paper, we also
analyze SNP data from Ugandan cattle to detect signatures of local adaptation
with Sambada, BayEnv, LFMM, and an outlier method (the FDIST approach in
Arlequin) and compare their results. Sambada is open-source software for
Windows, Linux and MacOS X, available at \url{http://lasig.epfl.ch/sambada}.
Comment: 1 figure in text, 1 figure in supplementary material. The structure of
the article was modified and some explanations were updated. The methods and
results presented are the same as in the previous version.
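The LISA measure referred to above is commonly computed as the local Moran's I. A minimal, hypothetical numpy sketch (toy weight matrix and genotype vector invented here; this is not the tool's own implementation) is:

```python
import numpy as np

def local_morans_i(values, weights):
    """Local Moran's I per observation: positive where a value and its
    spatial neighbours deviate from the mean in the same direction."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    m2 = (z ** 2).mean()             # variance-like denominator
    lag = weights @ z                # spatially lagged deviations
    return z * lag / m2

# Toy example: 4 individuals on a line, row-standardised neighbour weights
w = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
geno = np.array([2.0, 2.0, 0.0, 0.0])  # similar genotypes cluster in space
print(local_morans_i(geno, w))
```

Positive values at the endpoints indicate individuals surrounded by similar genotypes, the spatial clustering signal the abstract describes.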
Towards Automated Performance Bug Identification in Python
Context: Software performance is a critical non-functional requirement,
appearing in many fields such as mission-critical applications, financial
systems, and real-time systems. In this work we focused on early detection of
performance bugs; our software under study was a real-time system used in the
advertisement/marketing domain.
Goal: Find a simple, easy-to-implement solution for predicting performance
bugs.
Method: We built several models using four machine learning methods, commonly
used for defect prediction: C4.5 Decision Trees, Naïve Bayes, Bayesian
Networks, and Logistic Regression.
Results: Our empirical results show that a C4.5 model, using lines of code
changed, file age, and file size as explanatory variables, can be used to
predict performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We
show that reducing the number of changes delivered in a commit can decrease the
chance of performance bug injection.
Conclusions: We believe that our approach can help practitioners to eliminate
performance bugs early in the development cycle. Our results are also of
interest to theoreticians, establishing a link between functional bugs and
(non-functional) performance bugs, and explicitly showing that attributes used
for prediction of functional bugs can be used for prediction of performance
bugs.
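The modelling setup described above can be sketched with scikit-learn's CART trees as a stand-in for C4.5 (the two are closely related axis-aligned decision-tree learners). The features mirror the abstract's explanatory variables, but the data and labelling rule below are synthetic, not the study's dataset:

```python
# Hypothetical sketch: predict performance-bug-introducing commits from
# lines changed, file age, and file size, using a CART tree in place of
# C4.5. All data below are synthetic, invented for illustration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
n = 400
lines_changed = rng.integers(1, 500, n)
file_age_days = rng.integers(1, 2000, n)
file_size_loc = rng.integers(50, 5000, n)
# Toy labelling rule: large changes to large files are riskier
y = ((lines_changed > 200) & (file_size_loc > 2000)).astype(int)

X = np.column_stack([lines_changed, file_age_days, file_size_loc])
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
pred = clf.predict(X)
print("precision:", precision_score(y, pred),
      "recall:", recall_score(y, pred))
```

A real replication would of course use held-out commits rather than training-set predictions.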
SoC regression strategy development
Abstract. The objective of the hardware verification process is to ensure that the design does not contain functional errors. Verifying the correct functionality of a large System-on-Chip (SoC) is a co-design process that is performed by running immature software on immature hardware. Among the key objectives is to ensure the completion of the design before proceeding to fabrication.
Verification is performed using a mix of software simulations that imitate the hardware functions and emulations executed on reconfigurable hardware. Both techniques are time-consuming: software simulation may run at a billionth of the speed of the targeted system, and emulation thousands of times slower. A good verification strategy reduces the time to market without compromising testing coverage.
This thesis compares regression verification strategies for a large SoC project. These include different techniques of test case selection and test case prioritization that have been researched in software projects.
No single strategy performs well throughout the whole SoC development cycle. In the early stages of development, time-based test case prioritization provides the fastest convergence. Later, history-based test case prioritization and risk-based test case selection gave a good balance between coverage, error detection, and execution time, and provided foundations for predicting the time to completion.
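The history-based prioritization mentioned above can be sketched as ordering tests by their recent failure rate. Test names, histories, and the window size below are invented for illustration, not taken from the thesis:

```python
# Hypothetical sketch of history-based test case prioritisation: run the
# regression tests that failed most often in a recent window first, so
# likely failures are detected as early as possible.
from dataclasses import dataclass, field

@dataclass
class TestRecord:
    name: str
    results: list = field(default_factory=list)  # True = failed, newest last

    def failure_score(self, window=10):
        recent = self.results[-window:]
        return sum(recent) / len(recent) if recent else 0.0

def prioritise(records, window=10):
    # Highest recent failure rate first
    return sorted(records, key=lambda r: r.failure_score(window), reverse=True)

history = [
    TestRecord("cpu_smoke",   [False] * 10),
    TestRecord("dma_stress",  [False, True, True, False, True]),
    TestRecord("pcie_linkup", [True] * 3 + [False] * 7),
]
order = [t.name for t in prioritise(history)]
print(order)
```

Risk-based selection would add a second signal (e.g. which design blocks a commit touches) on top of this failure history.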
Automated System Performance Testing at MongoDB
Distributed Systems Infrastructure (DSI) is MongoDB's framework for running
fully automated system performance tests in our Continuous Integration (CI)
environment. To run in CI it needs to automate everything end-to-end:
provisioning and deploying multi-node clusters, executing tests, tuning the
system for repeatable results, and collecting and analyzing the results. Today
DSI is MongoDB's most used and most useful performance testing tool. It runs
almost 200 different benchmarks in daily CI, and we also use it for manual
performance investigations. Because we can alert the responsible engineer in a
timely fashion, all but one of the major regressions were fixed before the
4.2.0 release. We are also able to catch net new improvements, of which DSI
caught 17. We open-sourced DSI in March 2020.
Comment: Author Preprint. Appearing in DBTest.io 2020.
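The automated result analysis described above boils down to comparing each new benchmark result against a baseline from earlier runs. A minimal, hypothetical sketch (the 5% threshold and the numbers are invented, not DSI's actual logic):

```python
# Toy sketch of a CI performance check: flag a benchmark result that
# drops more than `threshold` below the median of previous runs.
import statistics

def is_regression(history, latest, threshold=0.05):
    """history: past throughput results (higher is better)."""
    baseline = statistics.median(history)
    return latest < baseline * (1 - threshold)

past = [1020.0, 1005.0, 998.0, 1012.0, 1001.0]
print(is_regression(past, 940.0))   # well below baseline
print(is_regression(past, 1003.0))  # within run-to-run noise
```

Real systems also have to tune the workload for repeatability first, since a noisy benchmark makes any threshold either miss regressions or raise false alerts.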
Hedonic Price Indexes for Personal Computer Operating Systems and Productivity Suites
Results of hedonic price regressions for personal computer operating systems and productivity suites advertised in PC World magazine by retail vendors during the time period 1984 to 2000 are reported. Among the quality attribute variables we use are new measures capturing the presence of network effects in personal computer operating systems, such as connectivity and compatibility, and product integration among components of productivity suites. Average annual growth rates of quality-adjusted prices of personal computer operating systems range from -15 to -18 percent, while those for productivity suites generally range between -13 and -16 percent. Price declines are generally greater in the latter half of the samples.
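The hedonic approach above regresses log price on quality attributes plus time dummies; the time-dummy coefficients then trace the quality-adjusted price path. A sketch on synthetic data (attribute names echo the abstract, but the dataset and the built-in -15%/year trend are invented):

```python
# Illustrative hedonic regression: log(price) = attributes + year dummies.
# Year-dummy coefficients give quality-adjusted log price changes.
import numpy as np

rng = np.random.default_rng(1)
n, years = 300, 5
year = rng.integers(0, years, n)
connectivity = rng.integers(0, 2, n)   # network-effect attribute (0/1)
ram_mb = rng.uniform(1, 64, n)
# Synthetic truth: quality raises price; prices fall ~15%/year adjusted
log_price = (3.0 + 0.4 * connectivity + 0.01 * ram_mb
             - 0.15 * year + rng.normal(0, 0.05, n))

# Design matrix: intercept, attributes, year dummies (base year omitted)
dummies = np.eye(years)[year][:, 1:]
X = np.column_stack([np.ones(n), connectivity, ram_mb, dummies])
beta, *_ = np.linalg.lstsq(X, log_price, rcond=None)
annual = np.diff(np.concatenate([[0.0], beta[3:]]))
print("quality-adjusted log price change per year:", annual.round(3))
```

Because the attributes absorb quality differences, the recovered year effects isolate the pure price decline, which is the quantity the abstract reports as -15 to -18 percent per year.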
An Empirical Evaluation of the Indicators for Performance Regression Test Selection
As a software application is developed and maintained, changes to the source code may cause unintentional slowdowns in functionality. These slowdowns are known as performance regressions. Projects that are concerned about performance oftentimes create performance regression tests, which can be run to detect performance regressions. Ideally, we would run these tests on every commit; however, these tests usually need a large amount of time or resources in order to simulate realistic scenarios.
The paper entitled Perphecy: Performance Regression Test Selection Made Simple but Effective presents a technique to solve this problem by attempting to predict the likelihood that a commit will cause a performance regression. They use static and dynamic analysis to gather several metrics for their prediction, and then they evaluate those metrics on several projects. This thesis seeks to replicate and expand on their work.
This thesis revisits the above-mentioned research paper by replicating its experiments and extending them with a larger set of code changes, to better understand how several metrics can be combined to better predict whether a code change may degrade the performance of the software.
This thesis successfully replicates the existing study, generates additional insights into the approach, and provides an open-source tool that can help developers detect performance regressions in code changes as software evolves.
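The per-commit decision described above can be sketched as a disjunction of per-indicator thresholds: run the expensive benchmarks only when some indicator for the commit crosses its limit. The indicator names and threshold values below are invented for illustration, not Perphecy's actual metrics:

```python
# Hypothetical sketch of indicator-based performance test selection:
# select the benchmark suite for a commit if ANY indicator meets or
# exceeds its threshold. Names and limits are invented.
THRESHOLDS = {
    "reached_functions_changed": 1,  # changed functions the benchmark executes
    "hot_loops_touched": 1,          # loops on profiled hot paths
    "allocation_sites_changed": 2,   # touched memory-allocation sites
}

def should_run_benchmark(indicators):
    """True if any indicator for this commit crosses its threshold."""
    return any(indicators.get(name, 0) >= limit
               for name, limit in THRESHOLDS.items())

print(should_run_benchmark({"reached_functions_changed": 2}))
print(should_run_benchmark({"allocation_sites_changed": 1}))
```

Tuning the thresholds trades benchmark cost against the risk of letting a regression-introducing commit skip testing.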
Multivariate adaptive regression splines for estimating riverine constituent concentrations
Regression-based methods are commonly used for riverine constituent concentration/flux estimation, which is essential for guiding water quality protection practices and environmental decision making. This paper developed a multivariate adaptive regression splines model for estimating riverine constituent concentrations (MARS-EC). The process, interpretability, and flexibility of the MARS-EC modelling approach were demonstrated for total nitrogen in the Patuxent River, a major river input to Chesapeake Bay. Model accuracy and uncertainty of the MARS-EC approach were further analysed using nitrate-plus-nitrite datasets from eight tributary rivers to Chesapeake Bay. Results showed that the MARS-EC approach integrated the advantages of both parametric and nonparametric regression methods, and model accuracy was demonstrated to be superior to the traditionally used ESTIMATOR model. MARS-EC is flexible and allows consideration of auxiliary variables; the variables and interactions can be selected automatically. MARS-EC does not constrain concentration-predictor curves to be constant but rather is able to identify shifts in these curves from mathematical expressions and visual graphics. The MARS-EC approach provides an effective and complementary tool alongside existing approaches for estimating riverine constituent concentrations.
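The building block of MARS is the hinge function pair max(0, x - t) and max(0, t - x), which lets a regression bend at a knot t; this is how shifts in concentration-predictor curves can be captured. A toy illustration with one fixed knot (real MARS searches knots and variables automatically; data and knot here are synthetic, not river data):

```python
# Toy MARS building block: regress a piecewise-linear "concentration"
# on a hinge basis around a single knot, recovering the two slopes.
import numpy as np

def hinge_basis(x, knot):
    return np.column_stack([np.ones_like(x),
                            np.maximum(0.0, x - knot),
                            np.maximum(0.0, knot - x)])

rng = np.random.default_rng(2)
flow = rng.uniform(0, 10, 200)
# Synthetic concentration: slope changes at flow = 5
conc = np.where(flow < 5, 2.0 + 0.1 * flow, 2.5 + 0.8 * (flow - 5))
conc = conc + rng.normal(0, 0.05, 200)

X = hinge_basis(flow, knot=5.0)
beta, *_ = np.linalg.lstsq(X, conc, rcond=None)
print("value at knot, slope above knot, slope below knot:", beta.round(2))
```

The fitted coefficients recover the intercept at the knot and the slopes on either side, the kind of regime shift a constant-coefficient regression would smear out.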