Uncertainty-Aware Performance Prediction for Highly Configurable Software Systems via Bayesian Neural Networks
Configurable software systems are employed in many important application
domains. Understanding the performance of the systems under all configurations
is critical to prevent potential performance issues caused by misconfiguration.
However, as the number of configurations can be prohibitively large, it is not
possible to measure the system performance under all configurations. Thus, a
common approach is to build a prediction model from limited measurement data
to predict the performance of all configurations as scalar values. However, it
has been pointed out that there are different sources of uncertainty, arising
from the data collection or the modeling process, which can make such scalar
predictions unreliable. To address this problem, we propose a
Bayesian deep learning based method, namely BDLPerf, that can incorporate
uncertainty into the prediction model. BDLPerf can provide both scalar
predictions for configurations' performance and the corresponding confidence
intervals of these scalar predictions. We also develop a novel uncertainty
calibration technique to ensure the reliability of the confidence intervals
generated by a Bayesian prediction model. Finally, we suggest an efficient
hyperparameter tuning technique so as to train the prediction model within a
reasonable amount of time whilst achieving high accuracy. Our experimental
results on 10 real-world systems show that BDLPerf achieves higher accuracy
than existing approaches, in both scalar performance prediction and confidence
interval estimation.
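The core output described above, a scalar prediction together with a confidence interval derived from a posterior over predictions, can be sketched as follows. The performance function, the noise model, and the sampling ensemble standing in for a Bayesian neural network are all illustrative assumptions, not BDLPerf itself.

```python
import random
import statistics

random.seed(0)

# Hypothetical performance function for a single configuration option;
# purely illustrative, not one of the paper's benchmark systems.
def true_performance(x):
    return 3.0 * x + 5.0

# Stand-in for a Bayesian prediction model: repeated noisy predictions
# whose spread plays the role of predictive uncertainty.
def posterior_samples(x, n=200):
    return [true_performance(x) + random.gauss(0.0, 1.0) for _ in range(n)]

# Scalar prediction (posterior mean) plus a confidence interval taken
# from the posterior quantiles.
def predict_with_interval(x, alpha=0.95):
    draws = sorted(posterior_samples(x))
    mean = statistics.fmean(draws)
    lo = draws[int(len(draws) * (1 - alpha) / 2)]
    hi = draws[int(len(draws) * (1 + alpha) / 2) - 1]
    return mean, lo, hi

mean, lo, hi = predict_with_interval(2.0)
print(f"prediction={mean:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

Calibration, in this picture, amounts to checking that the nominal coverage (here 95%) matches the empirical fraction of true values falling inside the intervals.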
Predicting Software Performance with Divide-and-Learn
Predicting the performance of highly configurable software systems is the
foundation for performance testing and quality assurance. To that end, recent
work has been relying on machine/deep learning to model software performance.
However, a crucial yet unaddressed challenge is how to cater for the sparsity
inherited from the configuration landscape: the influence of configuration
options (features) and the distribution of data samples are highly sparse.
In this paper, we propose an approach based on the concept of
'divide-and-learn', dubbed DaL. The basic idea is that, to handle sample
sparsity, we divide the samples from the configuration landscape into distant
divisions, for each of which we build a regularized Deep Neural Network as the
local model to deal with the feature sparsity. A newly given configuration
would then be assigned to the right model of division for the final prediction.
Experimental results from eight real-world systems and five sets of training
data reveal that, compared with state-of-the-art approaches, DaL performs
no worse than the best counterpart in 33 out of 40 cases (26 of which are
significantly better), with improvements in accuracy; it requires fewer
samples to reach the same or better accuracy; and it incurs acceptable
training overhead. Practically, DaL also considerably improves
different global models when using them as the underlying local models, which
further strengthens its flexibility. To promote open science, all the data,
code, and supplementary figures of this work can be accessed at our repository:
https://github.com/ideas-labo/DaL.
Comment: This paper has been accepted by the ACM Joint European Software
Engineering Conference and Symposium on the Foundations of Software
Engineering (ESEC/FSE), 2023
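A minimal sketch of the divide-and-learn idea, under strong simplifying assumptions: a one-dimensional toy landscape, a fixed threshold standing in for the paper's division step, and plain least-squares lines standing in for the regularized deep neural networks.

```python
# Toy configuration landscape with two distinct regimes, mimicking the
# sample sparsity described in the abstract; all values are illustrative.
samples = [(float(x), 2.0 * x + 1.0) for x in range(0, 10)] + \
          [(float(x), 40.0 - x) for x in range(20, 30)]

# Step 1 (divide): split the landscape into divisions; a fixed threshold
# stands in for the clustering used in the paper.
THRESHOLD = 15.0
divisions = {
    "low":  [(x, y) for x, y in samples if x < THRESHOLD],
    "high": [(x, y) for x, y in samples if x >= THRESHOLD],
}

# Step 2 (learn): fit one local model per division; least-squares lines
# stand in for the local regularized deep neural networks.
def fit_linear(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return lambda x, a=slope, b=my - slope * mx: a * x + b

models = {name: fit_linear(pts) for name, pts in divisions.items()}

# Step 3 (assign): a new configuration is routed to its division's model.
def predict(x):
    return models["low" if x < THRESHOLD else "high"](x)

print(predict(5.0), predict(25.0))  # -> 11.0 15.0
```

A single global line would fit this two-regime landscape poorly, which is the sparsity problem the division step is meant to address.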
Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models
The many configuration options of modern applications make it difficult for users to select a performance-optimal configuration. Performance models help users understand system performance and choose a fast configuration. Existing performance modeling approaches for applications and configurable systems either require a full-factorial experiment design or a sampling design based on heuristics. This results in high costs for achieving accurate models. Furthermore, they require repeated execution of experiments to account for measurement noise. We propose Performance-Detective, a novel code analysis tool that deduces insights into the interactions of program parameters. We use these insights to derive the smallest necessary experiment design and to avoid repetitions of measurements when possible, significantly lowering the cost of performance modeling. We evaluate Performance-Detective in two case studies where we reduce the number of measurements from up to 3125 to only 25, decreasing the cost to only 2.9% of the previously needed core hours, while maintaining the accuracy of the resulting model at 91.5% compared to 93.8% when using all 3125 measurements.
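The 3125-to-25 reduction quoted above is consistent with five parameters of five candidate values each: a full factorial needs 5^5 = 3125 runs, while a one-parameter-at-a-time design needs 5 * 5 = 25. The sketch below assumes a hypothetical application whose parameters are found not to interact; the parameter names and values are made up.

```python
from itertools import product

# Hypothetical application with five parameters, five candidate values
# each (chosen to mirror the 3125 -> 25 reduction in the abstract).
params = {f"p{i}": [1, 2, 3, 4, 5] for i in range(5)}

# Full-factorial design: every combination of every parameter value.
full_factorial = list(product(*params.values()))

# If code analysis deduces that the parameters do not interact, varying
# one parameter at a time (others fixed at a default) suffices.
def one_at_a_time(params, default=1):
    runs = []
    for name, values in params.items():
        for v in values:
            config = {k: default for k in params}
            config[name] = v
            runs.append(tuple(config.values()))
    return runs

reduced = one_at_a_time(params)
print(len(full_factorial), len(reduced))  # -> 3125 25
```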
Improved Quantification of Important Beer Quality Parameters based on Non-linear Calibration Methods applied to FT-MIR Spectra
During the production process of beer, it is of utmost importance to guarantee a high consistency of the beer quality. For instance, the bitterness is an essential quality parameter which has to be controlled within the specifications already at the beginning of the production process in the unfermented beer (wort) as well as in final products such as beer and beer mix beverages. Nowadays, analytical techniques for quality control in beer production are mainly based on manual supervision, i.e. samples are taken from the process and analyzed in the laboratory. This typically requires significant effort from lab technicians for only a small fraction of samples to be analyzed, which leads to significant costs for breweries and companies. Fourier transform mid-infrared (FT-MIR) spectroscopy was used in combination with non-linear multivariate calibration techniques to overcome (i) the time-consuming off-line analyses in beer production and (ii) the known limitations of standard linear chemometric methods, like partial least squares (PLS), for important quality parameters [1][2] such as bitterness, citric acid, total acids, free amino nitrogen, final attenuation or foam stability. The calibration models are established with enhanced non-linear techniques based (i) on a new piecewise-linear version of PLS that employs fuzzy rules for locally partitioning the latent variable space and (ii) on extensions of support vector regression variants (ε-PLSSVR and ν-PLSSVR) that overcome high computation times in high-dimensional problems and the time-intensive, often inappropriate setting of the kernel parameters. Furthermore, we introduce a new model selection scheme based on bagged ensembles in order to improve the robustness and thus the predictive quality of the final models.
The approaches are tested on real-world calibration data sets for wort and beer mix beverages and successfully compared to linear methods, showing a clear outperformance in most cases and meeting the model quality requirements defined by the experts at the beer company.
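A toy sketch of the piecewise-linear idea: fuzzy (Gaussian) memberships partition the input into local regions, one weighted least-squares line is fitted per region, and predictions blend the lines by membership. The calibration curve, centers, and widths here are invented for illustration and are not the paper's fuzzy-rule PLS or its data.

```python
import math

# Toy nonlinear calibration curve (e.g. a spectral feature mapped to
# bitterness); purely illustrative, not real FT-MIR data.
data = [(x / 10.0, math.sin(x / 10.0)) for x in range(0, 63)]

# Fuzzy partitioning: Gaussian membership of each sample in two local
# regions, standing in for fuzzy rules over the PLS latent space.
CENTERS = [1.5, 4.5]
def memberships(x, width=1.5):
    w = [math.exp(-((x - c) / width) ** 2) for c in CENTERS]
    s = sum(w)
    return [wi / s for wi in w]

# One weighted least-squares line per region (a local linear model).
def fit_local(i):
    pts = [(x, y, memberships(x)[i]) for x, y in data]
    sw = sum(w for _, _, w in pts)
    mx = sum(w * x for x, _, w in pts) / sw
    my = sum(w * y for _, y, w in pts) / sw
    slope = (sum(w * (x - mx) * (y - my) for x, y, w in pts)
             / sum(w * (x - mx) ** 2 for x, _, w in pts))
    return slope, my - slope * mx

local_models = [fit_local(i) for i in range(len(CENTERS))]

# A prediction blends the local lines by fuzzy membership, yielding a
# piecewise-linear model that can follow the nonlinear curve.
def predict(x):
    return sum(m * (a * x + b)
               for m, (a, b) in zip(memberships(x), local_models))

print(round(predict(1.5), 2), round(predict(4.7), 2))
```

A single global line cannot track both the peak and the trough of this curve, which is the limitation of standard linear chemometric methods that the local partitioning addresses.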
Machine-Learned Caching of Datasets
Generally, the present disclosure is directed to creating and/or modifying a pre-cache for a client device connected to a remote server containing a dataset. In particular, in some implementations, the systems and methods of the present disclosure can include or otherwise leverage one or more machine-learned models to predict the likelihood that a particular piece of data will be used (e.g. opened, edited, saved, etc.) within a time frame based on information about the data, the user’s interaction with the data, and/or the user’s schedule.
Machine Learning for Performance Prediction of Data Distribution Service
Data Distribution Service (DDS) is a specification of networking middleware used in real-time mission-critical systems such as autonomous vehicles, energy management systems, and air traffic control. It follows the publish-subscribe communication patterns and adopts the use of Quality of Service (QoS) parameters, allowing customisation of the data dissemination process in real-time.
When setting up DDS systems, practitioners must ensure the required performance levels are achievable by setting appropriate QoS and non-QoS parameters. The evaluation of performance levels can be done by running experimental performance tests for different QoS configurations to find a suitable or even a near-optimal system configuration. However, evaluation via measurements with real DDS systems can be complex and expensive, potentially requiring substantial time and resources.
This paper introduces, to our knowledge for the first time, the use of machine learning (ML) models to predict the performance of DDS under different system configurations. This is done by testing some system configurations and using the performance measurements to train a model. The trained model can then be used to predict the performance of DDS under other system configurations. Since the prediction is computationally inexpensive, we can predict the performance of many different configurations to find a suitable one for given requirements.
As the ML method, we use random forests in this paper, with a linear regression model as a baseline.
We selected six performance metrics and, for each one, trained a random forest model and tuned its hyperparameters. We tested the final models on unseen system configurations, both interpolating and extrapolating with respect to the system parameter values. The random forest models show strong predictive performance and are significantly better than linear regression. Five of the eleven random forest models have a coefficient of determination greater than 0.8 for unseen system configurations in the extrapolation setting.
With these models, it is possible to explore a much wider range of parameters than could be covered by experimentation alone. We therefore believe that this approach can be beneficial for DDS system design.
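The workflow sketched in the abstract, measuring a few configurations, training a cheap surrogate, and then querying it over many configurations, can be illustrated as follows. The latency function and the two QoS-style parameters are hypothetical, and a nearest-neighbour lookup stands in for the random forest model.

```python
# Hypothetical DDS latency as a function of two QoS-style parameters
# (history depth, batch size); purely illustrative, not real DDS data.
def measured_latency(depth, batch):
    return (depth - 8) ** 2 + 0.5 * (batch - 20) ** 2

# A few expensive "measurements" on a coarse grid serve as training data.
train = [(d, b, measured_latency(d, b))
         for d in range(0, 17, 4) for b in range(0, 41, 10)]

# 1-nearest-neighbour surrogate standing in for the random forest model:
# cheap to query, so many configurations can be explored without tests.
def predict(depth, batch):
    nearest = min(train,
                  key=lambda t: (t[0] - depth) ** 2 + (t[1] - batch) ** 2)
    return nearest[2]

# Explore a dense grid with the surrogate instead of running experiments.
candidates = [(d, b) for d in range(17) for b in range(41)]
best = min(candidates, key=lambda c: predict(*c))
print("predicted-best configuration:", best)
```

The surrogate answers 697 candidate queries here from only 25 measurements; the same trade-off motivates using a trained model to search the DDS configuration space.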
System Volume Compensating for Environmental Noise
Generally, the present disclosure is directed to an audio system for compensating for ambient environmental noise. In particular, in some implementations, the systems and methods of the present disclosure can include or otherwise leverage one or more machine-learned models to predict a comfortable volume level based on an intensity of ambient noise.