
    Duet Benchmarking: Improving Measurement Accuracy in the Cloud

    We investigate the duet measurement procedure, which improves the accuracy of performance comparison experiments conducted on shared machines by executing the measured artifacts in parallel and evaluating their relative performance together, rather than individually. Specifically, we analyze the behavior of the procedure in multiple cloud environments and use experimental evidence to answer multiple research questions concerning the assumption underlying the procedure. We demonstrate improvements in accuracy ranging from 2.3x to 12.5x (5.03x on average) for the tested ScalaBench (and DaCapo) workloads, and from 23.8x to 82.4x (37.4x on average) for the SPEC CPU 2017 workloads.
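    A minimal sketch of the duet idea follows (illustrative Python, not the paper's harness; the workloads and iteration count are hypothetical placeholders): the two compared artifacts run concurrently in synchronized iteration pairs, and performance is scored per pair as a ratio, so interference from co-located cloud tenants tends to hit both sides of a pair alike and largely cancels in the comparison.

    import multiprocessing as mp
    import statistics
    import time

    def workload_a():  # hypothetical stand-in for the "old" artifact
        sum(i * i for i in range(200_000))

    def workload_b():  # hypothetical stand-in for the "new" artifact
        sum(i * i for i in range(220_000))

    def timed_runs(workload, barrier, iterations, out):
        durations = []
        for _ in range(iterations):
            barrier.wait()  # both processes start each pair together
            start = time.perf_counter()
            workload()
            durations.append(time.perf_counter() - start)
        out.put(durations)

    def duet(workload_x, workload_y, iterations=30):
        barrier = mp.Barrier(2)
        qx, qy = mp.Queue(), mp.Queue()
        px = mp.Process(target=timed_runs, args=(workload_x, barrier, iterations, qx))
        py = mp.Process(target=timed_runs, args=(workload_y, barrier, iterations, qy))
        px.start(); py.start()
        tx, ty = qx.get(), qy.get()
        px.join(); py.join()
        # Shared-machine noise affects both runs of a synchronized pair,
        # so it largely cancels in the per-pair ratio.
        return statistics.median(b / a for a, b in zip(tx, ty))

    if __name__ == "__main__":
        print(f"median B/A time ratio: {duet(workload_a, workload_b):.3f}")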

    Machine learning aplicado al análisis del rendimiento de desarrollos de software (Machine learning applied to the performance analysis of software developments)

    Performance tests are crucial for measuring the quality of software developments, since they identify aspects that must be improved in order to achieve customer satisfaction. The objective of this research was to identify the optimal Machine Learning technique for predicting whether or not a software development meets the customer's acceptance criteria. A dataset of information obtained from web-service performance tests and the F1-score quality metric were used. The paper concludes that, although the Random Forest technique obtained the best score, it is not correct to state that it is the best Machine Learning technique; the quantity and quality of the training data play a very important role, as does adequate processing of the information.
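    A hedged sketch of this kind of comparison (not the study's actual code): several classifiers are ranked by cross-validated F1-score. The synthetic data below stands in for the paper's real performance-test records (e.g., latency, throughput, and error-rate features labeled with whether acceptance criteria were met).

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Illustrative stand-in for labeled performance-test records.
    X, y = make_classification(n_samples=500, n_features=8, random_state=0)

    models = {
        "random_forest": RandomForestClassifier(random_state=0),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    for name, model in models.items():
        f1 = cross_val_score(model, X, y, cv=5, scoring="f1").mean()
        print(f"{name}: mean F1 = {f1:.3f}")

    As the paper cautions, the winner of such a comparison reflects the particular dataset and preprocessing as much as the algorithm itself.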

    Applying test case prioritization to software microbenchmarks

    Get PDF
    Regression testing comprises techniques that are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, has hardly been investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions earlier when new versions arrive. This may be especially beneficial for microbenchmark suites, because they take considerably longer to execute than unit test suites. However, it is unclear whether traditional unit testing TCP techniques work equally well for software microbenchmarks. In this paper, we empirically study coverage-based TCP techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, leading to 54 unique technique instantiations. We find that TCP techniques have a mean APFD-P (average percentage of fault-detection on performance) effectiveness between 0.54 and 0.71 and are able to capture the three largest performance changes after executing 29% to 66% of the whole microbenchmark suite. Our efficiency analysis reveals that the runtime overhead of TCP varies considerably depending on the exact parameterization. The most effective technique has an overhead of 11% of the total microbenchmark suite execution time, making TCP a viable option for performance regression testing. The results demonstrate that the total strategy is superior to the additional strategy. Finally, dynamic-coverage techniques should be favored over static-coverage techniques due to their acceptable analysis overhead; however, in settings where the time for prioritization is limited, static-coverage techniques provide an attractive alternative.
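    A toy sketch of the two greedy strategies the paper compares (the benchmark names and coverage sets below are assumptions for illustration; real techniques derive coverage statically or dynamically from the code):

    coverage = {
        "bench_parse": {"lex", "parse", "ast"},
        "bench_lex":   {"lex", "parse"},
        "bench_io":    {"io"},
    }

    def total_greedy(cov):
        # "Total" strategy: rank benchmarks by the absolute number of
        # covered elements, ties broken by insertion order.
        return sorted(cov, key=lambda b: len(cov[b]), reverse=True)

    def additional_greedy(cov):
        # "Additional" strategy: repeatedly pick the benchmark that adds
        # the most not-yet-covered elements, resetting the covered set
        # once no remaining benchmark adds anything new.
        remaining, covered, order = dict(cov), set(), []
        while remaining:
            best = max(remaining, key=lambda b: len(remaining[b] - covered))
            if not remaining[best] - covered:
                covered = set()  # nothing new left: start the next pass
                best = max(remaining, key=lambda b: len(remaining[b]))
            order.append(best)
            covered |= remaining.pop(best)
        return order

    print(total_greedy(coverage))       # ['bench_parse', 'bench_lex', 'bench_io']
    print(additional_greedy(coverage))  # ['bench_parse', 'bench_io', 'bench_lex']

    Note how the additional strategy demotes bench_lex, whose coverage is subsumed by bench_parse, which is exactly the behavior whose payoff for performance regressions the paper evaluates.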