6 research outputs found

    Performance regression testing of concurrent classes

    Full text link
    Developers of thread-safe classes struggle with two opposing goals. The class must be correct, which requires synchronizing concurrent accesses, and the class should provide reasonable performance, which is difficult to realize in the presence of unnecessary synchronization. Validating the performance of a thread-safe class is challenging because it requires diverse workloads that use the class, because existing performance analysis techniques focus on individual bottleneck methods, and because reliably measuring the performance of concurrent executions is difficult. This paper presents SpeedGun, an automatic performance regression testing technique for thread-safe classes. The key idea is to generate multi-threaded performance tests and to compare two versions of a class with each other. The analysis notifies developers when changing a thread-safe class significantly influences the performance of clients of this class. An evaluation with 113 pairs of classes from popular Java projects shows that the analysis effectively identifies 13 performance differences, including performance regressions that the respective developers were not aware of.
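    To illustrate the idea, here is a minimal sketch (not SpeedGun's actual generated code) of a multi-threaded performance test that times the same concurrent workload against an old and a new version of a thread-safe class; CounterV1 and CounterV2 are invented stand-ins for the two versions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical "old" version: coarse-grained locking.
class CounterV1 { private long n; synchronized void inc() { n++; } }
// Hypothetical "new" version: lock-free.
class CounterV2 { private final AtomicLong n = new AtomicLong(); void inc() { n.incrementAndGet(); } }

public class ConcurrentPerfTest {
    // Wall-clock time of `threads` workers each running `ops` operations concurrently.
    static long measure(Runnable op, int threads, int ops) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1), done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) pool.submit(() -> {
            try { start.await(); for (int i = 0; i < ops; i++) op.run(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            finally { done.countDown(); }
        });
        long begin = System.nanoTime();
        start.countDown();          // release all threads at once
        done.await();
        pool.shutdown();
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) throws InterruptedException {
        CounterV1 v1 = new CounterV1();
        CounterV2 v2 = new CounterV2();
        long t1 = measure(v1::inc, 8, 1_000_000);
        long t2 = measure(v2::inc, 8, 1_000_000);
        System.out.printf("old: %d ms, new: %d ms%n", t1 / 1_000_000, t2 / 1_000_000);
    }
}
```

    A meaningful comparison would additionally need JVM warm-up and repeated measurements, which is part of what makes reliably measuring concurrent executions hard.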

    Applying test case prioritization to software microbenchmarks

    Full text link
    Regression testing comprises techniques which are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, is hardly investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions sooner when new versions arrive. This may especially be beneficial for microbenchmark suites, because they take considerably longer to execute than unit test suites. However, it is unclear whether traditional unit testing TCP techniques work equally well for software microbenchmarks. In this paper, we empirically study coverage-based TCP techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, leading to 54 unique technique instantiations. We find that TCP techniques have a mean APFD-P (average percentage of fault-detection on performance) effectiveness between 0.54 and 0.71 and are able to capture the three largest performance changes after executing 29% to 66% of the whole microbenchmark suite. Our efficiency analysis reveals that the runtime overhead of TCP varies considerably depending on the exact parameterization. The most effective technique has an overhead of 11% of the total microbenchmark suite execution time, making TCP a viable option for performance regression testing. The results demonstrate that the total strategy is superior to the additional strategy. Finally, dynamic-coverage techniques should be favored over static-coverage techniques due to their acceptable analysis overhead; however, in settings where the time for prioritization is limited, static-coverage techniques provide an attractive alternative.
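    For intuition, the sketch below shows the two greedy coverage-based strategies the study compares, applied to hypothetical coverage data; the benchmark names and covered code units are invented, and the studied techniques additionally vary coverage granularity and other parameters.

```java
import java.util.*;

// Sketch of the two greedy coverage-based TCP strategies on toy data
// (benchmark -> set of covered code units).
public class GreedyTcp {

    // "Total" strategy: rank benchmarks by the size of their coverage set.
    static List<String> totalStrategy(Map<String, Set<String>> cov) {
        List<String> order = new ArrayList<>(cov.keySet());
        order.sort((a, b) -> cov.get(b).size() - cov.get(a).size());
        return order;
    }

    // "Additional" strategy: repeatedly pick the benchmark covering the most
    // not-yet-covered units; reset the covered set when nothing new is gained.
    static List<String> additionalStrategy(Map<String, Set<String>> cov) {
        List<String> order = new ArrayList<>();
        Set<String> remaining = new HashSet<>(cov.keySet());
        Set<String> covered = new HashSet<>();
        while (!remaining.isEmpty()) {
            String best = null;
            int bestGain = -1;
            for (String b : remaining) {
                Set<String> gain = new HashSet<>(cov.get(b));
                gain.removeAll(covered);
                if (gain.size() > bestGain) { bestGain = gain.size(); best = b; }
            }
            if (bestGain == 0 && !covered.isEmpty()) { covered.clear(); continue; }
            order.add(best);
            remaining.remove(best);
            covered.addAll(cov.get(best));
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cov = Map.of(
                "benchParse",  Set.of("A", "B", "C"),
                "benchFormat", Set.of("A", "B"),
                "benchIo",     Set.of("D"));
        System.out.println(totalStrategy(cov));      // [benchParse, benchFormat, benchIo]
        System.out.println(additionalStrategy(cov)); // [benchParse, benchIo, benchFormat]
    }
}
```

    The example also shows why the additional strategy can reorder aggressively: benchIo jumps ahead of benchFormat because it covers a unit no earlier benchmark touched.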

    Extending Peass to Detect Performance Changes of Apache Tomcat

    Get PDF
    New application versions may contain source code changes that decrease the application's performance. To ensure sufficient performance, it is necessary to identify these code changes. Peass is a performance analysis tool that uses performance measurements of unit tests to achieve that goal for Java applications. However, it can only be utilized for Java applications that are built using the tools Apache Maven or Gradle. This thesis provides a plugin for Peass that enables it to analyze applications built with Apache Ant. Peass utilizes the frameworks Kieker and KoPeMe to record the execution traces and measure the response times of unit tests. This results in the following tasks for the Peass-Ant plugin: (1) add Kieker and KoPeMe as dependencies and (2) execute transformed unit tests. For the first task, our plugin programmatically resolves the transitive dependencies of Kieker and KoPeMe and modifies the XML buildfiles of the application under test. For the second task, the plugin orchestrates the process surrounding test execution, implementing performance optimizations for the analysis of applications with large codebases, and executes specific Ant commands that prepare and start test execution. To make our plugin work, we additionally improved Peass and Kieker; in the process, we implemented three enhancements and identified twelve bugs. We evaluated the Peass-Ant plugin by conducting a case study on 200 commits of the open-source project Apache Tomcat. We detected 14 commits with 57 unit tests that contain performance changes. Our subsequent root cause analysis identified nine source code changes, which we assigned to three clusters of source code changes known to cause performance changes.
    Table of contents: 1. Introduction (1.1. Motivation, 1.2. Objectives, 1.3. Organization); 2. Foundations (2.1. Performance Measurement in Java, 2.2. Peass, 2.3. Apache Ant, 2.4. Apache Tomcat); 3. Architecture of the Plugin (3.1. Requirements, 3.2. Component Structure, 3.3. Integrated Class Structure of Peass and the Plugin, 3.4. Build Modification Tasks for Tomcat); 4. Implementation (4.1. Changes in Peass, 4.2. Changes in Kieker and Kieker-Source-Instrumentation, 4.3. Buildfile Modification of the Plugin, 4.4. Test Execution of the Plugin); 5. Evaluative Case Study (5.1. Setup of the Case Study, 5.2. Results of the Case Study, 5.3. Performance Optimizations for Ant Applications); 6. Related Work (6.1. Performance Analysis Tools, 6.2. Test Selection and Test Prioritization Tools, 6.3. Empirical Studies on Performance Bugs and Regressions); 7. Conclusion and Future Work (7.1. Conclusion, 7.2. Future Work)
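    As an illustration of the buildfile-modification task, the sketch below appends a jar to every <path> element of an Ant build.xml via the standard DOM API; the jar name is hypothetical, and the real plugin also resolves transitive dependencies before injecting them.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: add a dependency jar to the classpath definitions of an Ant buildfile.
public class AntBuildfilePatcher {
    public static void main(String[] args) throws Exception {
        File buildXml = new File("build.xml");
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(buildXml);

        // Append a <pathelement> to every <path> definition in the buildfile.
        NodeList paths = doc.getElementsByTagName("path");
        for (int i = 0; i < paths.getLength(); i++) {
            Element path = (Element) paths.item(i);
            Element jar = doc.createElement("pathelement");
            jar.setAttribute("location", "lib/kieker-1.15.jar"); // hypothetical jar
            path.appendChild(jar);
        }

        // Write the modified buildfile back to disk.
        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        tf.transform(new DOMSource(doc), new StreamResult(buildXml));
    }
}
```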

    CONFPROFITT: A CONFIGURATION-AWARE PERFORMANCE PROFILING, TESTING, AND TUNING FRAMEWORK

    Get PDF
    Modern computer software systems are complicated. Developers can change the behavior of the software system through software configurations. The large number of configuration options and their interactions makes the tasks of software tuning, testing, and debugging very challenging. Performance is one of the key aspects of non-functional qualities, where performance bugs can cause significant performance degradation and lead to poor user experience. However, performance bugs are difficult to expose, primarily because detecting them requires specific inputs as well as specific configurations. While researchers have developed techniques to analyze, quantify, detect, and fix performance bugs, many of these techniques are not effective in highly-configurable systems. To improve the non-functional qualities of configurable software systems, testing engineers need to be able to understand the performance influence of configuration options, adjust the performance of a system under different configurations, and detect configuration-related performance bugs. This research will provide an automated framework that allows engineers to effectively analyze performance-influencing configuration options, detect performance bugs in highly-configurable software systems, and adjust configuration options to achieve higher long-term performance gains. To understand real-world performance bugs in highly-configurable software systems, we first perform a performance bug characteristics study on three large-scale open-source projects. Many researchers have studied the characteristics of performance bugs from bug reports, but few have reported on the experience of replicating confirmed performance bugs from the perspective of non-domain experts such as researchers. This study is meant to report the challenges and potential workarounds when replicating confirmed performance bugs. We also want to share a performance benchmark that provides real-world performance bugs for evaluating future performance testing techniques. Inspired by our performance bug study, we propose a performance profiling approach that can help developers to understand how configuration options and their interactions can influence the performance of a system. The approach uses a combination of dynamic analysis and machine learning techniques, together with configuration sampling techniques, to profile program execution and analyze configuration options relevant to performance. Next, the framework leverages natural language processing and information retrieval techniques to automatically generate test inputs and configurations to expose performance bugs. Finally, the framework combines reinforcement learning and dynamic state reduction techniques to guide the subject application towards achieving higher long-term performance gains.
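    As a toy illustration of configuration sampling and performance-influence analysis, the sketch below samples random configurations of three invented options against a synthetic runtime model and ranks each option by a naive on/off mean difference; the actual framework uses dynamic analysis and machine learning rather than this simple main-effect estimate.

```java
import java.util.Random;
import java.util.function.Function;

// Sketch: sample configurations, measure a synthetic workload, and estimate
// the performance influence of each option as an on/off mean difference.
public class ConfigInfluence {
    public static void main(String[] args) {
        String[] options = {"cache", "compress", "log"};
        Random rnd = new Random(42);

        // Synthetic "measurement": compression costs time, caching saves it.
        Function<boolean[], Double> runtime = c ->
                100.0 + (c[1] ? 40 : 0) - (c[0] ? 25 : 0) + rnd.nextGaussian();

        // Randomly sample configurations and record measured runtimes.
        int samples = 200;
        boolean[][] configs = new boolean[samples][options.length];
        double[] times = new double[samples];
        for (int i = 0; i < samples; i++) {
            for (int j = 0; j < options.length; j++) configs[i][j] = rnd.nextBoolean();
            times[i] = runtime.apply(configs[i]);
        }

        // Influence of an option = mean runtime with it on minus mean with it off.
        for (int j = 0; j < options.length; j++) {
            double on = 0, off = 0;
            int nOn = 0, nOff = 0;
            for (int i = 0; i < samples; i++) {
                if (configs[i][j]) { on += times[i]; nOn++; }
                else { off += times[i]; nOff++; }
            }
            System.out.printf("%s: %+.1f ms%n", options[j], on / nOn - off / nOff);
        }
    }
}
```

    A naive per-option estimate like this misses interactions between options, which is precisely why the framework resorts to configuration sampling combined with machine learning.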

    Performance Problem Diagnostics by Systematic Experimentation

    Get PDF
    Diagnostics of performance problems requires deep expertise in performance engineering and entails a high manual effort. As a consequence, performance evaluations are postponed to the last minute of the development process. In this thesis, we introduce an automatic, experiment-based approach for performance problem diagnostics in enterprise software systems. With this approach, performance engineers can concentrate on their core competences instead of conducting repetitive tasks.

    Performance Problem Diagnostics by Systematic Experimentation

    Get PDF
    In this book, we introduce an automatic, experiment-based approach for performance problem diagnostics in enterprise software systems. The proposed approach systematically searches for root causes of detected performance problems by executing a series of systematic performance tests. The approach is evaluated in various case studies, which show that it is applicable to a wide range of contexts.
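    The core search idea can be sketched as a guided walk over a hierarchy of performance problems, where a measurement experiment at each node decides whether to descend to candidate root causes; the hierarchy and experiment outcomes below are invented placeholders (requires Java 16+ for records).

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Sketch: experiment-driven root-cause search over a problem hierarchy.
public class DiagnosticsSearch {
    // A problem hypothesis, an experiment that checks it, and candidate causes.
    record Problem(String name, BooleanSupplier experiment, List<Problem> causes) {}

    // Depth-first search: run the experiment; on detection, recurse into
    // candidate root causes, printing each confirmed hypothesis.
    static void diagnose(Problem p, String indent) {
        if (!p.experiment().getAsBoolean()) return;   // symptom absent: prune subtree
        System.out.println(indent + "detected: " + p.name());
        for (Problem cause : p.causes()) diagnose(cause, indent + "  ");
    }

    public static void main(String[] args) {
        // Invented hierarchy; real experiments would run targeted performance tests.
        Problem tree = new Problem("high response time", () -> true, List.of(
                new Problem("database bottleneck", () -> false, List.of()),
                new Problem("excessive messaging", () -> true, List.of(
                        new Problem("N+1 remote calls", () -> true, List.of())))));
        diagnose(tree, "");
    }
}
```

    Pruning whole subtrees whose symptom experiment fails is what keeps the number of executed performance tests manageable.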