
    Does Automated Unit Test Generation Really Help Software Testers? A Controlled Empirical Study

    Work on automated test generation has produced several tools capable of generating test data which achieves high structural coverage over a program. In the absence of a specification, developers are expected to manually construct or verify the test oracle for each test input. Nevertheless, it is assumed that these generated tests ease the task of testing for the developer, as testing is reduced to checking the results of tests. While this assumption has persisted for decades, there has been no conclusive evidence to date confirming it. However, the limited adoption in industry indicates that this assumption may not be correct, and calls into question the practical value of test generation tools. To investigate this issue, we performed two controlled experiments comparing a total of 97 subjects split between writing tests manually and writing tests with the aid of an automated unit test generation tool, EvoSuite. We found that, on the one hand, tool support leads to clear improvements in commonly applied quality metrics such as code coverage (up to a 300% increase). On the other hand, there was no measurable improvement in the number of bugs actually found by developers. Our results not only cast some doubt on how the research community evaluates test generation tools, but also point to improvements and future work necessary before automated test generation tools will be widely adopted by practitioners.
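    To illustrate the oracle problem the study revolves around, here is a minimal sketch of what a generated JUnit test typically looks like: the tool produces the inputs and the call sequence, but the assertion merely records the value the implementation happened to return, so the developer must still judge whether that value is the intended one. The `Account` class is an invented stand-in, not taken from the study or from EvoSuite's output.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Hypothetical class under test, included only so the sketch is self-contained.
class Account {
    private int balance = 0;
    void deposit(int amount) { balance += amount; }
    int balance() { return balance; }
}

public class AccountTest {

    // Shape of a typical generated test: the inputs and the call sequence come
    // from the tool, but the assertion only captures what the implementation
    // returned. A human must still decide whether 42 is the *intended* result.
    @Test
    public void generatedTestWithCapturedOracle() {
        Account account = new Account();
        account.deposit(40);
        account.deposit(2);
        assertEquals(42, account.balance());
    }
}
```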

    Automated testing for intelligent agent systems

    This paper describes an approach to unit testing of plan-based agent systems, with a focus on automated generation and execution of test cases. Design artefacts, supplemented with some additional data, provide the basis for specifying a comprehensive suite of test cases. Correctness of execution is evaluated against a design model, and a comprehensive report of errors and warnings is provided to the user. Given that it is impossible to design test suites which execute all possible traces of an agent program, it is extremely important to thoroughly test all units in as wide a variety of situations as possible to ensure acceptable behaviour. We provide details of the information required in design models or related data to enable the automated generation and execution of test cases, and we briefly describe the implemented tool which realises this approach.
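    The abstract shows no code, but the workflow it describes (derive test cases from design artefacts, execute each plan unit in a variety of situations, and check the observed behaviour against the design model) can be sketched roughly as below. `PlanUnit`, `Situation`, and `DesignModel` are hypothetical stand-ins, not the authors' API.

```java
import java.util.List;

// Hypothetical stand-ins for design-level concepts; not the authors' API.
interface Situation {}
interface PlanUnit {
    String name();
    List<String> execute(Situation s);                // observed event trace
}
interface DesignModel {
    boolean accepts(PlanUnit p, List<String> trace);  // is this trace allowed by the design?
}

public class AgentUnitTestHarness {
    // Run every plan unit in every generated situation and report deviations
    // between observed behaviour and the design model.
    public static void run(List<PlanUnit> plans, List<Situation> situations, DesignModel model) {
        for (PlanUnit plan : plans) {
            for (Situation situation : situations) {
                List<String> trace = plan.execute(situation);
                if (!model.accepts(plan, trace)) {
                    System.out.println("ERROR: " + plan.name() + " deviates from the design model");
                }
            }
        }
    }
}
```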

    Ant colony optimization for object-oriented unit test generation

    Generating useful unit tests for object-oriented programs is difficult for traditional optimization methods. One not only needs to identify values to be used as inputs, but also to synthesize a program which creates the required state in the program under test. Many existing Automated Test Generation (ATG) approaches combine search with performance-enhancing heuristics. We present Tiered Ant Colony Optimization (Taco) for generating unit tests for object-oriented programs. The algorithm is formed of three tiers of ACO, each of which tackles a distinct task: goal prioritization, test program synthesis, and data generation for the synthesised program. Test program synthesis allows the creation of complex objects and the exploration of program state, which is the breakthrough that has allowed the successful application of ACO to object-oriented test generation. Taco brings the mature search ecosystem of ACO to bear on ATG for complex object-oriented programs, providing a viable alternative to current approaches. To demonstrate the effectiveness of Taco, we have developed a proof-of-concept tool which successfully generated tests for an average of 54% of the methods in 170 Java classes, a result competitive with the industry-standard tool Randoop.
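    As a rough illustration of the pheromone-guided construction underlying the test-program-synthesis tier, the toy sketch below builds a call sequence by sampling candidate calls in proportion to their pheromone levels. It is a simplified, single-tier example with invented call names and update rules, not the Taco implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Simplified sketch of pheromone-guided call-sequence construction, the idea
// behind a test-program-synthesis tier. Not the authors' implementation.
public class AntSequenceSketch {
    static final String[] CALLS = {"new Stack()", "push(x)", "pop()", "peek()"};
    static final Random RNG = new Random();

    // One pheromone level per candidate call; reinforced when a call is chosen.
    static double[] pheromone = {1.0, 1.0, 1.0, 1.0};

    // Roulette-wheel selection proportional to pheromone levels.
    static int choose() {
        double total = 0;
        for (double p : pheromone) total += p;
        double r = RNG.nextDouble() * total;
        for (int i = 0; i < pheromone.length; i++) {
            r -= pheromone[i];
            if (r <= 0) return i;
        }
        return pheromone.length - 1;
    }

    public static void main(String[] args) {
        List<String> sequence = new ArrayList<>();
        for (int step = 0; step < 5; step++) {
            int call = choose();
            sequence.add(CALLS[call]);
            // A real ACO loop would update pheromone from coverage feedback and
            // evaporate it each iteration; here we only reinforce the chosen call.
            pheromone[call] += 0.1;
        }
        System.out.println("Synthesised call sequence: " + sequence);
    }
}
```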

    Performance Metamorphic Testing: A Proof of Concept

    Context: Performance testing is a challenging task, mainly due to the lack of test oracles, i.e. mechanisms to decide whether the performance of a program is acceptable or degraded by a bug. Metamorphic testing enables the generation of test cases in the absence of an oracle by exploiting the so-called metamorphic relations between the inputs and outputs of multiple executions of the program under test. In the last two decades, metamorphic testing has been successfully used to detect functional faults in different domains. However, its applicability to performance testing remains unexplored. Objective: We propose the application of metamorphic testing to reveal performance failures. Method: We define Performance Metamorphic Relations (PMRs) as expected relations between performance measurements of multiple executions of the program under test. These relations can be turned into assertions for the automated detection of performance bugs, removing the need for complex benchmarks and guidance from domain experts. As a further benefit, PMRs can be turned into fitness functions to guide search-based techniques in the generation of test data. Results: The feasibility of the approach is illustrated through an experimental proof of concept in the context of the automated analysis of feature models. Conclusion: The results confirm the potential of metamorphic testing, in combination with search-based techniques, to automate the detection of performance bugs. Funding: Comisión Interministerial de Ciencia y Tecnología TIN2015-70560-R; Comisión Interministerial de Ciencia y Tecnología TIN2015-71841; Junta de Andalucía P12-TIC-186.
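    A minimal sketch of how a PMR can be turned into a JUnit assertion follows. The relation ("analysing a smaller model should not be slower than analysing a larger one, within a tolerance"), the `analyse` operation, and the tolerance are illustrative assumptions, not taken from the paper.

```java
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Sketch of a hypothetical PMR expressed as a JUnit assertion.
public class PerformanceMetamorphicTest {

    // Stand-in for the operation under test; a real PMR would wrap the
    // actual analysis of a feature model.
    private void analyse(int modelSize) {
        long acc = 0;
        for (int i = 0; i < modelSize * 1_000L; i++) acc += i;
        if (acc == -1) throw new IllegalStateException();
    }

    private long timeMillis(int modelSize) {
        long start = System.nanoTime();
        analyse(modelSize);
        return (System.nanoTime() - start) / 1_000_000;
    }

    @Test
    public void analysingASmallerModelShouldNotBeSlower() {
        long small = timeMillis(100);
        long large = timeMillis(1_000);
        // PMR (hypothetical): the execution on the smaller model should not
        // exceed the execution on the larger model beyond a 1.5x tolerance.
        assertTrue(small <= large * 1.5 + 5);
    }
}
```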

    Cross-platform verification framework for embedded systems

    Many innovations in the automotive sector involve complex electronics and embedded software systems. Testing techniques are one of the key methodologies for detecting faults in such embedded systems. In this paper, a novel cross-platform verification framework, including automated test-case generation by model checking, is introduced. We denote as cross-platform verification the comparison of the execution behavior of a program instance running on one platform with the execution behavior of the same program running on a different platform. The framework supports various types of coverage criteria. It turned out that end-to-end testing is of high importance, because some defects occur on the actual target platform for the first time. Additionally, formal verification can be applied to check requirements derived from the specification, using the same model generation mechanism that is used for test data generation. Due to a novel self-assessment mechanism, confidence in the formal models is increased significantly. We provide a case study for the Motorola embedded controller HCS12, which is heavily used by the automotive industry. We perform structural tests on industrial code patterns using a widespread industrial compiler. Using our technique, we found two severe compiler defects that have been corrected in subsequent releases.
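    The differential step of cross-platform verification can be sketched as below: the same generated test cases are executed on two platforms and the observed results are compared. The `Platform` interface and the use of a single integer result per test are simplifying assumptions made for illustration, not part of the authors' framework.

```java
import java.util.List;

// Hypothetical abstraction over an execution environment (host simulator,
// target controller, etc.); not the paper's API.
interface Platform {
    int execute(String testCase);   // e.g. return value or observed state hash
}

public class CrossPlatformCheck {
    // Run every generated test on both platforms and report divergences,
    // which may indicate compiler or platform defects.
    public static void compare(List<String> generatedTests, Platform host, Platform target) {
        for (String test : generatedTests) {
            int hostResult = host.execute(test);
            int targetResult = target.execute(test);
            if (hostResult != targetResult) {
                System.out.println("Divergence on " + test
                        + ": host=" + hostResult + ", target=" + targetResult);
            }
        }
    }
}
```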

    Mapping the Structure and Evolution of Software Testing Research Over the Past Three Decades

    Background: The field of software testing is growing and rapidly evolving. Aims: Based on keywords assigned to publications, we seek to identify predominant research topics and understand how they are connected and have evolved. Method: We apply co-word analysis to map the topology of testing research as a network in which author-assigned keywords are connected by edges indicating co-occurrence in publications. Keywords are clustered based on edge density and frequency of connection. We examine the most popular keywords, summarize clusters into high-level research topics, examine how topics connect, and examine how the field is changing. Results: Testing research can be divided into 16 high-level topics and 18 subtopics. Creation guidance, automated test generation, evolution and maintenance, and test oracles have particularly strong connections to other topics, highlighting their multidisciplinary nature. Emerging keywords relate to web and mobile apps, machine learning, energy consumption, automated program repair, and test generation, while web apps, test oracles, and machine learning have formed emerging connections with many topics. Random and requirements-based testing show potential decline. Conclusions: Our observations, advice, and map data offer a deeper understanding of the field and inspiration regarding challenges and connections to explore. Comment: To appear in the Journal of Systems and Software.
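    A minimal sketch of the co-word step, under the assumption that each publication contributes one unit of weight to the edge between every pair of its author-assigned keywords; the sample data is invented.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy co-word analysis: build weighted keyword co-occurrence edges.
public class CoWordNetwork {
    public static Map<String, Integer> edgeWeights(List<List<String>> keywordsPerPaper) {
        Map<String, Integer> edges = new HashMap<>();
        for (List<String> keywords : keywordsPerPaper) {
            for (int i = 0; i < keywords.size(); i++) {
                for (int j = i + 1; j < keywords.size(); j++) {
                    String a = keywords.get(i), b = keywords.get(j);
                    // Order the pair so the edge key is independent of keyword order.
                    String edge = a.compareTo(b) <= 0 ? a + " -- " + b : b + " -- " + a;
                    edges.merge(edge, 1, Integer::sum);
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        List<List<String>> papers = List.of(
                List.of("test generation", "test oracle"),
                List.of("test generation", "machine learning", "test oracle"));
        System.out.println(edgeWeights(papers));
    }
}
```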

    Performance Metamorphic Testing: Motivation and Challenges

    Performance testing is a challenging task, mainly due to the lack of test oracles, that is, mechanisms to decide whether the performance of a program under a certain workload is acceptable or poor due to a performance bug. Metamorphic testing enables the generation of test cases in the absence of an oracle by exploiting the relations (so-called metamorphic relations) between the inputs and outputs of multiple executions of the program under test. In the last two decades, metamorphic testing has been successfully used to detect functional faults in a variety of domains, ranging from web services to simulators. However, the applicability of metamorphic testing to the detection of performance bugs remains unexplored. In this vision paper, we introduce Performance Metamorphic Relations (PMRs) as expected relations between the performance measurements of multiple executions of the program under test. We hypothesize that these relations can be turned into assertions for the automated detection of performance bugs, removing the need for complex benchmarks and guidance from domain experts. As a further benefit, PMRs can be turned into fitness functions to guide search-based techniques in the generation of test data that violate the relations, thereby revealing bugs. This novel idea is motivated with examples and an overview of some of the challenges in this promising topic. Funding: Comisión Interministerial de Ciencia y Tecnología TIN2015-70560-R; Ministerio de Economía, Industria y Competitividad TIN2015-71841-RED; Junta de Andalucía P12-TIC-186.
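    To sketch the second use of a PMR mentioned above, the toy example below turns a hypothetical relation ("doubling the workload should at most double the execution time") into a fitness function and uses a trivial random search to look for workloads that violate it. The measurement, the relation, and the search loop are illustrative assumptions, not the authors' tooling.

```java
import java.util.Random;

// Sketch of a PMR used as a fitness function for search-based test data generation.
public class PmrGuidedSearch {
    private static final Random RNG = new Random();

    // Stand-in for a performance measurement of the program under test.
    static double measure(int workload) {
        return workload + RNG.nextGaussian() * 5;      // noisy "execution time"
    }

    // Fitness: how strongly the pair (w, 2w) violates the hypothetical PMR
    // "doubling the workload should at most double the execution time".
    static double fitness(int workload) {
        double base = measure(workload);
        double doubled = measure(2 * workload);
        return doubled - 2 * base;                     // positive means violation
    }

    public static void main(String[] args) {
        int bestWorkload = 1;
        double bestFitness = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < 100; i++) {                // trivial random search
            int candidate = 1 + RNG.nextInt(1000);
            double f = fitness(candidate);
            if (f > bestFitness) { bestFitness = f; bestWorkload = candidate; }
        }
        System.out.println("Most suspicious workload: " + bestWorkload
                + " (violation score " + bestFitness + ")");
    }
}
```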

    Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction

    Many automated test generation techniques have been developed to aid developers with writing tests. To facilitate full automation, most existing techniques aim either to increase coverage or to generate exploratory inputs. However, existing test generation techniques largely fall short of achieving more semantic objectives, such as generating tests to reproduce a given bug report. Reproducing bugs is nonetheless important, as our empirical study shows that the number of tests added to open source repositories due to issues was about 28% of the corresponding project test suite size. Meanwhile, due to the difficulties of transforming the expected program semantics in bug reports into test oracles, existing failure reproduction techniques tend to deal exclusively with program crashes, a small subset of all bug reports. To automate test generation from general bug reports, we propose LIBRO, a framework that uses Large Language Models (LLMs), which have been shown to be capable of performing code-related tasks. Since LLMs themselves cannot execute the target buggy code, we focus on post-processing steps that help us discern when LLMs are effective and that rank the produced tests according to their validity. Our evaluation of LIBRO shows that, on the widely studied Defects4J benchmark, LIBRO can generate failure-reproducing test cases for 33% of all studied cases (251 out of 750), while suggesting a bug-reproducing test in first place for 149 bugs. To mitigate data contamination, we also evaluate LIBRO against 31 bug reports submitted after the collection of the LLM training data terminated: LIBRO produces bug-reproducing tests for 32% of the studied bug reports. Overall, our results show that LIBRO has the potential to significantly enhance developer efficiency by automatically generating tests from bug reports. Comment: Accepted to the IEEE/ACM International Conference on Software Engineering 2023 (ICSE 2023).
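    The selection-and-ranking idea described in the abstract could be sketched as follows; the `CandidateTest` type and its methods are hypothetical stand-ins rather than LIBRO's actual API, and the ranking heuristic is an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical representation of an LLM-produced candidate test; not LIBRO's API.
interface CandidateTest {
    boolean compiles();
    boolean failsOnBuggyVersion();
    double similarityToBugReport();   // heuristic used for ranking
}

public class BugReproductionRanking {
    // Keep only candidates that compile and fail on the buggy version
    // (i.e. plausibly reproduce the report), then rank the survivors.
    public static List<CandidateTest> selectAndRank(List<CandidateTest> candidates) {
        List<CandidateTest> reproducing = new ArrayList<>();
        for (CandidateTest test : candidates) {
            if (test.compiles() && test.failsOnBuggyVersion()) {
                reproducing.add(test);
            }
        }
        // Put the most plausible reproduction first.
        reproducing.sort(
                Comparator.comparingDouble(CandidateTest::similarityToBugReport).reversed());
        return reproducing;
    }
}
```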