JUGE: An Infrastructure for Benchmarking Java Unit Test Generators
Researchers and practitioners have designed and implemented various automated test case generators to support effective software testing. Such generators exist for various languages (e.g., Java, C#, or Python) and platforms (e.g., desktop, web, or mobile applications), and they exhibit varying effectiveness and efficiency depending on the testing goals they aim to satisfy (e.g., unit testing of libraries vs. system testing of entire applications) and the underlying techniques they implement. In this context, practitioners need to be able to compare different generators to identify the one best suited to their requirements, while researchers seek to identify future research directions. This can be achieved through the systematic execution of large-scale evaluations of different generators. However, executing such empirical evaluations is not trivial and requires substantial effort to collect benchmarks, set up the evaluation infrastructure, and collect and analyse the results. In this paper, we present our JUnit Generation benchmarking infrastructure (JUGE), which supports generators (e.g., search-based, random-based, or symbolic-execution-based) that seek to automate the production of unit tests for various purposes (e.g., validation, regression testing, or fault localization). The primary goal is to reduce the overall effort, ease the comparison of several generators, and enhance knowledge transfer between academia and industry by standardizing the evaluation and comparison process. Since 2013, eight editions of a unit testing tool competition, co-located with the Search-Based Software Testing Workshop, have taken place, each using and updating JUGE. As a result, an increasing number of tools (more than ten) from both academia and industry have been evaluated with JUGE, have matured over the years, and have helped identify future research directions.
EvoSuite at the SBST 2016 Tool Competition
EvoSuite is a search-based tool that automatically generates unit tests for Java code. This paper summarizes the results and experiences of EvoSuite's participation in the fourth unit testing competition at SBST 2016, where EvoSuite achieved the highest overall score.
Private API Access and Functional Mocking in Automated Unit Test Generation
Not all object-oriented code is easily testable: dependency objects might be difficult or even impossible to instantiate, and object-oriented encapsulation makes testing potentially simple code difficult if it cannot easily be accessed. When this happens, developers can resort to mock objects that simulate the complex dependencies, or circumvent object-oriented encapsulation and access private APIs directly through the use of, for example, Java reflection. Can automated unit test generation benefit from these techniques as well? In this paper we investigate this question by extending the EvoSuite unit test generation tool with the ability to directly access private APIs and to create mock objects using the popular Mockito framework. However, care needs to be taken that this does not impact the usefulness of the generated tests: for example, a test accessing a private field could later fail if that field is renamed, even if the renaming is part of a semantics-preserving refactoring. Such a failure would not reveal a true regression bug; it is a false positive, which wastes the developer's time investigating and fixing the test. Our experiments on the SF110 and Defects4J benchmarks confirm the anticipated improvements in terms of code coverage and bug finding, but also confirm the existence of false positives. However, by ensuring the test generator only uses mocking and reflection if there is no other way to reach some part of the code, their number remains small.
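As an illustration of the two techniques the paper adds to EvoSuite, the sketch below shows a hand-written JUnit 4 test that injects a Mockito mock into a private field via Java reflection; the classes ReportService and DataSource are hypothetical stand-ins for a hard-to-instantiate dependency, not code from the paper.

```java
import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.lang.reflect.Field;
import org.junit.Test;

public class ReportServiceTest {

    // Hypothetical unit under test: a private dependency with no setter
    // makes the class hard to exercise through its public API alone.
    interface DataSource { int rows(); }

    static class ReportService {
        private DataSource source;                 // private, no setter
        int countRows() { return source.rows(); }
    }

    @Test
    public void countRowsUsesMockedDependency() throws Exception {
        ReportService service = new ReportService();

        // Functional mocking: replace the complex dependency with a Mockito mock.
        DataSource stub = mock(DataSource.class);
        when(stub.rows()).thenReturn(3);

        // Private API access: inject the mock through reflection, circumventing
        // encapsulation. Note the fragility the paper warns about: renaming
        // "source" breaks this test even under a semantics-preserving refactoring.
        Field field = ReportService.class.getDeclaredField("source");
        field.setAccessible(true);
        field.set(service, stub);

        assertEquals(3, service.countRows());
    }
}
```

EvoSuite applies the same two mechanisms automatically, and, as the abstract notes, only when no other way exists to reach a part of the code, which keeps the number of such fragile, false-positive-prone tests small.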
Basic Block Coverage for Unit Test Generation at the SBST 2022 Tool Competition
Basic Block Coverage (BBC) is a secondary objective for search-based unit test generation techniques that rely on the approach level and branch distance to drive the search process. Unlike the approach level and branch distance, which consider only information related to the coverage of explicit branches arising from conditional and loop statements, BBC also takes into account implicit branches (e.g., a runtime exception thrown in a branchless method), captured by the coverage of the relevant basic blocks in the control flow graph. Our implementation of BBC for unit test generation relies on the DynaMOSA algorithm and EvoSuite. This paper summarizes the results achieved by EvoSuite's DynaMOSA implementation with BBC as a secondary objective at the SBST 2022 unit testing tool competition.
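To make the notion of an implicit branch concrete, consider the following minimal Java method (illustrative only, not taken from the paper): it contains no conditional or loop statement, so approach level and branch distance offer no guidance, yet its control flow graph still has distinct basic-block outcomes that BBC can reward.

```java
public class Ratio {

    // Branchless source code, but the control flow graph still forks:
    // execution either returns normally or leaves via an implicit branch
    // when b == 0 and an ArithmeticException is thrown. There is no
    // explicit predicate for branch distance to measure, yet Basic Block
    // Coverage still credits tests that reach each relevant block.
    public static int divide(int a, int b) {
        return a / b;
    }
}
```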
June: A Type Testability Transformation for Improved ATG Performance
Strings are universal containers: they are flexible to use, abundant in code, and difficult to test. String-controlled programs are programs that make branching decisions based on string input. Automatically generating valid test inputs for these programs while considering only character sequences, rather than any underlying string-encoded structure, can be prohibitively expensive. We present June, a tool that enables Java developers to expose latent string structure to test generation tools. June is an annotation-driven testability transformation and an extensible library, JuneLib, of structured string definitions. The core JuneLib definitions are empirically derived and provide templates for all structured strings in our test set. June takes lightly annotated source code and injects code that permits an automated test generator (ATG) to focus on the creation of mutable substrings inside a structured string. Using June costs the developer little: an average of 2.1 annotations per string-controlled class. June uses standard Java build tools and therefore deploys seamlessly within a Java project. By feeding string structure information to an ATG tool, June dramatically reduces wasted effort; branches that would otherwise be extremely difficult, or impossible, to cover are covered effortlessly. This waste reduction both increases and speeds up coverage. EvoSuite, for example, achieves the same coverage on June-ed classes in 1 minute, on average, as it does in 9 minutes on the un-June-ed classes. These gains increase over time: on our corpus, June-ing a program compresses 24 hours of execution time into ca. 2 hours. We show that many ATG tools can reuse the same June-ed code: a few June annotations, a one-off cost, benefit many different testing regimes.
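The sketch below illustrates the kind of string-controlled branch the paper targets; it is illustrative only and does not use June's annotations or API. A generator mutating raw character sequences is unlikely to stumble on a well-formed "key=value" input by chance, whereas exposing the string's structure lets the search concentrate on the mutable substrings.

```java
public class ConfigEntry {

    // String-controlled code: the branch is guarded by the latent
    // "key=value" structure of the input, not by a numeric predicate
    // that branch distance could smoothly minimize.
    public static String valueOf(String entry) {
        int eq = entry.indexOf('=');
        if (eq > 0 && eq < entry.length() - 1) {  // structured-string guard
            return entry.substring(eq + 1);       // the value substring
        }
        throw new IllegalArgumentException("expected key=value");
    }
}
```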
Unit Test Generation During Software Development: EvoSuite Plugins for Maven, IntelliJ and Jenkins
Different techniques to automatically generate unit tests for object-oriented classes have been proposed, but how to integrate these tools into the daily activities of software development is a little-investigated question. In this paper, we report on our experience in supporting industrial partners in introducing the EVOSUITE automated JUnit test generation tool in their software development processes. The first step consisted of providing a plugin for the Apache Maven build infrastructure. The move from a research-oriented point-and-click tool to an automated step of the build process has implications for how developers interact with the tool and the generated tests; therefore, we produced a plugin for the popular IntelliJ Integrated Development Environment (IDE). As build automation is a core component of Continuous Integration (CI), we provide a further plugin for the Jenkins CI system, which allows developers to monitor the results of EVOSUITE and integrate the generated tests into their source tree. In this paper, we discuss the resulting architecture of the plugins and the challenges arising when building such plugins. Although the plugins described are targeted at the EVOSUITE tool, they can be adapted, and their architecture can be reused, for other test generation tools as well.
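A minimal sketch of what the "automated step of the build process" amounts to, assuming EvoSuite's standalone jar and its documented -class and -projectCP command-line options; the wrapper class, paths, and class name are illustrative, and the real plugins add result monitoring and source-tree integration around such an invocation.

```java
import java.io.IOException;

// Hypothetical build-automation step that shells out to EvoSuite's
// command-line interface for a single class under test.
public class GenerateTestsStep {

    public static void main(String[] args) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(
                "java", "-jar", "evosuite.jar",
                "-class", "com.example.ReportService",  // class under test (illustrative)
                "-projectCP", "target/classes")         // compiled project classpath
                .inheritIO()   // stream EvoSuite's progress to the build log
                .start();
        System.exit(p.waitFor());
    }
}
```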
Automatic generation of smell-free unit tests
Master's thesis, Informatics Engineering, 2022, Universidade de Lisboa, Faculdade de Ciências.
Automated test generation tools (such as EvoSuite) typically aim to maximize code coverage. However, they frequently disregard non-coverage aspects that can be relevant for testers, such as the quality of the generated tests. Therefore, automatically generated tests are often affected by a set of test-specific bad programming practices, i.e., test smells, that may hinder the quality of both test and production code. Given that other researchers have successfully integrated non-coverage quality metrics into EvoSuite, we decided to extend the EvoSuite tool so that the generated test code is smell-free. To this aim, we compiled 54 test smells from several sources and selected 16 smells that are relevant to the context of this work. We then augmented the tool with the respective test smell metrics and investigated the diffusion of the selected smells and the distribution of the metrics. Finally, we implemented an approach to optimize the test smell metrics as secondary criteria. After establishing the optimal configuration (which we used throughout the remainder of the study), we conducted an empirical study to assess whether the tests became significantly less smelly. Furthermore, we studied how the proposed metrics affect the fault detection effectiveness, coverage, and size of the generated tests. Our study revealed that the proposed approach reduces the overall smelliness of the generated tests; in particular, the diffusion of the "Indirect Testing" and "Unrelated Assertions" smells improved considerably. Moreover, our approach reduced the smelliness of the tests generated by EvoSuite without compromising code coverage or fault detection effectiveness. The size and length of the generated tests were also not affected by the new secondary criteria.
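As a concrete, hypothetical illustration of one of the smells named above, the test below mixes an assertion about the unit under test with one about an unrelated object, an instance of the "Unrelated Assertions" smell that the proposed secondary criteria would penalize during the search; the class and method names are invented for this example.

```java
import static org.junit.Assert.assertEquals;

import java.util.ArrayDeque;
import org.junit.Test;

public class StackTest {

    @Test
    public void pushIncreasesSize() {
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        stack.push(42);

        assertEquals(1, stack.size());     // related: checks the unit under test
        assertEquals(4, "Java".length());  // smell: unrelated to the stack's behavior
    }
}
```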