5 research outputs found
Perfect is the enemy of test oracle
Automation of test oracles is one of the most challenging facets of software
testing, but remains less addressed than automated test
input generation. Test oracles rely on a ground-truth that can distinguish
between the correct and buggy behavior to determine whether a test fails
(detects a bug) or passes. What makes the oracle problem challenging and
undecidable is the assumption that the ground-truth should know the exact
expected, correct, or buggy behavior. However, we argue that one can still
build an accurate oracle without knowing the exact correct or buggy behavior,
as long as one knows how the two might differ. This paper presents SEER, a learning-based
approach that, in the absence of test assertions or other types of oracle, can
determine whether a unit test passes or fails on a given method under test
(MUT). To build the ground-truth, SEER jointly embeds unit tests and the
implementation of MUTs into a unified vector space, in such a way that the
neural representations of tests are similar to those of the MUTs they pass on,
but dissimilar to those of the MUTs they fail on. The classifier built on top of this
vector representation serves as the oracle, generating "fail" labels when test
inputs detect a bug in the MUT and "pass" labels otherwise. Our extensive
experiments on applying SEER to more than 5K unit tests from a diverse set of
open-source Java projects show that the produced oracle is (1) effective in
predicting the fail or pass labels, achieving an overall accuracy, precision,
recall, and F1 measure of 93%, 86%, 94%, and 90%, respectively, (2) generalizable, predicting
the labels for the unit tests of projects that were not in the training or
validation set with negligible performance drop, and (3) efficient, detecting
the existence of bugs in only 6.5 milliseconds on average.
Comment: Published in ESEC/FSE 202
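As a quick consistency check on the figures reported in this abstract, the F1 measure can be recomputed from the stated precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` is ours and is not taken from SEER.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
reported_precision = 0.86
reported_recall = 0.94

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.898", which rounds to the reported 90%
```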
The Oracle Problem in Software Testing: A Survey
Testing involves examining the behaviour of a system in order to discover potential faults. Given an input for a system, the challenge of distinguishing the corresponding desired, correct behaviour from potentially incorrect behaviour is called the “test oracle problem”. Test oracle automation is important to remove a current bottleneck that inhibits greater overall test automation. Without test oracle automation, the human has to determine whether observed behaviour is correct. The literature on test oracles has introduced techniques for oracle automation, including modelling, specifications, contract-driven development and metamorphic testing. When none of these is completely adequate, the final source of test oracle information remains the human, who may be aware of informal specifications, expectations, norms and domain-specific information that provide informal oracle guidance. All forms of test oracles, even the humble human, involve challenges of reducing cost and increasing benefit. This paper provides a comprehensive survey of current approaches to the test oracle problem and an analysis of trends in this important area of software testing research and practice.
Automatic Detection of Defects in Applications without Test Oracles
In application domains that do not have a test oracle, such as machine learning and scientific computing, quality assurance is a challenge because it is difficult or impossible to know in advance what the correct output should be for general input. Previously, metamorphic testing has been shown to be a simple yet effective technique in detecting defects, even without an oracle. In metamorphic testing, the application's "metamorphic properties" are used to modify existing test case input to produce new test cases in such a manner that, when given the new input, the new output can easily be computed based on the original output. If the new output is not as expected, then a defect must exist. In practice, however, metamorphic testing can be a manually intensive technique for all but the simplest cases. The transformation of input data can be laborious for large data sets, and errors can occur in comparing the outputs when they are very complex. In this paper, we present a tool called Amsterdam that automates metamorphic testing by allowing the tester to easily set up and conduct metamorphic tests with little manual intervention, merely by specifying the properties to check, configuring the framework, and running the software. Additionally, we describe an approach called Heuristic Metamorphic Testing, which addresses issues related to false positives and non-determinism, and we present the results of new empirical studies that demonstrate the effectiveness of metamorphic testing techniques at detecting defects in real-world programs without test oracles.
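The idea of a metamorphic property described in this abstract can be illustrated with a minimal sketch. It does not use Amsterdam or its API; the functions `mean`, `buggy_mean`, `check_permutation`, and `check_scaling` are hypothetical names chosen for illustration. The two properties assumed here are that permuting the input leaves the mean unchanged and that scaling every input by a factor scales the mean by the same factor, so no expected output is needed in advance.

```python
import math
import random


def mean(values):
    """Function under test: arithmetic mean."""
    return sum(values) / len(values)


def buggy_mean(values):
    """Same function with a seeded defect (constant offset), for illustration."""
    return sum(values) / len(values) + 1


def check_permutation(fn, values, rng):
    """Property: reordering the input must not change the output."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    return math.isclose(fn(values), fn(shuffled), rel_tol=1e-9)


def check_scaling(fn, values, factor):
    """Property: scaling every input by `factor` scales the output by `factor`."""
    scaled = [factor * v for v in values]
    return math.isclose(fn(scaled), factor * fn(values), rel_tol=1e-9)


data = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)

print(check_permutation(mean, data, rng))   # True: property holds
print(check_scaling(mean, data, 2.0))       # True: property holds
print(check_scaling(buggy_mean, data, 2.0)) # False: the defect is revealed
```

Note that the permutation property alone would not expose the seeded defect, since a constant offset is unaffected by reordering; which properties are chosen determines which defects can be found.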
Automated Software Testing of Relational Database Schemas
Relational databases are critical for many software systems, holding the most valuable data for organisations. Data engineers build relational databases using schemas to specify the structure of the data within a database and to define integrity constraints. These constraints protect the data's consistency and coherency, leading industry experts to recommend testing them.
Since manual schema testing is labour-intensive and error-prone, automated techniques enable the generation of test data. Although these generators are well-established and effective, they use default values and often produce many, long, and similar tests, which decreases fault detection and increases regression testing time and testers' inspection effort. This raises the following questions: How effective is the optimised random generator at generating tests and detecting faults compared to prior methods? What factors make tests understandable for testers? How can tests be reduced while maintaining effectiveness? How effectively do testers inspect differently reduced tests?
To answer these questions, this thesis makes four contributions. First, it empirically evaluates a new optimised random generator against well-established methods. Second, it identifies understandability factors of schema tests using a human study. Third, it evaluates a novel approach that reduces and merges tests against traditional reduction methods. Finally, it studies testers' inspection efforts with differently reduced tests using a human study.
The results show that the optimised random method efficiently generates effective tests compared to well-established methods. Testers reported that many NULLs and negative numbers are confusing, and they prefer simple repetition of unimportant values and readable strings. The reduction technique with merging is the most effective at minimising tests and producing efficient tests while maintaining effectiveness compared to traditional methods. The merged tests showed an increase in inspection efficiency with a slight accuracy decrease compared to only reduced tests. Therefore, these techniques and investigations can help practitioners adopt these generators in practice.
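To make concrete what testing a schema's integrity constraints involves, the following is a minimal sketch using Python's standard `sqlite3` module. It is not the thesis's generator or reduction technique; the `account` table, its columns, and the helper `accepts` are hypothetical. Each test inserts rows into a fresh in-memory database and records whether the schema accepts or rejects them.

```python
import sqlite3

# Hypothetical schema with NOT NULL, UNIQUE, and CHECK constraints.
SCHEMA = """
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
"""


def accepts(rows):
    """Return True if all rows insert into a fresh database without violating a constraint."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.executemany(
            "INSERT INTO account (id, email, balance) VALUES (?, ?, ?)", rows
        )
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


# One test that should satisfy the constraints and three that should violate them.
print(accepts([(1, "a@example.com", 10)]))                           # True
print(accepts([(1, "a@example.com", 10), (2, "a@example.com", 5)]))  # False: UNIQUE
print(accepts([(1, "a@example.com", -1)]))                           # False: CHECK
print(accepts([(1, None, 0)]))                                       # False: NOT NULL
```

Readable values such as `a@example.com` and small positive numbers are used deliberately, in line with the reported tester preference for readable strings over NULLs and negative numbers where these are not the point of the test.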
Abstraction in Assertion-Based Test Oracles
about Programs — Assertions, invariants, pre- and post-conditions, specification techniques