3 research outputs found
A Testability Analysis Framework for Non-Functional Properties
This paper presents the background, the basic steps, and an example of a
testability analysis framework for non-functional properties.
Boundary Value Exploration for Software Analysis
For software to be reliable and resilient, it is widely accepted that tests
must be created and maintained alongside the software itself. One safeguard
against vulnerabilities and failures in code is to ensure correct behavior on the
boundaries between sub-domains of the input space. So-called boundary value
analysis (BVA) and boundary value testing (BVT) techniques aim to exercise
those boundaries and increase test effectiveness. However, the concepts of BVA
and BVT themselves are not clearly defined, and it is unclear how to identify
the relevant sub-domains, and thus the boundaries delineating them, given a
specification. This has limited their adoption and hindered automation. We clarify
BVA and BVT and introduce Boundary Value Exploration (BVE) to describe
techniques that support them by helping to detect and identify boundary inputs.
Additionally, we propose two concrete BVE techniques based on
information-theoretic distance functions: (i) an algorithm for boundary
detection and (ii) the use of software visualization to explore the behavior
of the software under test and identify its boundary behavior. As an initial
evaluation, we apply these techniques to a widely used and well-tested date
handling library. Our results reveal questionable behavior at boundaries
highlighted by our techniques. In conclusion, we argue that the boundary value
exploration that our techniques enable is a step towards automated boundary
value analysis and testing, which can foster their wider use and improve test
effectiveness and efficiency.
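To make the boundary detection idea concrete, here is a minimal sketch of one way such a technique could look: rank pairs of neighboring inputs by the ratio of an information-theoretic output distance to the input distance, so that pairs whose outputs change sharply for a small input change surface as boundary candidates. This is an illustration under our own assumptions, not the paper's implementation: normalized compression distance (via zlib) stands in for the paper's distance functions, and is_valid_date is a hypothetical stand-in for the date handling library mentioned above.

```python
import datetime
import zlib

def ncd(a: str, b: str) -> float:
    """Normalized compression distance: a simple information-theoretic
    distance proxy between two strings (zlib as compressor is an assumption)."""
    ca = len(zlib.compress(a.encode()))
    cb = len(zlib.compress(b.encode()))
    cab = len(zlib.compress((a + b).encode()))
    return (cab - min(ca, cb)) / max(ca, cb)

def is_valid_date(month: int, day: int) -> str:
    """Hypothetical stand-in for a call into a date handling library."""
    try:
        datetime.date(2021, month, day)
        return "valid"
    except ValueError as exc:
        return f"invalid: {exc}"

def boundary_score(f, x, y, input_dist: float = 1.0) -> float:
    """Output distance divided by input distance for two neighboring
    inputs; a large ratio hints at a behavioral boundary between them."""
    return ncd(str(f(*x)), str(f(*y))) / input_dist

# Rank neighboring days in February 2021 by boundary score; the
# 28 -> 29 transition (valid -> invalid) should come out on top.
pairs = [((2, d), (2, d + 1)) for d in range(26, 30)]
for x, y in sorted(pairs, key=lambda p: -boundary_score(is_valid_date, *p)):
    print(x, y, round(boundary_score(is_valid_date, x, y), 3))
```

The ratio form is just one plausible way to normalize for input distance; other distance functions would slot into ncd in the same way.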
Guiding Deep Learning System Testing using Surprise Adequacy
Deep Learning (DL) systems are rapidly being adopted in safety- and
security-critical domains, urgently calling for ways to test their correctness and
robustness. Testing of DL systems has traditionally relied on manual collection
and labelling of data. Recently, a number of coverage criteria based on neuron
activation values have been proposed. These criteria essentially count the
number of neurons whose activation during the execution of a DL system
satisfies certain properties, such as being above predefined thresholds.
However, existing coverage criteria are not sufficiently fine-grained to
capture subtle behaviours exhibited by DL systems. Moreover, evaluations have
focused on showing correlation between adversarial examples and proposed
criteria rather than evaluating and guiding their use for actual testing of DL
systems. We propose a novel test adequacy criterion for testing of DL systems,
called Surprise Adequacy for Deep Learning Systems (SADL), which is based on
the behaviour of DL systems with respect to their training data. We measure the
surprise of an input as the difference in the DL system's behaviour between the
input and the training data (i.e., what was learnt during training), and
subsequently develop this as an adequacy criterion: a good test input should be
sufficiently but not overly surprising compared to the training data. Empirical
evaluation using a range of DL systems from simple image classifiers to
autonomous driving platforms shows that systematically sampling inputs based
on their surprise can improve the classification accuracy of DL systems against
adversarial examples by up to 77.5% via retraining.
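To illustrate the kind of surprise measure the abstract describes, the sketch below scores an input by the distance from its activation trace to the nearest training trace of its predicted class, normalised by that trace's distance to the other classes. This is only a distance-based reading of the abstract under our own assumptions: the function name, the Euclidean distance, and the toy two-dimensional "activation traces" are ours, not the paper's implementation.

```python
import numpy as np

def distance_based_surprise(at_x, train_ats, train_labels, predicted_class):
    """Score how surprising an input's activation trace is, relative to
    the activation traces recorded for the training data.

    at_x:            activation trace of the new input (1-D array).
    train_ats:       training activation traces, one row per training input.
    train_labels:    labels aligned with train_ats.
    predicted_class: class predicted by the DL system for the new input.

    The score is the distance to the nearest training trace of the
    predicted class, divided by that trace's distance to the nearest
    trace of any other class. Larger values mean more surprising.
    """
    same = train_ats[train_labels == predicted_class]
    other = train_ats[train_labels != predicted_class]

    dists_same = np.linalg.norm(same - at_x, axis=1)
    nearest = same[np.argmin(dists_same)]
    dist_to_class = dists_same.min()
    class_separation = np.linalg.norm(other - nearest, axis=1).min()
    return dist_to_class / class_separation

# Toy usage with made-up 2-D "activation traces" for two classes.
rng = np.random.default_rng(0)
train_ats = np.vstack([rng.normal(0.0, 0.1, (50, 2)),   # class 0 cluster
                       rng.normal(1.0, 0.1, (50, 2))])  # class 1 cluster
train_labels = np.array([0] * 50 + [1] * 50)

typical = np.array([0.02, -0.01])    # behaves like class 0 training data
surprising = np.array([0.55, 0.60])  # falls between the two clusters

print(distance_based_surprise(typical, train_ats, train_labels, 0))     # low
print(distance_based_surprise(surprising, train_ats, train_labels, 0))  # higher
```

Per the abstract, scores like these are then used as an adequacy criterion: test inputs are sampled so that they are sufficiently, but not overly, surprising, and retraining on such inputs is what improved robustness against adversarial examples.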