With the increasing adoption of Deep Neural Network (DNN) models as integral
parts of software systems, efficient operational testing of DNNs is much in
demand to ensure these models' actual performance in field conditions. A
challenge is that the testing often needs to produce precise results with a
very limited budget for labeling data collected in field.
Viewing software testing as a practice of reliability estimation through
statistical sampling, we re-interpret the idea behind conventional structural
coverages as conditioning for variance reduction. With this insight we propose
an efficient DNN testing method based on the conditioning on the representation
learned by the DNN model under testing. The representation is defined by the
probability distribution of the output of neurons in the last hidden layer of
the model. To sample from this high dimensional distribution in which the
operational data are sparsely distributed, we design an algorithm leveraging
cross entropy minimization.
Experiments with various DNN models and datasets were conducted to evaluate
the general efficiency of the approach. The results show that, compared with
simple random sampling, this approach requires only about a half of labeled
inputs to achieve the same level of precision.Comment: Published in the Proceedings of the 27th ACM Joint European Software
Engineering Conference and Symposium on the Foundations of Software
Engineering (ESEC/FSE 2019