An Exploratory Study of Field Failures
Field failures, that is, failures caused by faults that escape the testing
phase and manifest in the field, are unavoidable. Improving verification and
validation activities before deployment can identify and promptly remove many,
but not all, faults, and users may still experience a number of annoying
problems while using their software systems. This paper investigates the nature
of field failures to understand to what extent further improving in-house
verification and validation activities can reduce the number of failures in the
field, and it frames the need for new approaches that operate in the field. We
report the results of an analysis of the bug reports of five applications
belonging to three different ecosystems, propose a taxonomy of field failures,
and discuss why failures belonging to the identified classes cannot be detected
at design time but must be addressed at runtime. We observe that many faults
(70%) are intrinsically hard to detect at design time.
Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk
Background. Test resources are usually limited and therefore it is often not
possible to completely test an application before a release. To cope with the
problem of scarce resources, development teams can apply defect prediction to
identify fault-prone code regions. However, defect prediction tends to have low
precision in cross-project prediction scenarios.
Aims. We take an inverse view on defect prediction and aim to identify
methods that can be deferred during testing because they contain hardly any
faults due to their code being "trivial". We expect that characteristics of
such methods might be project-independent, so that our approach could improve
cross-project predictions.
Method. We compute code metrics and apply association rule mining to create
rules for identifying methods with low fault risk. We conduct an empirical
study to assess our approach with six Java open-source projects containing
precise fault data at the method level.
Results. Our results show that inverse defect prediction can identify
approximately 32-44% of the methods of a project as having a low fault risk;
on average, these methods are about six times less likely to contain a fault
than other methods. In
cross-project predictions with larger, more diversified training sets,
identified methods are even eleven times less likely to contain a fault.
Conclusions. Inverse defect prediction supports the efficient allocation of
test resources by identifying methods that can be treated with lower priority
in testing activities, and it is readily applicable in cross-project prediction
scenarios.
Comment: Submitted to PeerJ Computer Science
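
The Method paragraph above describes mining association rules over discretized
code metrics to flag methods with low fault risk. A minimal sketch of that idea
in Python follows; the metrics, bins, support/confidence thresholds, and the
Method record are illustrative assumptions, not the tooling from the paper.

```python
# Sketch: inverse defect prediction via association rule mining.
# Metrics, bins, and thresholds are illustrative assumptions.
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Method:
    name: str
    sloc: int        # source lines of code
    complexity: int  # cyclomatic complexity
    faulty: bool     # ground truth from the method-level fault data

def items(m: Method) -> frozenset:
    """Discretize code metrics into boolean items (bins are assumed)."""
    feats = set()
    if m.sloc <= 3:
        feats.add("sloc<=3")
    if m.complexity <= 1:
        feats.add("complexity<=1")
    return frozenset(feats)

def mine_low_risk_rules(train, min_support=0.1, min_confidence=0.95):
    """Find itemsets X such that the rule X -> 'not faulty' holds with
    sufficient support and confidence on the training methods."""
    rules = []
    universe = set().union(*(items(m) for m in train))
    for size in (1, 2):
        for ante in combinations(sorted(universe), size):
            ante = frozenset(ante)
            covered = [m for m in train if ante <= items(m)]
            if len(covered) / len(train) < min_support:
                continue
            clean = sum(1 for m in covered if not m.faulty)
            if clean / len(covered) >= min_confidence:
                rules.append(ante)
    return rules

def low_fault_risk(m: Method, rules) -> bool:
    """A method is flagged low-risk if any mined rule matches it."""
    return any(rule <= items(m) for rule in rules)

# Usage: mine rules on one project's methods, then flag low-risk methods.
train = [Method("getX", 1, 1, False), Method("parse", 120, 15, True),
         Method("setY", 2, 1, False), Method("eval", 80, 12, True),
         Method("getZ", 3, 1, False)]
rules = mine_low_risk_rules(train)
print([m.name for m in train if low_fault_risk(m, rules)])
```

On data like the above, a rule such as {sloc<=3, complexity<=1} -> not faulty
survives the thresholds, and the methods it matches can then be deprioritized
during testing, which is the intended use in cross-project scenarios.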
Fairness Testing: Testing Software for Discrimination
This paper defines software fairness and discrimination and develops a
testing-based method for measuring if and how much software discriminates,
focusing on causality in discriminatory behavior. Evidence of software
discrimination has been found in modern software systems that recommend
criminal sentences, grant access to financial products, and determine who is
allowed to participate in promotions. Our approach, Themis, generates efficient
test suites to measure discrimination. Given a schema describing valid system
inputs, Themis generates discrimination tests automatically and does not
require an oracle. We evaluate Themis on 20 software systems, 12 of which come
from prior work with explicit focus on avoiding discrimination. We find that
(1) Themis is effective at discovering software discrimination, (2)
state-of-the-art techniques for removing discrimination from algorithms fail in
many situations, at times discriminating against as much as 98% of an input
subdomain, (3) Themis optimizations are effective at producing efficient test
suites for measuring discrimination, and (4) Themis is more efficient on
systems that exhibit more discrimination. We thus demonstrate that fairness
testing is a critical aspect of the software development cycle in domains with
possible discrimination and provide initial tools for measuring software
discrimination.
Comment: Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou. 2017. Fairness
Testing: Testing Software for Discrimination. In Proceedings of the 2017 11th
Joint Meeting of the European Software Engineering Conference and the ACM
SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE '17),
Paderborn, Germany, September 4-8, 2017.
https://doi.org/10.1145/3106237.3106277
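
The abstract above describes the key mechanism: given a schema of valid inputs,
Themis generates paired tests that differ only in a sensitive attribute and
compares the software's outputs against each other, so no external oracle is
needed. The sketch below illustrates this causal style of fairness testing; the
schema, the decide() system under test, and the injected bias are hypothetical
assumptions, and this is not the Themis implementation.

```python
# Sketch: causal discrimination testing in the spirit of Themis.
# The schema, decide() function, and attribute names are hypothetical.
import random

# Schema of valid inputs: attribute name -> possible values (assumed).
SCHEMA = {
    "income": ["low", "medium", "high"],
    "credit_history": ["poor", "fair", "good"],
    "gender": ["female", "male"],
}

def decide(applicant: dict) -> bool:
    """Stand-in for the system under test (a toy loan decision)."""
    score = {"low": 0, "medium": 1, "high": 2}[applicant["income"]]
    score += {"poor": 0, "fair": 1, "good": 2}[applicant["credit_history"]]
    if applicant["gender"] == "female":   # injected bias for the demo
        score -= 1
    return score >= 3

def causal_discrimination(sut, schema, sensitive, trials=1000, seed=0):
    """Fraction of random inputs whose output flips when only the
    sensitive attribute changes. No oracle is needed: the software is
    compared against itself on paired inputs."""
    rng = random.Random(seed)
    flipped = 0
    for _ in range(trials):
        base = {a: rng.choice(vals) for a, vals in schema.items()}
        outputs = set()
        for value in schema[sensitive]:
            outputs.add(sut({**base, sensitive: value}))
        flipped += len(outputs) > 1
    return flipped / trials

print(f"causal discrimination w.r.t. gender: "
      f"{causal_discrimination(decide, SCHEMA, 'gender'):.2%}")
```

This naive version simply samples random inputs; per the abstract, the real
tool additionally applies optimizations to keep the generated test suites
small while still measuring discrimination effectively.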