Amortising the Cost of Mutation Based Fault Localisation using Statistical Inference
Mutation analysis can effectively capture the dependency between source code
and test results. This has been exploited by Mutation Based Fault Localisation
(MBFL) techniques. However, MBFL techniques must pay the high cost of mutation
analysis after a failure has been observed, which may hinder their practical
adoption. We introduce SIMFL (Statistical
Inference for Mutation-based Fault Localisation), an MBFL technique that allows
users to perform the mutation analysis in advance against an earlier version of
the system. SIMFL uses mutants as artificial faults and aims to learn the
failure patterns among test cases against different locations of mutations.
Once a failure is observed, SIMFL requires little to no additional analysis
cost, depending on the inference model used. An
empirical evaluation of SIMFL using 355 faults in Defects4J shows that SIMFL
can successfully localise up to 103 faults at the top, and 152 faults within
the top five, on par with state-of-the-art alternatives. The cost of mutation
analysis can be further reduced by mutation sampling: SIMFL retains over 80% of
its localisation accuracy at the top rank when using only 10% of generated
mutants, compared to results obtained without sampling.
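The idea of learning failure patterns from mutants ahead of time can be illustrated with a minimal sketch. It assumes a kill matrix given as (location, failing tests) pairs, one per mutant, and uses exact matching of the failing test set as the inference model. That model, the function names, and the toy data are assumptions made for illustration; the abstract does not specify which inference models SIMFL uses. The sampling helper mirrors the mutation sampling mentioned above.

```python
import random
from collections import Counter, defaultdict


def build_model(kill_matrix):
    """Index mutants by the exact set of tests that fail on them.

    kill_matrix: iterable of (location, failing_tests) pairs, one per mutant.
    This layout is a hypothetical one chosen for the sketch.
    """
    model = defaultdict(Counter)
    for location, failing_tests in kill_matrix:
        if failing_tests:  # mutants that no test kills carry no signal
            model[frozenset(failing_tests)][location] += 1
    return model


def rank_locations(model, observed_failures):
    """Rank locations by the share of mutants with the same failing tests."""
    counts = model.get(frozenset(observed_failures), Counter())
    total = sum(counts.values())
    scored = [(loc, n / total) for loc, n in counts.items()]
    # Highest score first; break ties by name for a deterministic order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


def sample_mutants(kill_matrix, fraction, seed=0):
    """Keep a random fraction of the mutants to reduce analysis cost."""
    mutants = list(kill_matrix)
    k = max(1, round(len(mutants) * fraction))
    return random.Random(seed).sample(mutants, k)


# Toy kill matrix, computed in advance on an earlier version (made-up data).
kill_matrix = [
    ("Parser.parse", {"t1", "t2"}),
    ("Parser.parse", {"t1", "t2"}),
    ("Lexer.next", {"t1", "t2"}),
    ("Lexer.next", {"t3"}),
    ("Util.trim", set()),
]
model = build_model(kill_matrix)

# Once t1 and t2 fail on the current version, ranking is only a lookup.
ranking = rank_locations(model, {"t1", "t2"})
print(ranking)
```

Because the model is built before any failure occurs, the work done after a failure is a dictionary lookup and a sort, which matches the low post-failure cost the abstract describes. An exact-match model returns an empty ranking for failure patterns it has never seen, so a practical model would presumably need some form of partial matching.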
Learning Test-Mutant Relationship for Accurate Fault Localisation
Context: Automated fault localisation aims to assist developers in the task
of identifying the root cause of the fault by narrowing down the space of
likely fault locations. Simulating variants of the faulty program called
mutants, several Mutation Based Fault Localisation (MBFL) techniques have been
proposed to automatically locate faults. Despite their success, existing MBFL
techniques suffer from the cost of performing mutation analysis after the fault
is observed. Method: To overcome this shortcoming, we propose a new MBFL
technique named SIMFL (Statistical Inference for Mutation-based Fault
Localisation). SIMFL localises faults based on the past results of mutation
analysis performed on an earlier version in the project history,
allowing developers to make predictions on the location of incoming faults in a
just-in-time manner. Using several statistical inference methods, SIMFL models
the relationship between test results of the mutants and their locations, and
subsequently infers the location of the current faults. Results: The empirical
study on the Defects4J dataset shows that SIMFL can localise 113 faults on the
first rank out of 224 faults, outperforming other MBFL techniques. Even when
SIMFL is trained on the predicted kill matrix, SIMFL can still localise 95
faults on the first rank out of 194 faults. Moreover, removing redundant
mutants significantly improves the localisation accuracy of SIMFL, increasing
the number of faults localised at the first rank by up to 51. Conclusion: This paper proposes
a new MBFL technique called SIMFL, which exploits ahead-of-time mutation
analysis to localise current faults. SIMFL is not only cost-effective, as it
does not need a mutation analysis after the fault is observed, but also capable
of localising faults accurately.
Comment: Paper accepted for publication at IST. arXiv admin note: substantial
text overlap with arXiv:1902.0972
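Results such as "113 faults on the first rank out of 224" count the faults whose true location appears within the top n positions of the ranking. A minimal sketch of that count follows. The function name, the data layout, and the fault identifiers are hypothetical, and tied scores are not handled, which a real evaluation would need to address.

```python
def acc_at_n(rankings, true_locations, n):
    """Count faults whose true location is among the first n ranked entries.

    rankings: fault id -> list of locations, most suspicious first.
    true_locations: fault id -> the location that is actually faulty.
    """
    hits = 0
    for fault_id, ranked in rankings.items():
        if true_locations[fault_id] in ranked[:n]:
            hits += 1
    return hits


# Made-up rankings for three faults, for illustration only.
rankings = {
    "bug-1": ["A.f", "B.g"],
    "bug-2": ["C.h", "A.f", "D.k"],
    "bug-3": ["B.g"],
}
true_locations = {"bug-1": "A.f", "bug-2": "A.f", "bug-3": "D.k"}

top1 = acc_at_n(rankings, true_locations, 1)
top5 = acc_at_n(rankings, true_locations, 5)
print(top1, top5)
```

In this toy data only bug-1 is localised at the first rank, bug-2 is found within the top five, and bug-3 is never localised.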
Challenges in Survey Research
Although survey research is an important and frequently used research method,
it has been discussed less often at a methodological level in empirical software
engineering than other types of research. This chapter compiles a set of
important and challenging issues in survey research based on experiences with
several large-scale international surveys. The chapter covers theory building,
sampling, invitation and follow-up, statistical as well as qualitative analysis
of survey data and the usage of psychometrics in software engineering surveys.
Comment: Accepted version of chapter in the upcoming book on Contemporary
Empirical Methods in Software Engineering. Update includes revision of typos
and additional figures. Last update includes fixing two small issues and
typo