Experience in Predicting Fault-Prone Software Modules Using Complexity Metrics

Abstract

Complexity metrics have been intensively studied in predicting fault-prone software modules. However, little work is done in studying how to effectively use the complexity metrics and the prediction models under realistic conditions. In this paper, we present a study showing how to utilize the prediction models generated from existing projects to improve the fault detection on other projects. The binary logistic regression method is used in studying publicly available data of five commercial products. Our study shows (1) models generated using more datasets can improve the prediction accuracy but not the recall rate; (2) lowering the cut-off value can improve the recall rate, but the number of false positives will be increased, which will result in higher maintenance effort. We further suggest that in order to improve model prediction efficiency, the selection of source datasets and the determination of cut-off values should be based on specific properties of a project. So far, there are no general rules that have been found and reported to follow

    Similar works