2 research outputs found

    Software defect prediction using maximal information coefficient and fast correlation-based filter feature selection

    Get PDF
    Software quality ensures that applications that are developed are failure free. Some modern systems are intricate, due to the complexity of their information processes. Software fault prediction is an important quality assurance activity, since it is a mechanism that correctly predicts the defect proneness of modules and classifies modules that saves resources, time and developers’ efforts. In this study, a model that selects relevant features that can be used in defect prediction was proposed. The literature was reviewed and it revealed that process metrics are better predictors of defects in version systems and are based on historic source code over time. These metrics are extracted from the source-code module and include, for example, the number of additions and deletions from the source code, the number of distinct committers and the number of modified lines. In this research, defect prediction was conducted using open source software (OSS) of software product line(s) (SPL), hence process metrics were chosen. Data sets that are used in defect prediction may contain non-significant and redundant attributes that may affect the accuracy of machine-learning algorithms. In order to improve the prediction accuracy of classification models, features that are significant in the defect prediction process are utilised. In machine learning, feature selection techniques are applied in the identification of the relevant data. Feature selection is a pre-processing step that helps to reduce the dimensionality of data in machine learning. Feature selection techniques include information theoretic methods that are based on the entropy concept. This study experimented the efficiency of the feature selection techniques. It was realised that software defect prediction using significant attributes improves the prediction accuracy. A novel MICFastCR model, which is based on the Maximal Information Coefficient (MIC) was developed to select significant attributes and Fast Correlation Based Filter (FCBF) to eliminate redundant attributes. Machine learning algorithms were then run to predict software defects. The MICFastCR achieved the highest prediction accuracy as reported by various performance measures.School of ComputingPh. D. (Computer Science

    Performance of Defect Prediction in Rapidly Evolving Software

    No full text
    Defect prediction techniques allow spotting modules (or commits) likely to contain (introduce) a defect by training models with product or process metrics – thus supporting testing, code integration, and release decisions. When applied to processes where software changes rapidly, conventional techniques might fail, as trained models are not thought to evolve along with the software. In this study, we analyze the performance of defect prediction in rapidly evolving software. Framed in a high commit frequency context, we set up an approach to continuously refine prediction models by using new commit data, and predict whether or not an attempted commit is going to introduce a bug. An experiment is set up on the Eclipse JDT software to assess the prediction ability trend. Results enable to leverage defect prediction potentials in modern development paradigms with short release cycle and high code variability
    corecore