3,817 research outputs found
Software defect prediction: do different classifiers find the same defects?
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes is captured in a confusion matrix and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.Peer reviewedFinal Published versio
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
Evaluating prediction systems in software project estimation
This is the Pre-print version of the Article - Copyright @ 2012 ElsevierContext: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results.
Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems.
Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random ‘predictions’, that is guessing, and (3) calculation of effect sizes.
Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing.
Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners.Martin Shepperd was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/H050329
The impact of using biased performance metrics on software defect prediction research
Context: Software engineering researchers have undertaken many experiments
investigating the potential of software defect prediction algorithms.
Unfortunately, some widely used performance metrics are known to be
problematic, most notably F1, but nevertheless F1 is widely used.
Objective: To investigate the potential impact of using F1 on the validity of
this large body of research.
Method: We undertook a systematic review to locate relevant experiments and
then extract all pairwise comparisons of defect prediction performance using F1
and the un-biased Matthews correlation coefficient (MCC).
Results: We found a total of 38 primary studies. These contain 12,471 pairs
of results. Of these, 21.95% changed direction when the MCC metric is used
instead of the biased F1 metric. Unfortunately, we also found evidence
suggesting that F1 remains widely used in software defect prediction research.
Conclusions: We reiterate the concerns of statisticians that the F1 is a
problematic metric outside of an information retrieval context, since we are
concerned about both classes (defect-prone and not defect-prone units). This
inappropriate usage has led to a substantial number (more than one fifth) of
erroneous (in terms of direction) results. Therefore we urge researchers to (i)
use an unbiased metric and (ii) publish detailed results including confusion
matrices such that alternative analyses become possible.Comment: Submitted to the journal Information & Software Technology. It is a
greatly extended version of "Assessing Software Defection Prediction
Performance: Why Using the Matthews Correlation Coefficient Matters"
presented at EASE 202
- …