Collecting quality data from software projects can be time-consuming and
expensive. Hence, some researchers explore "unsupervised" approaches to quality
prediction that do not require labelled data. An alternative is to
use "supervised" approaches that learn models from project data labelled with,
say, "defective" or "not-defective". Most researchers use these supervised
models since, it is argued, they can exploit more knowledge of the projects.
At FSE'16, Yang et al. reported startling results where unsupervised defect
predictors outperformed supervised predictors for effort-aware just-in-time
defect prediction. If confirmed, these results would lead to a dramatic
simplification of a seemingly complex task (data mining) that is widely
explored in the software engineering literature.
This paper repeats and refutes those results as follows. (1) There is much
variability in the efficacy of the Yang et al. predictors, so even with their
approach, some supervised data is required to prune away the weaker predictors.
(2) Their findings were grouped across N projects. When we repeat their
analysis on a project-by-project basis, supervised predictors are seen to work
better.
Even though this paper rejects the specific conclusions of Yang et al., we
still endorse their general goal. In our experiments, supervised predictors
did not perform outstandingly better than unsupervised ones for effort-aware
just-in-time defect prediction. Hence, there may indeed be some combination of
unsupervised learners that achieves performance comparable to supervised ones. We
therefore encourage others to work in this promising area.