Automatic Unsupervised Bug Report Categorization

Abstract

2014 6th International Workshop on Empirical Software Engineering in Practice, 12-13 Nov. 2014, Osaka, JapanBackground: Information in bug reports is implicit and therefore difficult to comprehend. To extract its meaning, some processes are required. Categorizing bug reports is a technique that can help in this regard. It can be used to help in the bug reports management or to understand the underlying structure of the desired project. However, most researches in this area are focusing on a supervised learning approach that still requires a lot of human afford to prepare a training data. Aims: Our aim is to develop an automated framework than can categorize bug reports, according to their hidden characteristics and structures, without the needed for training data. Method: We solve this problem using clustering, unsupervised learning approach. It can automatically group bug reports together based on their textual similarity. We also propose a novel method to label each group with meaningful and representative names. Results: Experiment results show that our framework can achieve performance comparable to the supervised learning approaches. We also show that our labeling process can label each cluster with representative names according to its characteristic. Conclusion: Our framework could be used as an automated categorization system that can be applied without prior knowledge or as an automated labeling suggestion system

Similar works

This paper was published in NAIST Academic Repository.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.