The Impact of Systematic Edits in History Slicing
When extracting a subset of a commit history, specifying the necessary
portion is a time-consuming task for developers. Several commit-based history
slicing techniques have been proposed to identify dependencies between commits
and to extract a related set of commits using a specific commit as a slicing
criterion. However, the resulting subset of commits becomes large if the
history contains commits for systematic edits whose changes do not depend on
each other. We empirically investigated the impact of systematic edits on
history slicing. In this study, commits in which systematic edits were
detected are split per file so that unnecessary dependencies between commits
are eliminated. Across several histories of open-source systems, the size of
history slices was reduced by 13.3-57.2% on average after splitting the
commits for systematic edits.

Comment: 5 pages, MSR 201
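The core idea of the abstract can be illustrated with a minimal sketch (not the authors' implementation): model the history as a dependency graph over commits, take the slice as the transitive closure from a slicing criterion, and observe that splitting a multi-file systematic-edit commit into per-file commits shrinks the slice. All commit names and dependencies below are hypothetical.

```python
def slice_history(deps, criterion):
    """Transitive closure of commit dependencies, starting from the criterion."""
    seen, stack = set(), [criterion]
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        stack.extend(deps.get(c, ()))
    return seen

# Hypothetical history: c2 is a systematic edit touching a.py and b.py.
# c3 depends on c2 only through a.py, but c2 depends on c1 through b.py,
# so c1 is dragged into the slice for c3.
deps = {"c3": ["c2"], "c2": ["c1"]}
print(sorted(slice_history(deps, "c3")))   # ['c1', 'c2', 'c3']

# After splitting c2 per file into c2a (a.py) and c2b (b.py),
# the unnecessary dependency on c1 disappears from the slice.
deps_split = {"c3": ["c2a"], "c2b": ["c1"]}
print(sorted(slice_history(deps_split, "c3")))  # ['c2a', 'c3']
```

The per-file split removes the artificial edge through the unrelated file, which is exactly the source of the 13.3-57.2% slice-size reductions the study reports.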
Method-Level Bug Severity Prediction using Source Code Metrics and LLMs
In the past couple of decades, significant research effort has been devoted
to the prediction of software bugs. However, most existing work in this domain
treats all bugs the same, which is not the case in practice. It is important
for a defect prediction method to estimate the severity of the identified bugs
so that the higher-severity ones receive immediate attention. In this study,
we investigate source code metrics, source code representations from large
language models (LLMs), and their combination for predicting bug severity
labels in two prominent datasets. We leverage several source code metrics at
method-level granularity to train eight different machine-learning models. Our
results suggest that the Decision Tree and Random Forest models outperform the
others across several evaluation metrics. We then use the pre-trained CodeBERT
LLM to study the effectiveness of source code representations in predicting
bug severity. Fine-tuning CodeBERT improves the bug severity prediction
results significantly, by 29%-140% across several evaluation metrics, compared
to the best classic prediction model trained on source code metrics. Finally,
we integrate source code metrics into CodeBERT as an additional input using
our two proposed architectures, both of which enhance the effectiveness of the
CodeBERT model.
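The classic-model half of the pipeline can be sketched as follows; this is an illustrative toy, not the study's setup. The metric names, feature values, and severity labels are all hypothetical, and scikit-learn stands in for whichever library the authors used.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Hypothetical method-level metrics per buggy method:
# [lines_of_code, cyclomatic_complexity, fan_out]
X = [
    [12, 1, 2],
    [85, 9, 7],
    [40, 4, 3],
    [150, 14, 11],
    [20, 2, 1],
    [95, 11, 8],
]
# Hypothetical severity labels (the real labels are dataset-specific).
y = ["low", "high", "medium", "high", "low", "high"]

# Two of the eight classifiers the abstract singles out as strongest.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Predict the severity of a new, unseen method's metrics.
print(tree.predict([[110, 12, 9]])[0])
print(forest.predict([[110, 12, 9]])[0])
```

In the study, such metric-based models serve as the baseline that fine-tuned CodeBERT then outperforms by 29%-140% on several evaluation metrics.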