Background: As improving code review (CR) effectiveness is a priority for
many software development organizations, projects have deployed CR analytics
platforms to identify potential improvement areas. The number of issues
identified, which is a crucial metric to measure CR effectiveness, can be
misleading if all issues are placed in the same bin. Therefore, a finer-grained
classification of issues identified during CRs can provide actionable insights
to improve CR effectiveness. Although recent work by Fregnan et al. proposed
automated models to classify CR-induced changes, we have noticed two potential
improvement areas: i) classifying comments that do not induce changes and ii)
using deep neural networks (DNNs) in conjunction with code context to improve
performance. Aims: This study aims to develop an automated CR comment
classifier that leverages DNN models to achieve more reliable performance than
the approach of Fregnan et al. Method: Using a manually labeled dataset of 1,828 CR
comments, we trained and evaluated supervised learning-based DNN models
leveraging code context, comment text, and a set of code metrics to classify CR
comments into one of the five high-level categories proposed by Turzo and Bosu.
Results: In 10-fold cross-validation of multiple combinations of tokenization
approaches, a model using CodeBERT achieved the best accuracy, 59.3%. Our
approach outperforms that of Fregnan et al., achieving 18.7% higher accuracy.
Conclusion: Besides facilitating
improved CR analytics, our proposed model can be useful for developers in
prioritizing code review feedback and selecting reviewers.
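
To make the evaluation protocol concrete, the following is a minimal sketch of how a 10-fold cross-validated accuracy can be computed for any comment classifier. It is not the authors' pipeline: the function names (`k_fold_indices`, `cross_validated_accuracy`, `train_majority_baseline`) are hypothetical, the folds are a plain shuffled split rather than a stratified one, and a majority-class baseline stands in for the CodeBERT-based model. It assumes the dataset has at least `k` samples.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(samples, labels, train_fn, k=10, seed=0):
    """Mean held-out accuracy over k folds.

    train_fn(train_samples, train_labels) must return a function that
    maps one sample to a predicted label.
    """
    folds = k_fold_indices(len(samples), k, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held_out_set]
        predict = train_fn(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(predict(samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def train_majority_baseline(train_samples, train_labels):
    """Placeholder model: always predicts the most frequent training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda sample: most_common_label
```

Any trainable classifier can replace `train_majority_baseline`, as long as it returns a per-sample prediction function. A baseline of this kind is useful mainly as a reference point against which a reported accuracy can be read.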