The success of a Pull Request (PR) depends on the responsiveness of the
maintainers and the contributor during the review process. Being aware of the
expected waiting times can lead to better interactions and managed expectations
for both the maintainers and the contributor. In this paper, we propose a
machine-learning approach to predict the first response latency of the
maintainers following the submission of a PR, and the first response latency of
the contributor after receiving the first response from the maintainers. We
curate a dataset of 20 large and popular open-source projects on GitHub and
extract 21 features to characterize projects, contributors, PRs, and review
processes. Using these features, we then evaluate seven types of classifiers to
identify the best-performing models. We also perform permutation feature
importance and SHAP analyses to understand the importance and impact of
different features on the predicted response latencies. Our best-performing
models achieve an average improvement of 33% in AUC-ROC and 58% in AUC-PR for
maintainers, as well as 42% in AUC-ROC and 95% in AUC-PR for contributors
compared to a no-skill classifier across the projects. Our findings indicate
that PRs submitted earlier in the week, containing an average or slightly
above-average number of commits, and with concise descriptions are more likely
to receive faster first responses from the maintainers. Similarly, PRs whose
first response from the maintainers arrives quickly and earlier in the week,
and that contain an average or slightly above-average number of commits, tend
to receive faster first responses from the
contributors. Additionally, contributors with a higher acceptance rate and a
history of timely responses in the project are likely to both obtain and
provide faster first responses.

Comment: Manuscript submitted to IEEE Transactions on Software Engineering (TSE)
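The abstract reports performance as AUC-ROC and AUC-PR improvements over a no-skill baseline, together with a permutation feature importance analysis. As a rough illustration only, the sketch below shows how such an evaluation could be set up with scikit-learn; the library choice, file name, label and column names, and the random-forest classifier are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch: scoring a response-latency classifier against a
# no-skill baseline and ranking features by permutation importance.
# Data file, column names, and model choice are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Assumed input: one row per PR with engineered features and a binary label
# (e.g., 1 = "slow first response", 0 = "fast first response").
df = pd.read_csv("pr_features.csv")            # hypothetical file
X = df.drop(columns=["slow_first_response"])   # hypothetical label column
y = df["slow_first_response"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=300, random_state=42)
model.fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]

# Model performance on the held-out set.
auc_roc = roc_auc_score(y_test, scores)
auc_pr = average_precision_score(y_test, scores)

# No-skill baseline: AUC-ROC is 0.5; AUC-PR equals the positive-class prevalence.
baseline_roc = 0.5
baseline_pr = y_test.mean()
print(f"AUC-ROC improvement over no-skill: {(auc_roc - baseline_roc) / baseline_roc:.0%}")
print(f"AUC-PR improvement over no-skill:  {(auc_pr - baseline_pr) / baseline_pr:.0%}")

# Permutation feature importance: drop in AUC-ROC when each feature is shuffled.
perm = permutation_importance(model, X_test, y_test,
                              scoring="roc_auc", n_repeats=10, random_state=42)
for name, imp in sorted(zip(X.columns, perm.importances_mean),
                        key=lambda t: -t[1])[:5]:
    print(f"{name}: {imp:.3f}")
```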