Comments on online articles provide extended views and improve user
engagement. Automatically generating comments has thus become a valuable
capability for online forums, intelligent chatbots, etc. This paper proposes the new task
of automatic article commenting, and introduces a large-scale Chinese dataset
with millions of real comments and a human-annotated subset characterizing the
comments' varying quality. Incorporating human preferences on comment quality, we
further develop automatic metrics that generalize a broad set of popular
reference-based metrics and exhibit greatly improved correlations with human
evaluations.

Comment: ACL 2018; with supplements; dataset link available in the paper