Rating consistency is consistently underrated: An exploratory analysis of movie-tag rating inconsistency

Abstract

Content-based and hybrid recommender systems rely on item-tag ratings to make recommendations. An example of an item-tag rating is the degree to which the tag "comedy" applies to the movie "Back to the Future (1985)". Ratings are often generated by human annotators who can be inconsistent with one another. However, many recommender systems take item-tag ratings at face value, assuming them all to be equally valid. In this paper, we investigate the inconsistency of item-tag ratings together with contextual factors that could affect consistency in the movie domain. We conducted semi-structured interviews to identify potential reasons for rating inconsistency. Next, we used these reasons to design a survey, which we ran on Amazon Mechanical Turk. We collected 6,070 ratings from 665 annotators across 142 movies and 80 tags. Our analysis shows that ∼45% of ratings are inconsistent with the mode rating for a given movie-tag pair. We found that the single most important factor for rating inconsistency is the annotator's perceived ease of rating, suggesting that annotators are at least tacitly aware of the quality of their own ratings. We also found that subjective tags (e.g., "funny", "boring") are more inconsistent than objective tags (e.g., "robots", "aliens"), and are associated with lower tag familiarity and lower perceived ease of rating.
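The inconsistency measure described above can be made concrete with a short sketch: a rating counts as inconsistent if it differs from the mode rating of its movie-tag pair. The snippet below is a minimal illustration in Python with pandas, using a toy dataset and an assumed column schema (the abstract does not specify the actual data format or rating scale).

import pandas as pd

# Toy data (hypothetical): one row per annotator rating of how well a tag
# applies to a movie, here on an assumed 1-5 applicability scale.
ratings = pd.DataFrame({
    "movie":  ["Back to the Future (1985)"] * 5 + ["Alien (1979)"] * 4,
    "tag":    ["comedy"] * 5 + ["aliens"] * 4,
    "rating": [4, 4, 3, 4, 2, 5, 5, 4, 5],
})

# Flag each rating that differs from the mode rating of its movie-tag pair;
# mode().iloc[0] breaks ties toward the lowest rating value.
off_mode = ratings.groupby(["movie", "tag"])["rating"].transform(
    lambda g: g != g.mode().iloc[0]
)

print(f"Inconsistent ratings: {off_mode.mean():.1%}")

On the paper's full dataset of 6,070 ratings, this fraction is the reported ∼45%.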
