In this work, we introduce a new research challenge: evaluating
positive and potentially harmful messages in music products. We begin by
establishing a multi-faceted, multi-task benchmark for music content assessment.
We then introduce an efficient multi-task predictive model with
ordinality enforcement to address this challenge. Our findings reveal that
the proposed method not only significantly outperforms strong task-specific
alternatives but can also assess multiple aspects
simultaneously. Furthermore, through detailed case studies in which we employ
Large Language Models (LLMs) as surrogates for content assessment, we provide
valuable insights to inform and guide future research on this topic. The code
for dataset creation and model implementation is publicly available at
https://github.com/RiTUAL-UH/music-message-assessment.
Comment: Accepted at LREC-COLING 2024 (long paper)