News recommendation plays a critical role in shaping the public's worldviews
through the way in which it filters and disseminates information about
different topics. Given the crucial role that media plays in opinion
formation, especially for sensitive topics, understanding the effects of
personalized recommendation beyond accuracy has become essential in today's
digital society. In this work, we present NeMig, a bilingual news collection on
the topic of migration, and corresponding rich user data. In comparison to
existing news recommendation datasets, which comprise a large variety of
monolingual news, NeMig covers articles on a single controversial topic,
published in both Germany and the US. We annotate the sentiment polarization of
the articles and the political leanings of the media outlets, in addition to
extracting subtopics and named entities disambiguated through Wikidata. These
features can be used to analyze the effects of algorithmic news curation beyond
accuracy-based performance, such as recommender biases and the creation of
filter bubbles. We construct domain-specific knowledge graphs from the news
text and metadata, thus encoding knowledge-level connections between articles.
Importantly, while existing datasets include only click behavior, we collect
user socio-demographic and political information in addition to explicit click
feedback. We demonstrate the utility of NeMig through experiments on the tasks
of news recommender benchmarking, analysis of biases in recommenders, and news
trends analysis. NeMig aims to provide a useful resource for the news
recommendation community and to foster interdisciplinary research into the
multidimensional effects of algorithmic news curation.

Comment: Accepted at the 11th International Workshop on News Recommendation
and Analytics (INRA 2023) in conjunction with ACM RecSys 202