The Crowdsourced “Classics” and the Revealing Limits of Goodreads Data

Abstract

This presentation draws on forthcoming work on the Goodreads “classics.” Goodreads, a subsidiary of Amazon, is the largest social networking site for readers on the internet (90 million users). The “classics” are one of the most active Goodreads categories and include some of the most rated and reviewed books on the entire site. Why are the classics so popular on Goodreads? Which books have readers “shelved” as classics most often? What do the classics mean to contemporary readers? We use computational methods such as topic modeling to investigate these questions and more. We also interrogate the limits of Goodreads data and the influence of Goodreads/Amazon's proprietary algorithms on reviews. We find, for example, that reviews sorted by the default algorithm tend to be longer, more socially conscientious (e.g., they include a spoiler alert), and written by a smaller set of Goodreads users. Extrapolating from these findings, we argue that computational methods can provide a way of documenting, understanding, and critiquing algorithmic culture and its effects.
