NovelTM Datasets for English-Language Fiction, 1700-2009

Abstract

This report describes a collection of 210,305 volumes of fiction that researchers are encouraged to borrow for their own work. Alternatively, readers can simply browse the report as a description of English-language fiction in HathiTrust Digital Library. For instance, how does the proportion of fiction written by British authors or by women change across time? We also divide nineteenth- and twentieth-century fiction into seven subsets with different emphases (for instance, one where men and women are represented equally, and one composed of only the most prominent and widely held books). Comparing the pictures produced by these different samples allows us to assess the fragility of recent quantitative arguments about literary history. Preprint version of an article to appear in the Journal of Cultural Analytics.
