Reader reviews of literary fiction on social media, especially those in
persistent, dedicated forums, create and are in turn driven by underlying
narrative frameworks. In their comments about a novel, readers generally
include only a subset of characters and their relationships, thus offering a
limited perspective on that work. Yet in aggregate, these reviews capture an
underlying narrative framework comprised of different actants (people, places,
things), their roles, and interactions that we label the "consensus narrative
framework". We represent this framework in the form of an actant-relationship
story graph. Extracting this graph is a challenging computational problem,
which we pose as a latent graphical model estimation problem. Posts and reviews
are viewed as samples of sub graphs/networks of the hidden narrative framework.
Inspired by the qualitative narrative theory of Greimas, we formulate a
graphical generative Machine Learning (ML) model where nodes represent actants,
and multi-edges and self-loops among nodes capture context-specific
relationships. We develop a pipeline of interlocking automated methods to
extract key actants and their relationships, and apply it to thousands of
reviews and comments posted on Goodreads.com. We manually derive the ground
truth narrative framework from SparkNotes, and then use word embedding tools to
compare relationships in ground truth networks with our extracted networks. We
find that our automated methodology generates highly accurate consensus
narrative frameworks: for our four target novels, with approximately 2900
reviews per novel, we report average coverage/recall of important relationships
of > 80% and an average edge detection rate of >89\%. These extracted narrative
frameworks can generate insight into how people (or classes of people) read and
how they recount what they have read to others