An Automated Pipeline for Character and Relationship Extraction from Readers' Literary Book Reviews on Goodreads.com
Reader reviews of literary fiction on social media, especially those in
persistent, dedicated forums, create and are in turn driven by underlying
narrative frameworks. In their comments about a novel, readers generally
include only a subset of characters and their relationships, thus offering a
limited perspective on that work. Yet in aggregate, these reviews capture an
underlying narrative framework comprised of different actants (people, places,
things), their roles, and interactions that we label the "consensus narrative
framework". We represent this framework in the form of an actant-relationship
story graph. Extracting this graph is a challenging computational problem,
which we pose as a latent graphical model estimation problem. Posts and reviews
are viewed as samples of subgraphs/networks of the hidden narrative framework.
Inspired by the qualitative narrative theory of Greimas, we formulate a
graphical generative Machine Learning (ML) model where nodes represent actants,
and multi-edges and self-loops among nodes capture context-specific
relationships. We develop a pipeline of interlocking automated methods to
extract key actants and their relationships, and apply it to thousands of
reviews and comments posted on Goodreads.com. We manually derive the ground
truth narrative framework from SparkNotes, and then use word embedding tools to
compare relationships in ground truth networks with our extracted networks. We
find that our automated methodology generates highly accurate consensus
narrative frameworks: for our four target novels, with approximately 2900
reviews per novel, we report average coverage/recall of important relationships
of >80% and an average edge detection rate of >89%. These extracted narrative
frameworks can generate insight into how people (or classes of people) read and
how they recount what they have read to others.
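The idea of treating each review as a sample of a hidden story graph can be illustrated with a minimal sketch. It assumes that each review has already been reduced to (actant, relationship, actant) triples. The function name `consensus_graph`, the `min_support` threshold, and the toy triples are hypothetical and are not taken from the paper, whose pipeline uses a generative model rather than simple counting.

```python
from collections import Counter

def consensus_graph(review_triples, min_support=2):
    """Aggregate per-review (source, relationship, target) triples into a multigraph.

    Returns a dict mapping (source, target) to {relationship: review_count}.
    Several relationships on one pair act as multi-edges; source == target is a self-loop.
    """
    counts = Counter()
    for triples in review_triples:
        # Count each distinct triple at most once per review.
        for triple in set(triples):
            counts[triple] += 1

    graph = {}
    for (src, rel, dst), n in counts.items():
        if n >= min_support:  # keep only relationships seen in enough reviews
            graph.setdefault((src, dst), {})[rel] = n
    return graph

# Toy data (hypothetical): four reviews, each a partial view of the same story.
reviews = [
    [("hero", "fights", "monster"), ("hero", "doubts", "hero")],
    [("hero", "fights", "monster")],
    [("hero", "fears", "monster"), ("hero", "fights", "monster"),
     ("hero", "fears", "monster")],
    [("hero", "fears", "monster"), ("hero", "doubts", "hero")],
]

graph = consensus_graph(reviews, min_support=2)
```

No single review contains every relationship, but the aggregate keeps two multi-edges between "hero" and "monster" and one self-loop on "hero".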
The Stylometric Processing of Sensory Open Source Data
This research project ultimately focuses on the Lone Wolf Terrorist.
The project uses an exploratory approach to the
self-radicalisation problem by creating a stylistic fingerprint
of a person's personality, or self, from subtle characteristics
hidden in a person's writing style. It separates the identity of
one person from another based on their writing style. It also
separates the writings of suicide attackers from ‘normal’
bloggers by critical slowing down, a dynamical property used to
develop early warning signs of tipping points. It identifies
changes in a person's moods, or shifts from one state to another,
that might indicate a tipping point for self-radicalisation.
Research into authorship identity using personality is a
relatively new area in the field of neurolinguistics. There are
very few methods that model how an individual's cognitive
functions present themselves in writing. Here, we develop a
novel algorithm, RPAS, which draws on cognitive functions such as
aging, sensory processing, abstract or concrete thinking through
referential activity emotional experiences, and a person's
internal gender for identity. We use well-known techniques such
as Principal Component Analysis, Linear Discriminant Analysis,
and the Vector Space Method to cluster multiple
anonymous-authored works. Here we use a new approach, using
seriation with noise to separate subtle features in individuals.
We conduct time series analysis using modified variants of 1-lag
autocorrelation and the coefficient of skewness, two statistical
metrics that change near a tipping point, to track serious life
events in an individual through cognitive linguistic markers.
In our journey of discovery, we uncover secrets about the
Elizabethan playwrights hidden for over 400 years. We uncover
markers for depression and anxiety in modern-day writers and
identify linguistic cues for Alzheimer's disease much earlier
than other studies using sensory processing. Applying these
techniques to the Lone Wolf, we can distinguish the writing style
they used before their attacks from other writing.
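The two early-warning statistics named in this abstract, 1-lag autocorrelation and the coefficient of skewness, can be sketched over a sliding window as follows. This shows only the standard textbook definitions; the abstract says the authors use modified variants, which are not reproduced here. The function names and the `window` parameter are illustrative assumptions.

```python
from statistics import mean

def lag1_autocorrelation(xs):
    """Standard lag-1 autocorrelation; returns 0.0 for a constant sequence."""
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(len(xs) - 1))
    return num / denom

def skewness(xs):
    """Population coefficient of skewness (third standardized moment)."""
    m = mean(xs)
    n = len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    if m2 == 0:
        return 0.0
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def rolling(xs, window, stat):
    """Apply stat to every full window of xs, sliding one step at a time."""
    return [stat(xs[i:i + window]) for i in range(len(xs) - window + 1)]
```

In the critical-slowing-down literature, a sustained rise in either rolling statistic is read as a possible sign of an approaching tipping point. Here `xs` would be some per-document linguistic measurement ordered in time, which is an assumption about how such a series might be built.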
Presenting GECO: an eyetracking corpus of monolingual and bilingual sentence reading
This paper introduces GECO, the Ghent Eye-tracking Corpus, a monolingual and bilingual corpus of eye-tracking data of participants reading a complete novel. English monolinguals and Dutch-English bilinguals read an entire novel, which was presented in paragraphs on the screen. The bilinguals read half of the novel in their first language, and the other half in their second language. In this paper we describe the distributions and descriptive statistics of the most important reading time measures for the two groups of participants. This large eye-tracking corpus is well suited both to exploratory purposes and to more directed hypothesis testing, and it can guide the formulation of ideas and theories about naturalistic reading processes in a meaningful context. Most importantly, this corpus has the potential to evaluate the generalizability of monolingual and bilingual language theories and models to reading of long texts and narratives.
How robust is the language architecture? The case of mood
In neurocognitive research on language, the processing principles of the system at hand are usually assumed to be relatively invariant. However, research on attention, memory, decision-making, and social judgment has shown that mood can substantially modulate how the brain processes information. For example, in a bad mood, people typically have a narrower focus of attention and rely less on heuristics. In the face of such pervasive mood effects elsewhere in the brain, it seems unlikely that language processing would remain untouched. In an EEG experiment, we manipulated the mood of participants just before they read texts that confirmed or disconfirmed verb-based expectations about who would be talked about next (e.g., that “David praised Linda because … ” would continue about Linda, not David), or that respected or violated a syntactic agreement rule (e.g., “The boys turns”). ERPs showed that mood had little effect on syntactic parsing, but did substantially affect referential anticipation: whereas readers anticipated information about a specific person when they were in a good mood, a bad mood completely abolished such anticipation. A behavioral follow-up experiment suggested that a bad mood did not interfere with verb-based expectations per se, but prevented readers from using that information rapidly enough to predict upcoming reference on the fly, as the sentence unfolds. In all, our results reveal that background mood, a rather unobtrusive affective state, selectively changes a crucial aspect of real-time language processing. This observation fits well with other observed interactions between language processing and affect (emotions, preferences, attitudes, mood), and more generally testifies to the importance of studying “cold” cognitive functions in relation to “hot” aspects of the brain.
The quality of writing tasks and students' use of academic language in Spanish
This study investigates the quality of the writing tasks assigned to native Spanish speakers in bilingual (Spanish-English) contexts, and the relationship between task quality and students' use of an academic register in their native language. Fifty-six language arts tasks were collected from 26 grade 4 and 5 teachers, and four student writing samples were collected in response to each task (N = 224). Multilevel modeling revealed that variation in students' use of key features of academic language in their writing was associated with the cognitive demand of writing tasks. Findings suggest that students' opportunities to respond to challenging tasks when writing in their native language are rare and that the rigor of writing tasks may relate to students' production and development of academic language. © 2012 by The University of Chicago. All rights reserved
Mobile phones and reading for enjoyment: evidence of use and behaviour change
A South African non-profit organisation, FunDza, launched a programme that delivers reading material via mobile phones. Computer log files of user activity over an eight-month period were analysed (N = 9,212,716), which showed that relatively large numbers of readers made use of the material (N = 65,533), and read a substantial amount of the material. We found evidence of positive shifts in reading behaviour. Further analysis showed that greater levels of participation in the programme were associated with greater enjoyment of reading. Furthermore, the longer participants read, the more confident they felt about their self-rated reading proficiency.