An Automated Pipeline for Character and Relationship Extraction from Readers' Literary Book Reviews on Goodreads.com
Reader reviews of literary fiction on social media, especially those in
persistent, dedicated forums, create and are in turn driven by underlying
narrative frameworks. In their comments about a novel, readers generally
include only a subset of characters and their relationships, thus offering a
limited perspective on that work. Yet in aggregate, these reviews capture an
underlying narrative framework comprised of different actants (people, places,
things), their roles, and interactions that we label the "consensus narrative
framework". We represent this framework in the form of an actant-relationship
story graph. Extracting this graph is a challenging computational problem,
which we pose as a latent graphical model estimation problem. Posts and reviews
are viewed as samples of subgraphs/networks of the hidden narrative framework.
Inspired by the qualitative narrative theory of Greimas, we formulate a
graphical generative Machine Learning (ML) model where nodes represent actants,
and multi-edges and self-loops among nodes capture context-specific
relationships. We develop a pipeline of interlocking automated methods to
extract key actants and their relationships, and apply it to thousands of
reviews and comments posted on Goodreads.com. We manually derive the ground
truth narrative framework from SparkNotes, and then use word embedding tools to
compare relationships in ground truth networks with our extracted networks. We
find that our automated methodology generates highly accurate consensus
narrative frameworks: for our four target novels, with approximately 2900
reviews per novel, we report average coverage/recall of important relationships
of >80% and an average edge detection rate of >89%. These extracted narrative
frameworks can generate insight into how people (or classes of people) read and
how they recount what they have read to others.
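The aggregation idea in this abstract, where each review contributes a small subgraph and the union approximates the consensus narrative framework, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the triples are invented toy data, the helper names `aggregate` and `recall` are hypothetical, and edges are compared by exact match, whereas the abstract says the authors compare relationships using word embedding tools.

```python
from collections import Counter

# Hypothetical per-review extractions (invented toy data, not from the paper):
# each review yields a few (actant, relationship, actant) triples,
# i.e. one sampled subgraph of the hidden narrative framework.
reviews = [
    {("Bilbo", "finds", "ring"), ("Gandalf", "recruits", "Bilbo")},
    {("Bilbo", "finds", "ring"), ("Smaug", "guards", "treasure")},
    {("Bilbo", "finds", "ring"), ("Bilbo", "tricks", "Smaug")},
]

def aggregate(review_graphs, min_support=1):
    """Count how many reviews mention each edge; keep edges with enough support."""
    counts = Counter(edge for graph in review_graphs for edge in graph)
    return {edge: n for edge, n in counts.items() if n >= min_support}

def recall(extracted_edges, ground_truth_edges):
    """Fraction of ground-truth edges found among the extracted edges.

    Assumption: exact-match comparison stands in for the embedding-based
    relationship matching described in the abstract.
    """
    if not ground_truth_edges:
        return 0.0
    hits = sum(1 for edge in ground_truth_edges if edge in extracted_edges)
    return hits / len(ground_truth_edges)

# Hypothetical ground-truth framework (also invented for illustration).
ground_truth = {
    ("Bilbo", "finds", "ring"),
    ("Gandalf", "recruits", "Bilbo"),
    ("Smaug", "guards", "treasure"),
    ("Thorin", "leads", "dwarves"),
}

consensus = aggregate(reviews)
print(recall(consensus, ground_truth))  # 3 of 4 ground-truth edges -> 0.75
```

No single toy review covers more than two ground-truth edges, yet the aggregate recovers three of four, which is the point the abstract makes about reviews in aggregate.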
Conspiracy in the Time of Corona: Automatic detection of Emerging Covid-19 Conspiracy Theories in Social Media and the News
Abstract
Rumors and conspiracy theories thrive in environments of low confidence and low trust. Consequently, it is not surprising that ones related to the Covid-19 pandemic are proliferating given the lack of scientific consensus on the virus’s spread and containment, or on the long term social and economic ramifications of the pandemic. Among the stories currently circulating are ones suggesting that the 5G telecommunication network activates the virus, that the pandemic is a hoax perpetrated by a global cabal, that the virus is a bio-weapon released deliberately by the Chinese, or that Bill Gates is using it as cover to launch a broad vaccination program to facilitate a global surveillance regime. While some may be quick to dismiss these stories as having little impact on real-world behavior, recent events including the destruction of cell phone towers, racially fueled attacks against Asian Americans, demonstrations espousing resistance to public health orders, and wide-scale defiance of scientifically sound public mandates such as those to wear masks and practice social distancing, countermand such conclusions. Inspired by narrative theory, we crawl social media sites and news reports and, through the application of automated machine-learning methods, discover the underlying narrative frameworks supporting the generation of rumors and conspiracy theories. We show how the various narrative frameworks fueling these stories rely on the alignment of otherwise disparate domains of knowledge, and consider how they attach to the broader reporting on the pandemic. These alignments and attachments, which can be monitored in near real-time, may be useful for identifying areas in the news that are particularly vulnerable to reinterpretation by conspiracy theorists. Understanding the dynamics of storytelling on social media and the narrative frameworks that provide the generative basis for these stories may also be helpful for devising methods to disrupt their spread.
PROCEEDINGS OF THE ANNUAL LINGUISTICS SEMINAR OF UNIVERSITAS PENDIDIKAN INDONESIA (SETALI 2018), INTERNATIONAL LEVEL: Language in the Digital Era: Opportunities or Threats?
The Annual Linguistics Seminar (Seminar Tahunan Linguistik), commonly known as SETALI, is an annual seminar organized by the Linguistics Study Program of the School of Postgraduate Studies, Universitas Pendidikan Indonesia (SPs UPI), in cooperation with the UPI chapter of the professional organization Masyarakat Linguistik Indonesia (MLI). In 2018 the seminar was again held on 5-6 May under the theme “Language in the Digital Era: Opportunities or Threats?”. The theme arises from the distinctive phenomena of language in the digital era, in which language plays an important role. About 200 selected papers are included for presentation at Setali 2018. The papers collected in these proceedings were selected through a lengthy process and careful consideration. Language and digitalization are closely linked and inseparable. Language use in digital spaces, across various media, gives rise to many variants. Language used in digital-era communication sometimes conforms to well-formed usage, but not infrequently it appears in deviant (ill-formed) shapes. The many deviations in language use in digital spaces can potentially produce negative effects that may influence the language attitudes of Indonesian speakers in general. Accordingly, the public is expected to respond carefully to the various phenomena of language use, which are difficult to contain. Although there are many threats to the existence of language in this era, it is undeniable that there are also many opportunities that the language-using community can take up as positive and beneficial. To date, various polemics have arisen in linguistics over language problems spreading in the digital world. Language practitioners are expected to study extensively the practice and role of language in this digital era.
The theme “Language in the Digital Era: Opportunities or Threats?” is expected to give all elements of society room to participate and take part in assessing and examining the position of language from diverse points of view, so that it can give rise to a diversity of perspectives in Indonesian linguistics. In closing, asking for the guidance and blessing of Allah Swt., I hope that Setali 2018 will run in an orderly and smooth manner. I also hope that academic documentation such as this can make a real contribution to the development of linguistics in Indonesia. On this occasion, I would like to thank all the parties who helped Setali 2018 run well. Enjoy the seminar.