12 research outputs found
QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns
Given the extremely large pool of events and stories available, media outlets
need to focus on a subset of issues and aspects to convey to their audience.
Outlets are often accused of exhibiting a systematic bias in this selection
process, with different outlets portraying different versions of reality.
However, in the absence of objective measures and empirical evidence, the
direction and extent of systematicity remain widely disputed.
In this paper we propose a framework based on quoting patterns for
quantifying and characterizing the degree to which media outlets exhibit
systematic bias. We apply this framework to a massive dataset of news articles
spanning the six years of Obama's presidency and all of his speeches, and
reveal that a systematic pattern does indeed emerge from the outlets' quoting
behavior. Moreover, we show that this pattern can be successfully exploited in
an unsupervised prediction setting, to determine which new quotes an outlet
will select to broadcast. By encoding bias patterns in a low-rank space we
provide an analysis of the structure of political media coverage. This reveals
a latent media bias space that aligns surprisingly well with political ideology
and outlet type. A linguistic analysis exposes striking differences across
these latent dimensions, showing how the different types of media outlets
portray different realities even when reporting on the same events. For
example, outlets mapped to the mainstream conservative side of the latent space
focus on quotes that portray a presidential persona disproportionately
characterized by negativity.
Comment: To appear in the Proceedings of WWW 2015. 11pp, 10 fig. Interactive
visualization, data, and other info available at
http://snap.stanford.edu/quotus
Two Computational Models for Analyzing Political Attention in Social Media
Understanding how political attention is divided and over what subjects is crucial for research on areas such as agenda setting, framing, and political rhetoric. However, existing methods for measuring attention, such as manual labeling according to established codebooks, are expensive and restrictive. We describe two computational models that automatically distinguish topics in politicians’ social media content. Our models - one supervised classifier and one unsupervised topic model - provide different benefits. The supervised classifier reduces the labor required to classify content according to pre-determined topic lists. However, tweets do more than communicate policy positions. Our unsupervised model uncovers both political topics and other Twitter uses (e.g., constituent service). Together, these models are effective, inexpensive computational tools for political communication and social media research. We demonstrate their utility and discuss the different analyses they afford by applying both models to the tweets posted by members of the 115th U.S. Congress.
This material is based upon work supported by the National Science Foundation under Grant No. 1822228.
https://deepblue.lib.umich.edu/bitstream/2027.42/147460/6/Hemphill and Schopke - Two Compuational Models.pdf (Revised article)
https://deepblue.lib.umich.edu/bitstream/2027.42/147460/1/Hemphill and Schopke - Two Computational Models.pdf (Main article)
https://deepblue.lib.umich.edu/bitstream/2027.42/147460/8/ICWSM 2020 Two Computational Models.pptx (Presentation with script)
All Things Considered: Detecting Partisan Events from News Media with Cross-Article Comparison
Public opinion is shaped by the information news media provide, and that
information in turn may be shaped by the ideological preferences of media
outlets. But while much attention has been devoted to media bias via overt
ideological language or topic selection, a more unobtrusive way in which the
media shape opinion is via the strategic inclusion or omission of partisan
events that may support one side or the other. We develop a latent
variable-based framework to predict the ideology of news articles by comparing
multiple articles on the same story and identifying partisan events whose
inclusion or omission reveals ideology. Our experiments first validate the
existence of partisan event selection, and then show that article alignment and
cross-document comparison detect partisan events and article ideology better
than competitive baselines. Our results reveal the high-level form of media
bias, which is present even among mainstream media with strong norms of
objectivity and nonpartisanship. Our codebase and dataset are available at
https://github.com/launchnlp/ATC.
Comment: EMNLP'23 Main Conference
Getting the agenda right: measuring media agenda using topic models
Agenda setting is the theory of how issue salience is transferred from the media to the media audience. An agenda-setting study requires one to define a set of issues and to measure their salience. We propose a semisupervised approach based on topic modeling for exploring a news corpus and measuring the media agenda by tagging news articles with issues. The approach relies on an off-the-shelf Latent Dirichlet Allocation topic model, manual labeling of topics, and topic model customization. In a preliminary evaluation, the tagger achieves a micro F1-score of 0.85 and outperforms the supervised baselines, suggesting that it could be successfully used for agenda-setting studies.
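For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a micro-averaged F1-score can be computed for issue tagging. It is not the authors' evaluation code: the function name `micro_f1`, the representation of tags as Python sets, and the toy labels are all assumptions made for illustration.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over a list of articles.

    gold, predicted: lists of sets of issue tags, one set per article
    (a hypothetical representation, not taken from the paper).
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # tags that are both predicted and correct
        fp += len(p - g)   # predicted tags that are not in the gold set
        fn += len(g - p)   # gold tags that the tagger missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for this example.
gold = [{"economy"}, {"health", "economy"}, {"sports"}]
pred = [{"economy"}, {"health"}, {"politics"}]
score = micro_f1(gold, pred)
print(round(score, 4))
```

Micro-averaging pools true positives, false positives, and false negatives across all articles before computing precision and recall, so frequent issues weigh more heavily than rare ones.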
Debating Debate: Measuring Discursive Overlap on the Congressional Floor
How elites communicate with each other is understudied, largely because we lack a viable, large-scale measure of discursive overlap. Discursive overlap is the extent to which parties and partisans talk to and past each other. In this paper, I introduce a repurposed measure - cosine similarity scores - and a method of measurement that concisely quantifies discursive overlap. I compare this measure to two others - overlap coefficients and Wordfish scores (Slapin and Proksch 2008). To compare the scores, I first examine their distributions and then compare how well each performs in a series of tests, including how well each reflects reality and how well each responds to different aspects of communication that increase or decrease discursive overlap. Throughout the paper, I use the 2008 Farm Bill as a running case. I conclude that cosine similarity scores do indeed capture discursive overlap and show that they are the best measure among the three considered.
Master of Arts
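As an illustration of the measure named in the abstract above, here is a minimal sketch of cosine similarity between two texts represented as word-count vectors. The abstract does not state the thesis's preprocessing or weighting, so the whitespace tokenization, lowercasing, raw counts, the function name `cosine_similarity`, and the example sentences are all assumptions.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two bag-of-words count vectors."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Only words present in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text overlaps with nothing
    return dot / (norm_a * norm_b)


# Invented example speeches, two shared words out of four each.
speech_one = "farm subsidies help farmers"
speech_two = "farm subsidies hurt taxpayers"
print(cosine_similarity(speech_one, speech_two))
```

With count vectors the score ranges from 0 (no shared vocabulary) to 1 (identical word distributions), which is what makes it a candidate for quantifying how much two speakers talk about the same things.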
Design and Empirical Evaluation of Interactive and Interpretable Machine Learning
Machine learning is ubiquitous in making predictions that affect people's decisions. While most of the research in machine learning focuses on improving the performance of the models on held-out data sets, this is not enough to convince end-users that these models are trustworthy or reliable in the wild. To address this problem, a new line of research has emerged that focuses on developing interpretable machine learning methods and helping end-users make informed decisions. Despite the growing body of research in developing interpretable models, there is still no consensus on the definition and quantification of interpretability. We argue that to understand interpretability, we need to bring humans into the loop and run human-subject experiments to understand the effect of interpretability on human behavior. This thesis approaches the problem of interpretability from an interdisciplinary perspective which builds on decades of research in psychology, cognitive science, and social science to understand human behavior and trust. Through controlled user experiments, we manipulate various design factors in supervised models that are commonly thought to make models more or less interpretable and measure their influence on user behavior, performance, and trust. Additionally, we develop interpretable and interactive machine-learning-based systems that exploit unsupervised machine learning models to bring humans into the loop and help them complete real-world tasks. By bringing humans and machines together, we can empower humans to understand and organize large document collections better and faster. Our findings and insights from these experiments can guide the development of next-generation machine learning models that can be used effectively and trusted by humans.