Measuring the Effect of Discourse Structure on Sentiment Analysis
The aim of this paper is twofold: measuring the effect of discourse structure when assessing the overall opinion of a document and analyzing to what extent these effects depend on the corpus genre. Using Segmented Discourse Representation Theory as our formal framework, we propose several strategies to compute the overall rating. Our results show that discourse-based strategies lead to better scores in terms of accuracy and Pearson’s correlation than state-of-the-art approaches.
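The two evaluation measures named in this abstract, accuracy and Pearson's correlation, can be sketched as follows. This is a minimal illustration only: the function names and the five example ratings are assumptions made up for the sketch, not data or code from the paper.

```python
import math


def accuracy(predicted, gold):
    """Fraction of documents whose predicted rating equals the gold rating."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def pearson(xs, ys):
    """Pearson's correlation coefficient between two rating sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical document-level ratings on a 1-5 scale (invented for illustration).
gold = [1, 2, 3, 4, 5]
predicted = [1, 2, 3, 5, 4]

print(accuracy(predicted, gold))  # 0.6
print(pearson(predicted, gold))   # 0.9
```

The example shows why both measures are reported: accuracy counts only exact matches, while Pearson's correlation still rewards predictions that are close to the gold rating.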
Parsing Argumentation Structures in Persuasive Essays
In this article, we present a novel approach for parsing argumentation
structures. We identify argument components using sequence labeling at the
token level and apply a new joint model for detecting argumentation structures.
The proposed model globally optimizes argument component types and
argumentative relations using integer linear programming. We show that our
model considerably improves the performance of base classifiers and
significantly outperforms challenging heuristic baselines. Moreover, we
introduce a novel corpus of persuasive essays annotated with argumentation
structures. We show that our annotation scheme and annotation guidelines
successfully guide human annotators to substantial agreement. This corpus and
the annotation guidelines are freely available to ensure reproducibility and
to encourage future research in computational argumentation.
Comment: Under review in Computational Linguistics. First submission: 26
October 2015. Revised submission: 15 July 201
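The idea of jointly choosing component types and relations, rather than trusting each base classifier on its own, can be sketched on a toy instance. The sketch below is not the paper's model: it replaces integer linear programming with exhaustive enumeration, and the scores, the constraints (at least one claim, exactly one outgoing relation per premise, no cycles), and all names are assumptions invented for illustration.

```python
from itertools import product


def has_cycle(parent):
    """Return True if following the premise -> target links ever loops."""
    for start in parent:
        seen = set()
        node = start
        while node in parent:
            if node in seen:
                return True
            seen.add(node)
            node = parent[node]
    return False


def best_structure(type_scores, relation_scores):
    """Enumerate every type assignment and relation choice; keep the best total.

    Assumed constraints: at least one claim, each premise has exactly one
    outgoing relation, claims have none, and the relations form no cycle.
    """
    n = len(type_scores)
    best = (float("-inf"), None, None)
    for types in product(("claim", "premise"), repeat=n):
        premises = [i for i in range(n) if types[i] == "premise"]
        if len(premises) == n:
            continue  # no claim left to argue for
        target_options = [[j for j in range(n) if j != i] for i in premises]
        for targets in product(*target_options):
            parent = dict(zip(premises, targets))
            if has_cycle(parent):
                continue
            score = sum(type_scores[i][types[i]] for i in range(n))
            score += sum(relation_scores[(i, j)] for i, j in parent.items())
            if score > best[0]:
                best = (score, types, parent)
    return best


# Hypothetical base-classifier scores for three components.
type_scores = [
    {"claim": 0.9, "premise": 0.1},
    {"claim": 0.6, "premise": 0.4},  # taken alone, labelled a claim
    {"claim": 0.2, "premise": 0.8},
]
# Hypothetical scores for "component i supports component j".
relation_scores = {
    (0, 1): 0.1, (0, 2): 0.1,
    (1, 0): 0.9, (1, 2): 0.2,
    (2, 0): 0.3, (2, 1): 0.7,
}

score, types, parent = best_structure(type_scores, relation_scores)
print(types)   # ('claim', 'premise', 'premise')
print(parent)  # {1: 0, 2: 1}
```

In this toy case the strong relation score from component 1 to component 0 overrides the local type score, so component 1 is relabelled a premise. Enumeration only works for a handful of components, which is why a solver-based formulation is the practical choice at essay scale.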
QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns
Given the extremely large pool of events and stories available, media outlets
need to focus on a subset of issues and aspects to convey to their audience.
Outlets are often accused of exhibiting a systematic bias in this selection
process, with different outlets portraying different versions of reality.
However, in the absence of objective measures and empirical evidence, the
direction and extent of systematicity remains widely disputed.
In this paper we propose a framework based on quoting patterns for
quantifying and characterizing the degree to which media outlets exhibit
systematic bias. We apply this framework to a massive dataset of news articles
spanning the six years of Obama's presidency and all of his speeches, and
reveal that a systematic pattern does indeed emerge from the outlets' quoting
behavior. Moreover, we show that this pattern can be successfully exploited in
an unsupervised prediction setting, to determine which new quotes an outlet
will select to broadcast. By encoding bias patterns in a low-rank space we
provide an analysis of the structure of political media coverage. This reveals
a latent media bias space that aligns surprisingly well with political ideology
and outlet type. A linguistic analysis exposes striking differences across
these latent dimensions, showing how the different types of media outlets
portray different realities even when reporting on the same events. For
example, outlets mapped to the mainstream conservative side of the latent space
focus on quotes that portray a presidential persona disproportionately
characterized by negativity.
Comment: To appear in the Proceedings of WWW 2015. 11pp, 10 fig. Interactive
visualization, data, and other info available at
http://snap.stanford.edu/quotus
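One generic way to picture a low-rank encoding of quoting behavior is to take the leading singular vectors of an outlet-by-quote matrix. The sketch below uses plain power iteration for that purpose. The abstract does not specify the factorization used in the paper, so the method, the function name, and the tiny 0/1 matrix are all assumptions made for illustration.

```python
import math


def top_singular_triplet(matrix, iterations=200):
    """Power iteration for the leading singular value and singular vectors."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # uniform starting direction
    sigma = 0.0
    u = [0.0] * rows
    for _ in range(iterations):
        # u <- normalized A v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- A^T u; its length converges to the leading singular value
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v


# Hypothetical outlet-by-quote matrix: rows are outlets, columns are quotes,
# and 1 means the outlet reproduced the quote (values invented).
quoted = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

sigma, outlet_axis, quote_axis = top_singular_triplet(quoted)
```

Here the first two outlets quote the same material, so they receive equal weight on the leading axis while the third outlet's weight goes to zero. Further dimensions would be obtained by subtracting the rank-one term `sigma * u * v^T` and repeating.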
Computational Controversy
Climate change, vaccination, abortion, Trump: Many topics are surrounded by
fierce controversies. The nature of such heated debates and their elements have
been studied extensively in the social science literature. More recently,
various computational approaches to controversy analysis have appeared, using
new data sources such as Wikipedia, which help us now better understand these
phenomena. However, compared to what social sciences have discovered about such
debates, the existing computational approaches mostly focus on just a few of
the many important aspects around the concept of controversies. In order to
link the two strands, we provide and evaluate here a controversy model that is
both rooted in the findings of the social science literature and at the same
time strongly linked to computational methods. We show how this model can lead
to computational controversy analytics that have full coverage over all the
crucial aspects that make up a controversy.
Comment: In Proceedings of the 9th International Conference on Social
Informatics (SocInfo) 201
Crowdsourcing Argumentation Structures in Chinese Hotel Reviews
Argumentation mining aims at automatically extracting the premises-claim
discourse structures in natural language texts. There is a great demand for
argumentation corpora for customer reviews. However, due to the controversial
nature of the argumentation annotation task, there exist very few large-scale
argumentation corpora for customer reviews. In this work, we make novel use of
crowdsourcing to collect argumentation annotations in Chinese hotel
reviews. As the first Chinese argumentation dataset, our corpus includes 4814
argument component annotations and 411 argument relation annotations, and its
annotation quality is comparable to that of some widely used argumentation corpora
in other languages.
Comment: 6 pages, 3 figures. This article has been submitted to "The 2017 IEEE
International Conference on Systems, Man, and Cybernetics (SMC2017)