First Steps Towards an Annotated Database of American English
This paper reports on one of the first steps in building a very large annotated database of American English. We present and discuss the results of an experiment comparing manual part-of-speech tagging with manual verification and correction of automatic stochastic tagging. The experiment shows that correcting is superior to tagging with respect to speed, consistency and accuracy.
Deducing linguistic structure from the statistics of large corpora
Within the last two years, approaches using both stochastic and symbolic techniques have proved adequate to deduce lexical ambiguity resolution rules with less than 3-4% error rate, when trained on moderat
Understanding the use of fauxtography on social media
Despite the influence that image-based communication has on online discourse, the role played by images in disinformation is still not well understood. In this paper, we present the first large-scale study of fauxtography, analyzing the use of manipulated or misleading images in news discussion on online communities. First, we develop a computational pipeline geared to detect fauxtography, and identify over 61k instances of fauxtography discussed on Twitter, 4chan, and Reddit. Then, we study how posting fauxtography affects engagement of posts on social media, finding that posts containing it receive more interactions in the form of re-shares, likes, and comments. Finally, we show that fauxtography images are often turned into memes by Web communities. Our findings show that effective mitigation against disinformation needs to take images into account, and highlight a number of challenges in dealing with image-based disinformation.