
    Understanding misinformation on Twitter in the context of controversial issues

    Social media is slowly supplementing, or even replacing, traditional media outlets such as television, newspapers, and radio. However, social media has drawbacks when it comes to circulating information, including the spread of false information, rumors, and fake news. At least three main factors create these drawbacks: the filter bubble effect, misinformation, and information overload. These factors make gathering accurate and credible information online very challenging, which in turn may erode public trust in online information. The challenge is even greater when the topic under discussion is controversial. In this thesis, four main controversial topics are studied, each from a different domain. This variety of domains gives a broad view of how misinformation is manifested in social media and how it differs across domains. The thesis aims to understand misinformation in the context of controversial-issue discussions, both by examining how misinformation is manifested in social media and by understanding people’s opinions towards these controversial issues. Three aspects of a tweet are studied: 1) the user sharing the information, 2) the information source shared, and 3) whether specific linguistic cues can help in assessing the credibility of information on social media. Finally, the web application TweetChecker is used to allow online users to gain a more in-depth understanding of the discussions about five controversial health issues. The results and recommendations of this study can be used to build solutions to the problem of trustworthiness of user-generated content on different social media platforms, especially for controversial issues.
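    To make the three-aspect framing above concrete, the following Python sketch assembles a toy feature vector from the user, the shared source, and simple linguistic cues. The field names, the domain whitelist, and the cue list are hypothetical illustrations, not the thesis's actual features or the TweetChecker implementation.

```python
# Toy sketch: combining user-, source-, and language-level signals for one tweet.
# The field names, domain whitelist, and cue list below are hypothetical.
import re

TRUSTED_DOMAINS = {"who.int", "cdc.gov", "nature.com"}       # assumed whitelist
HEDGE_CUES = {"allegedly", "reportedly", "rumor", "claims"}  # assumed linguistic cues

def tweet_features(tweet: dict) -> dict:
    """Build a small credibility-related feature vector from a tweet record."""
    words = re.findall(r"[a-z']+", tweet["text"].lower())
    return {
        # 1) the user sharing the information
        "user_is_verified": int(tweet["user"]["verified"]),
        "user_followers": tweet["user"]["followers"],
        # 2) the information source shared (linked domain, if any)
        "source_is_trusted": int(tweet.get("source_domain") in TRUSTED_DOMAINS),
        # 3) linguistic cues that may signal unverified content
        "hedge_cue_count": sum(word in HEDGE_CUES for word in words),
    }

if __name__ == "__main__":
    example = {
        "text": "Reportedly, this treatment cures everything!",
        "user": {"verified": False, "followers": 120},
        "source_domain": "example-blog.com",
    }
    print(tweet_features(example))
```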

    A Clustering Algorithm for Early Prediction of Controversial Reddit Posts

    Social curation platforms like Reddit are rich with user interactions such as comments, upvotes, and downvotes. Predicting these interactions before they happen is an interesting computational challenge and can be used for a variety of tasks, ranging from content moderation to personality prediction. Given the vast amount of information posted on these sites, it's important to develop models that can simplify this prediction task. In this paper, we present a simple clustering algorithm that helps predict the controversiality of a Reddit post using the user's profile information, their past contributions on Reddit, and the sentiment expressed in their post. On average, introducing the cluster to the prediction task improved the accuracy of the prediction by over 20 percent, with F1 scores of 0.95 (micro) and 0.7 (macro). The classifier performs better than a majority predictor. The results also show that the overwhelming majority of users are inactive and, when they do post, they post non-controversial content.
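    The general idea described above (clustering users on profile and history features, then feeding the resulting cluster id to a controversiality classifier alongside post sentiment) can be sketched as follows. The synthetic data, feature names, number of clusters, and choice of random forest are assumptions for illustration, not the paper's pipeline.

```python
# Sketch of the general approach (not the paper's exact pipeline): cluster users,
# then use the cluster id as an extra feature next to the post's sentiment.
# The synthetic data and model choices below are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Hypothetical user features: account age (days), number of past posts,
# and the fraction of past posts that were controversial.
users = np.column_stack([
    rng.exponential(400, n),
    rng.poisson(30, n),
    rng.beta(1, 9, n),
])
sentiment = rng.uniform(-1, 1, n)  # sentiment score of the new post

# Synthetic target: controversial posts come from users with a controversial
# history and from posts with strong sentiment.
y = (users[:, 2] + 0.2 * np.abs(sentiment) + rng.normal(0, 0.1, n) > 0.35).astype(int)

clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(users)
X = np.column_stack([users, sentiment, clusters])  # cluster id appended as a feature

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
pred = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)
print("micro F1:", f1_score(y_te, pred, average="micro"))
print("macro F1:", f1_score(y_te, pred, average="macro"))
```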

    Capturing stance dynamics in social media: open challenges and research directions

    Social media platforms provide a goldmine for mining public opinion on issues of wide societal interest and impact. Opinion mining can be operationalised by capturing and aggregating the stance of individual social media posts as supporting, opposing or being neutral towards the issue at hand. While most prior work in stance detection has investigated datasets that cover short periods of time, interest in longitudinal datasets has recently increased. Evolving dynamics in the linguistic and behavioural patterns observed in new data require adapting stance detection systems to deal with these changes. In this survey paper, we investigate the intersection between computational linguistics and the temporal evolution of human communication in digital media. We perform a critical review of emerging research on these dynamics, exploring different semantic and pragmatic factors that impact linguistic data in general, and stance in particular. We then discuss current directions in capturing stance dynamics in social media, identify open challenges, and outline future directions in three key dimensions: utterance, context and influence.
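    One simple way to observe the stance dynamics discussed above is to train a stance classifier on an early time window and evaluate it on a later one, where vocabulary and framing have shifted. The toy posts, labels, and TF-IDF plus logistic regression pipeline below are illustrative assumptions, not drawn from the survey.

```python
# Illustrative only: train a stance classifier on an early time slice and test on
# a later one to expose drift. The toy posts and the TF-IDF + logistic regression
# pipeline are assumptions, not taken from the survey.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

posts = pd.DataFrame({
    "text": ["masks protect everyone", "masks are useless", "I support mandates",
             "mandates violate rights", "boosters work well", "boosters are a scam"],
    "stance": ["favor", "against", "favor", "against", "favor", "against"],
    "year": [2020, 2020, 2021, 2021, 2022, 2022],
})

train = posts[posts.year <= 2021]  # early window used for training
test = posts[posts.year > 2021]    # later window: vocabulary and framing have shifted

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train.text, train.stance)
print("accuracy on later window:", accuracy_score(test.stance, model.predict(test.text)))
```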

    Adult readers evaluating the credibility of social media posts: Prior belief consistency and source's expertise matter most

    The present study investigates the role of source characteristics, the quality of evidence, and readers' prior beliefs about the topic in adult readers' credibility evaluations of short health-related social media posts. The researchers designed content for posts on five health topics by manipulating the source characteristics (source's expertise, gender, and ethnicity), the accuracy of the claims, and the quality of evidence (research evidence, testimony, consensus, and personal experience). Accurate and inaccurate social media posts varying in the other manipulated aspects were then programmatically generated. Crowdworkers (N = 844) recruited from two platforms were asked to evaluate the credibility of up to ten social media posts, resulting in 8380 evaluations. Before the credibility evaluation, participants' prior beliefs about the topics of the posts were assessed. After controlling for the topic of the post and the crowdworking platform, prior belief consistency and the source's expertise affected the perceived credibility of the accurate and inaccurate social media posts the most. In contrast, the quality of evidence supporting the health claim mattered relatively little, and the source's gender and ethnicity had no effect. The results are discussed in terms of first- and second-hand evaluation strategies.
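    The programmatic generation step can be illustrated by fully crossing a subset of the manipulated factors into post variants, roughly as in the sketch below. The factor levels shown, the hypothetical claim texts, and the post template are illustrative assumptions, not the study's actual materials (which also varied the source's gender and ethnicity).

```python
# Minimal sketch of the programmatic generation step: fully crossing a subset of
# the manipulated factors into post variants. Factor levels, claim texts, and the
# post template are hypothetical, not the study's actual materials.
from itertools import product

EXPERTISE = ["doctor", "layperson"]  # assumed source expertise levels
ACCURACY = ["accurate", "inaccurate"]  # claim accuracy conditions
EVIDENCE = ["research evidence", "testimony", "consensus", "personal experience"]

# Hypothetical accurate/inaccurate versions of one health claim
CLAIMS = {
    "accurate": "Regular exercise lowers blood pressure.",
    "inaccurate": "Regular exercise raises blood pressure.",
}

posts = [
    {
        "source_expertise": expertise,
        "claim_accuracy": accuracy,
        "evidence_type": evidence,
        "text": f"As a {expertise}: {CLAIMS[accuracy]} ({evidence})",
    }
    for expertise, accuracy, evidence in product(EXPERTISE, ACCURACY, EVIDENCE)
]

print(len(posts), "generated variants")  # 2 x 2 x 4 = 16 per topic
print(posts[0])
```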

    Stance classification of Twitter debates: The encryption debate as a use case

    Social media have enabled a revolution in user-generated content. They allow users to connect, build community, produce and share content, and publish opinions. To better understand online users’ attitudes and opinions, we use stance classification, a relatively new and challenging approach that deepens opinion mining by classifying a user's stance in a debate. Our use case is tweets related to the spring 2016 debate over the FBI’s request that Apple decrypt a user’s iPhone. In this “encryption debate,” public opinion was polarized between advocates for individual privacy and advocates for national security. We propose a machine learning approach to classifying stance in the debate, together with a topic classification, using lexical, syntactic, Twitter-specific, and argumentative features as predictors. Models trained on these feature sets showed significant increases in accuracy relative to the unigram baseline.
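    The comparison described above, a unigram baseline versus a model augmented with additional features, might be set up along the lines of the following sketch. The tiny example tweets and the two Twitter-specific counts (hashtags and mentions) are assumptions for illustration; the paper's actual lexical, syntactic, and argumentative feature sets are not reproduced here.

```python
# Sketch of the baseline comparison: unigram features alone versus unigrams plus
# simple Twitter-specific counts. The example tweets and features are assumptions.
import re

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

tweets = ["#privacy matters, Apple is right", "FBI needs access for #security",
          "encryption protects us all", "national security comes first @FBI"]
stance = ["privacy", "security", "privacy", "security"]

def twitter_counts(texts):
    """Count two Twitter-specific cues per tweet: hashtags and @-mentions."""
    return np.array([[len(re.findall(r"#\w+", t)), len(re.findall(r"@\w+", t))]
                     for t in texts])

unigrams = CountVectorizer().fit_transform(tweets).toarray()  # unigram baseline
augmented = np.hstack([unigrams, twitter_counts(tweets)])     # + Twitter-specific features

baseline = LogisticRegression().fit(unigrams, stance)
richer = LogisticRegression().fit(augmented, stance)
print("baseline train accuracy: ", baseline.score(unigrams, stance))
print("augmented train accuracy:", richer.score(augmented, stance))
```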

    Sketching the vision of the Web of Debates

    The exchange of comments, opinions, and arguments in blogs, forums, social media, wikis, and review websites has transformed the Web into a modern agora, a virtual place where all types of debates take place. This wealth of information remains mostly unexploited: because it is textual, such information is difficult to process and analyse automatically in order to validate, evaluate, compare, combine with other types of information, and make actionable. Recent research in Machine Learning, Natural Language Processing, and Computational Argumentation has provided some solutions, but these still cannot fully capture important aspects of online debates, such as various forms of unsound reasoning, arguments that do not follow a standard structure, information that is not explicitly expressed, and non-logical argumentation methods. Tackling these challenges would add immense value, as it would allow well-intentioned users to search for, navigate through, and analyse online opinions and arguments, obtaining a better picture of the various debates. Ultimately, it may lead to increased participation of Web users in democratic, dialogical interchange of arguments, more informed decisions by professionals and decision-makers, and easier identification of biased, misleading, or deceptive arguments. This paper presents the vision of the Web of Debates, a more human-centered version of the Web, which aims to unlock the potential of the abundance of argumentative information that currently exists online, offering its users a new generation of argument-based web services and tools tailored to their real needs.

    Credibility analysis of textual claims with explainable evidence

    Despite being a vast resource of valuable information, the Web has been polluted by the spread of false claims. Increasing hoaxes, fake news, and misleading information on the Web have given rise to many fact-checking websites that manually assess doubtful claims. However, the rapid speed and large scale of misinformation spread have made manual verification a bottleneck, calling for credibility assessment tools that can automate the verification process. Prior works in this domain make strong assumptions about the structure of the claims and the communities where they are made. Most importantly, the black-box techniques proposed in prior works cannot explain why a certain statement is deemed credible or not. To address these limitations, this dissertation proposes a general framework for automated credibility assessment that makes no assumption about the structure or origin of the claims. Specifically, we propose a feature-based model, which automatically retrieves relevant articles about a given claim and assesses its credibility by capturing the mutual interaction between the language style of the relevant articles, their stance towards the claim, and the trustworthiness of the underlying web sources. We further enhance our credibility assessment approach with a neural-network-based model, which, unlike the feature-based model, does not rely on feature engineering and external lexicons. Both models make their assessments interpretable by extracting explainable evidence from judiciously selected web sources. We utilize our models to develop a Web interface, CredEye, which enables users to automatically assess the credibility of a textual claim and to dissect the assessment by browsing through automatically selected evidence snippets. In addition, we study the problem of stance classification and propose a neural-network-based model for predicting the stance of diverse user perspectives regarding controversial claims: given a controversial claim and a user comment, the model predicts whether the comment supports or opposes the claim.
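    A rough sketch of the feature-based idea is given below: for each retrieved article, combine a stance score towards the claim, a language-style score, and a prior on the source's trustworthiness, then aggregate into a claim-level credibility score. The scoring scheme, weighting, and example values are assumptions for illustration, not the dissertation's model or the CredEye implementation.

```python
# Rough sketch: aggregate per-article stance, language style, and source trust
# into a claim-level credibility score. Scores, weights, and examples are
# hypothetical, not the dissertation's model.
from dataclasses import dataclass

@dataclass
class Article:
    stance: float        # -1 (refutes the claim) .. +1 (supports the claim)
    objectivity: float   # 0 .. 1 language-style score (assumed from a style classifier)
    source_trust: float  # 0 .. 1 prior trustworthiness of the web source

def claim_credibility(articles: list[Article]) -> float:
    """Trust- and style-weighted average of article stances, rescaled to [0, 1]."""
    weights = [a.objectivity * a.source_trust for a in articles]
    if not articles or sum(weights) == 0:
        return 0.5  # no usable evidence: stay undecided
    support = sum(w * a.stance for w, a in zip(weights, articles)) / sum(weights)
    return (support + 1) / 2  # map from [-1, 1] to [0, 1]

evidence = [Article(stance=0.9, objectivity=0.8, source_trust=0.9),
            Article(stance=-0.6, objectivity=0.4, source_trust=0.3)]
print(f"credibility: {claim_credibility(evidence):.2f}")
```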

    Annotating Student Talk in Text-based Classroom Discussions

    Classroom discussions in English Language Arts have a positive effect on students' reading, writing, and reasoning skills. Although prior work has largely focused on teacher talk and student-teacher interactions, we focus on three theoretically motivated aspects of high-quality student talk: argumentation, specificity, and knowledge domain. We introduce an annotation scheme, then show that the scheme can be used to produce reliable annotations and that the annotations are predictive of discussion quality. We also highlight the opportunities our scheme provides for research in education and natural language processing.
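    A reliability check of the kind alluded to above could be run per dimension of the scheme, for example by computing inter-annotator agreement with Cohen's kappa, as in the sketch below. The example labels are made up; only the three dimension names follow the abstract.

```python
# Sketch of a per-dimension reliability check using Cohen's kappa. The labels
# are made-up examples; only the three dimension names follow the abstract.
from sklearn.metrics import cohen_kappa_score

annotations = {
    "argumentation": (["claim", "evidence", "warrant", "claim"],
                      ["claim", "evidence", "claim", "claim"]),
    "specificity": (["low", "high", "high", "medium"],
                    ["low", "high", "medium", "medium"]),
    "knowledge domain": (["disciplinary", "experiential", "disciplinary", "disciplinary"],
                         ["disciplinary", "experiential", "disciplinary", "experiential"]),
}

for dimension, (annotator_a, annotator_b) in annotations.items():
    kappa = cohen_kappa_score(annotator_a, annotator_b)
    print(f"{dimension}: Cohen's kappa = {kappa:.2f}")
```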