361 research outputs found
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-179.
Understanding misinformation on Twitter in the context of controversial issues
Social media is slowly supplementing, or even replacing, traditional media outlets such as television, newspapers, and radio. However, social media presents some drawbacks when it comes to circulating information, including the spread of false information, rumors, and fake news. At least three main factors create these drawbacks: the filter-bubble effect, misinformation, and information overload. These factors make gathering accurate and credible information online very challenging, which in turn may affect public trust in online information. These issues become even more challenging when the topic under discussion is controversial. In this thesis, four main controversial topics are studied, each from a different domain. This variation across domains gives a broad view of how misinformation is manifested in social media, and how it manifests differently in different domains.
This thesis aims to understand misinformation in the context of controversial-issue discussions, both by examining how misinformation is manifested in social media and by examining people's opinions on these controversial issues. Three different aspects of a tweet are studied: 1) the user sharing the information, 2) the information source shared, and 3) whether specific linguistic cues can help in assessing the credibility of information on social media. Finally, the web application tool TweetChecker is used to allow online users to gain a more in-depth understanding of the discussions about five different controversial health issues. The results and recommendations of this study can be used to build solutions for the problem of trustworthiness of user-generated content on different social media platforms, especially for controversial issues.
Detecting Stance on Covid-19 Vaccine in a Polarized Media
The growing polarization in the United States has been widely reported. Media coverage plays an important role in shaping public opinion and influences public debates on complex and unfamiliar topics. Political polarization and conflict between opposing viewpoints can bring some benefits to individuals and society; however, recent research has primarily highlighted the negative consequences of polarization, which has reached an all-time high. One such topic is the Covid-19 vaccine, which was developed in record time; the public learned about its safety and possible risks through media coverage.
In this capstone, we examine U.S. news media coverage of the Covid-19 vaccine as an illustration of a debate in a polarized environment, focusing on the stance the media take on vaccine safety. Specifically, we analyze opinion-framing in the Covid-19 vaccine debate as a way of attributing a statement or belief to someone else. We focus on self-affirming and opponent-doubting discourse and analyze whether left-leaning and right-leaning media engage in each. For example, a health expert would say that “The leading researchers agree that Covid-19 vaccines are safe and effective,” while a vaccine skeptic would say that “Mistaken researchers claim that Covid-19 vaccines are safe and effective.”
We introduce VacStance, a dataset of 2,000 stance-labeled Covid-19 vaccine sentences extracted from 169,432 sentences drawn from 15,750 news articles covering left-leaning and right-leaning media outlets. We run a trained BERT classifier to analyze aspects of argumentation: how the different sides of the vaccine debate represent their own and each other’s opinions. To the best of our knowledge, VacStance is the first dataset of media Covid-19 vaccine stances. Our dataset and model are made available on GitHub for future projects on Covid-19 vaccine opinion-framing and stance detection.
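The abstract's distinction between self-affirming framing (“researchers agree that…”) and opponent-doubting framing (“researchers claim that…”) can be illustrated with a toy cue-verb baseline. This is only a sketch: the actual system is a trained BERT classifier, and the cue-verb lists and the function `framing_label` below are hypothetical choices for illustration, not the paper's method.

```python
# Hypothetical keyword baseline for opinion-framing detection.
# The paper trains a BERT classifier; this sketch only illustrates
# the self-affirming vs. opponent-doubting distinction via cue verbs.
AFFIRMING_CUES = {"agree", "confirm", "show", "demonstrate"}
DOUBTING_CUES = {"claim", "allege", "insist", "purport"}

def framing_label(sentence: str) -> str:
    """Label a sentence by the attribution verb it uses."""
    tokens = {t.strip('.,"').lower() for t in sentence.split()}
    if tokens & DOUBTING_CUES:
        return "opponent-doubting"
    if tokens & AFFIRMING_CUES:
        return "self-affirming"
    return "neutral"

# The two example sentences from the abstract:
print(framing_label("The leading researchers agree that Covid-19 "
                    "vaccines are safe and effective."))   # self-affirming
print(framing_label("Mistaken researchers claim that Covid-19 "
                    "vaccines are safe and effective."))   # opponent-doubting
```

A lexical baseline like this misses context (e.g., negation or quotation), which is precisely why a contextual model such as BERT is used in the actual work.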
Viewpoint Diversity in Search Results
Adverse phenomena such as the search engine manipulation effect (SEME), where web search users change their attitude on a topic following whatever the most highly ranked search results promote, represent crucial challenges for research and industry. However, the current lack of automatic methods to comprehensively measure or increase viewpoint diversity in search results complicates the understanding and mitigation of such effects. This paper proposes a viewpoint bias metric that evaluates the divergence from a pre-defined scenario of ideal viewpoint diversity, considering two essential viewpoint dimensions (i.e., stance and logic of evaluation). In a case study, we apply this metric to actual search results and find considerable viewpoint bias across queries, topics, and search engines that could lead to adverse effects such as SEME. We subsequently demonstrate that viewpoint diversity in search results can be dramatically increased using existing diversification algorithms. The methods proposed in this paper can assist researchers and practitioners in evaluating and improving viewpoint diversity in search results.
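The abstract describes the metric only as a divergence from a pre-defined ideal viewpoint distribution. As a rough illustration, assuming the ideal is a target probability distribution over stance labels and using KL divergence as the divergence measure (both assumptions, not the paper's exact formulation), a minimal sketch looks like this:

```python
import math
from collections import Counter

def viewpoint_bias(result_stances, ideal_dist):
    """Toy viewpoint-bias score: KL divergence between the stance
    distribution observed in a ranked result list and an 'ideal'
    target distribution (e.g., uniform over stances).

    result_stances: list of stance labels, one per search result.
    ideal_dist: dict mapping stance label -> ideal probability (> 0).
    """
    counts = Counter(result_stances)
    n = len(result_stances)
    bias = 0.0
    for stance, p_ideal in ideal_dist.items():
        p_obs = counts.get(stance, 0) / n
        if p_obs > 0:  # 0 * log(0/p) contributes nothing
            bias += p_obs * math.log(p_obs / p_ideal)
    return bias

# A result list dominated by one stance diverges more from a
# uniform ideal than a balanced list does.
ideal = {"against": 1 / 3, "neutral": 1 / 3, "favor": 1 / 3}
balanced = ["against", "neutral", "favor"] * 3   # bias ~ 0
skewed = ["favor"] * 8 + ["against"]             # bias > 0
assert viewpoint_bias(balanced, ideal) < viewpoint_bias(skewed, ideal)
```

The score is zero when the observed stance distribution matches the ideal exactly and grows as the result list skews toward one viewpoint; the paper's actual metric additionally covers a second dimension (logic of evaluation), which this sketch omits.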
Assessing enactment of content regulation policies: A post hoc crowd-sourced audit of election misinformation on YouTube
With the 2022 US midterm elections approaching, conspiratorial claims about
the 2020 presidential elections continue to threaten users' trust in the
electoral process. To regulate election misinformation, YouTube introduced
policies to remove such content from its searches and recommendations. In this
paper, we conduct a 9-day crowd-sourced audit on YouTube to assess the extent
of enactment of such policies. We recruited 99 users who installed a browser
extension that enabled us to collect up-next recommendation trails and search
results for 45 videos and 88 search queries about the 2020 elections. We find
that YouTube's search results, irrespective of search query bias, contain more
videos that oppose rather than support election misinformation. However,
watching misinformative election videos still leads users to a small number of
misinformative videos in the up-next trails. Our results imply that while
YouTube largely seems successful in regulating election misinformation, there
is still room for improvement.
No NLP task should be an island: multi-disciplinarity for diversity in news recommender systems
NWO 406.D1.19.073: Algorithms and the Foundations of Software Technology