Document controversy classification based on the Wikipedia category structure

Author: Jankowski-Lorek Michał
Zieliński Kazimierz
Publication venue: 'AGH University of Science and Technology Press'
Publication date: 07/09/2015

Dispute and controversy are part of our culture and cannot be avoided on the Internet (where they become more anonymous). There have been many studies on controversy, especially on social networks such as Wikipedia. This free online encyclopedia has become a very popular data source among researchers studying behavior or natural language processing. This paper presents a method that uses the category structure of Wikipedia to determine how controversial a single article is. It is the first part of a proposed system that assigns a topic controversy score to any given text.

Computer Science Journal (AGH University of Science and Technology, Krakow)

Joint RNN Model for Argument Component Boundary Detection

Author: Du Yang
Gao Yang
Li Minglan
Liu Haijing
Wang Hao
Wen Hui
Publication date: 05/05/2017

Argument Component Boundary Detection (ACBD) is an important sub-task in argumentation mining; it aims at identifying the word sequences that constitute argument components, and is usually considered the first sub-task in the argumentation mining pipeline. Existing ACBD methods depend heavily on task-specific knowledge and require considerable human effort on feature engineering. To tackle these problems, in this work we formulate ACBD as a sequence labeling problem and propose a variety of Recurrent Neural Network (RNN) based methods, which do not use domain-specific or handcrafted features beyond the relative position of the sentence in the document. In particular, we propose a novel joint RNN model that can predict whether sentences are argumentative or not, and uses the predicted results to detect the argument component boundaries more precisely. We evaluate our techniques on two corpora from two different genres; results suggest that our joint RNN model obtains state-of-the-art performance on both datasets. Comment: 6 pages, 3 figures, submitted to IEEE SMC 201

arXiv.org e-Print Archive

Crossref

MythQA: Query-Based Large-Scale Check-Worthy Claim Detection through Multi-Answer Open-Domain Question Answering

Author: Bai Yang
Colas Anthony
Wang Daisy Zhe
Publication date: 21/07/2023

Check-worthy claim detection aims at providing plausible misinformation to downstream fact-checking systems or human experts to check. This is a crucial step toward accelerating the fact-checking process. Much effort has gone into identifying check-worthy claims from a small set of pre-collected claims, but how to efficiently detect check-worthy claims directly from a large-scale information source, such as Twitter, remains underexplored. To fill this gap, we introduce MythQA, a new multi-answer open-domain question answering (QA) task that involves contradictory stance mining for query-based large-scale check-worthy claim detection. The idea behind this is that contradictory claims are a strong indicator of misinformation that merits scrutiny by the appropriate authorities. To study this task, we construct TweetMythQA, an evaluation dataset containing 522 factoid multi-answer questions based on controversial topics. Each question is annotated with multiple answers. Moreover, we collect relevant tweets for each distinct answer, then classify them into three categories: "Supporting", "Refuting", and "Neutral". In total, we annotated 5.3K tweets. Contradictory evidence is collected for all answers in the dataset. Finally, we present a baseline system for MythQA and evaluate existing NLP models for each system component using the TweetMythQA dataset. We provide initial benchmarks and identify key challenges for future models to improve upon. Code and data are available at: https://github.com/TonyBY/Myth-QA Comment: Accepted by SIGIR 202

arXiv.org e-Print Archive

Controversy Analysis and Detection

Author: Dori-Hacohen Shiri
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/11/2017

Seeking information on a controversial topic is often a complex task. Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the filter bubble effect, and therefore would be a useful feature in a search engine or browser extension. Additionally, presenting information to the user about the different stances or sides of the debate can help her navigate the landscape of search results beyond a simple list of 10 links. This thesis has made strides in the emerging niche of controversy detection and analysis. The body of work in this thesis revolves around two themes: computational models of controversy, and controversies occurring in neighborhoods of topics. Our broad contributions are: (1) Presenting a theoretical framework for modeling controversy as contention among populations; (2) Constructing the first automated approach to detecting controversy on the web, using a KNN classifier that maps from the web to similar Wikipedia articles; and (3) Proposing a novel controversy detection method for Wikipedia that employs a stacked model using a combination of link structure and similarity. We conclude this work by discussing the challenging technical, societal and ethical implications of this emerging research area and proposing avenues for future work.

ScholarWorks@UMass Amherst

Corpus Wide Argument Mining -- a Working Solution

Author: Aharonov Ranit
Alzate Carlos
Bilu Yonatan
Choshen Leshem
Dankin Lena
Ein-Dor Liat
Gera Ariel
Gleize Martin
Halfon Alon
Hou Yufang
Shnarch Eyal
Slonim Noam
Sznajder Benjamin
Publication date: 25/11/2019

One of the main tasks in argument mining is the retrieval of argumentative content pertaining to a given topic. Most previous work addressed this task by retrieving a relatively small number of relevant documents as the initial source for such content. This line of research yielded moderate success, which is of limited use in a real-world system. Furthermore, for such a system to yield a comprehensive set of relevant arguments over a wide range of topics, it must leverage a large and diverse corpus in an appropriate manner. Here we present a first end-to-end, high-precision, corpus-wide argument mining system. This is made possible by combining sentence-level queries over an appropriate indexing of a very large corpus of newspaper articles with an iterative annotation scheme. This scheme addresses the inherent label bias in the data and pinpoints the regions of the sample space whose manual labeling is required to obtain high precision among top-ranked candidates.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

A Wikipedia Literature Review

Author: Martin Owen S.
Publication date: 01/01/2010

This paper was originally written as a literature review for a doctoral dissertation focusing on Wikipedia. It describes the structure of Wikipedia and the latest trends in Wikipedia research.

arXiv.org e-Print Archive

CiteSeerX