Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
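The argument component identification step described above is commonly cast as BIO-style sequence labeling. The minimal sketch below (with an invented example sentence and a Claim/Premise label scheme, not necessarily the article's exact inventory) shows how a labeled token sequence decodes into argument components:

```python
# Sketch: argument component identification as BIO token labeling.
# The toy sentence and labels are illustrative, not from the corpus.

def bio_spans(tokens, labels):
    """Collect (component_type, token_span) pairs from a BIO label sequence."""
    spans, start, kind = [], None, None
    for i, lab in enumerate(labels + ["O"]):  # sentinel flushes the last span
        if lab.startswith("B-"):
            if start is not None:
                spans.append((kind, tokens[start:i]))
            start, kind = i, lab[2:]
        elif lab == "O" and start is not None:
            spans.append((kind, tokens[start:i]))
            start = None
    return spans

tokens = "Smoking should be banned because it harms bystanders".split()
labels = ["B-Claim", "I-Claim", "I-Claim", "I-Claim",
          "O", "B-Premise", "I-Premise", "I-Premise"]
print(bio_spans(tokens, labels))
```

In practice the labels come from a trained sequence model; the decoding step above is the same regardless of which classifier produces them.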
An Enquiry Meet for the Case: Decision Theory, Presumptions, and Evidentiary Burdens in Formulating Antitrust Legal Standards
Presumptions have an important role in antitrust jurisprudence. This article suggests that a careful formulation of the relevant presumptions and associated evidentiary rebuttal burdens can provide the “enquiry meet for the case” across a large array of narrow categories of conduct confronted in antitrust to create a type of “meta” rule of reason. The article begins this project by using decision theory to analyze the types and properties of antitrust presumptions and evidentiary rebuttal burdens and the relationship between them. Depending on the category of conduct and market structure conditions, antitrust presumptions lie along a continuum from conclusive (irrebuttable) anticompetitive, to rebuttable anticompetitive, to competitively neutral, and on to rebuttable procompetitive and conclusive (irrebuttable) procompetitive presumptions. A key source of these presumptions is the likely competitive effects inferred from market conditions. Other sources are policy-based -- deterrence policy concerns and overarching policies involving the goals and premises of antitrust jurisprudence. Rebuttal evidence can either undermine the facts on which the presumptions are based or can provide other evidence to offset the competitive effects likely implied by the presumption. The evidentiary burden to rebut a presumption depends on the strength of the presumption and the availability and reliability of further case-specific evidence. These twin determinants can be combined and understood through the lens of Bayesian decision theory to explain how “the quality of proof required should vary with the circumstances.” The stronger the presumption and less reliable the case-specific evidence in signaling whether the conduct is anticompetitive versus procompetitive, the more difficult it will be for the disfavored party to satisfy the evidentiary burden to rebut the presumption. 
The evidentiary rebuttal burden generally is a burden of production, but also can involve the burden of persuasion, as with the original Philadelphia National Bank structural presumption, or typical procompetitive presumptions. If a presumption is rebutted with sufficient offsetting evidence to avoid an initial judgment, the presumption generally continues to carry some weakened weight in the post-rebuttal phase of the decision process. That is, a thumb remains on the scale. However, if the presumption is undermined, it is discredited and it carries no weight in the post-rebuttal decision process. The article uses this methodology to analyze various antitrust presumptions. It also analyzes the burden-shifting rule of reason and suggests that the elements should not be rigidly sequenced in the decision process. The article also begins the project of reviewing, revising, and refining existing antitrust presumptions with proposed revisions and refinements in a number of areas. The article invites other commentators to join the project by criticizing these proposals and suggesting others. These presumptions then could be applied by appellate courts and relied upon by lower courts, litigants, and business planners.
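The article's Bayesian framing (a presumption sets the prior; rebuttal evidence supplies a likelihood ratio; their combination determines the outcome) can be illustrated numerically. The probabilities and likelihood ratio below are invented for illustration, not values from the article:

```python
# Toy Bayesian reading of antitrust presumptions (illustrative numbers only):
# a presumption fixes prior odds that conduct is anticompetitive; rebuttal
# evidence supplies a likelihood ratio (< 1 favors the procompetitive
# explanation); the posterior is what the decision turns on.

def posterior_p(prior_p, likelihood_ratio):
    """Posterior probability of 'anticompetitive' after rebuttal evidence."""
    prior_odds = prior_p / (1 - prior_p)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Same rebuttal evidence against a weak vs a strong presumption:
weak = posterior_p(0.60, 0.25)    # weak presumption: rebuttal succeeds
strong = posterior_p(0.90, 0.25)  # strong presumption: still above 0.5
print(weak, strong)
```

The asymmetry matches the article's point: the stronger the presumption, the more probative the case-specific evidence must be before the posterior crosses the decision threshold.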
MReD: A Meta-Review Dataset for Structure-Controllable Text Generation
When directly using existing text generation datasets for controllable
generation, we face the problem of lacking domain knowledge, which limits
the aspects that can be controlled. A typical example is the CNN/Daily Mail
dataset for controllable text summarization, which offers no guiding
information on the emphasis of summary sentences. A more useful text
generator should leverage both the input text and the control signal to guide
the generation, which can only be built with a deep understanding of the domain
knowledge. Motivated by this vision, our paper introduces a new text generation
dataset, named MReD. Our new dataset consists of 7,089 meta-reviews, and all
of its 45k meta-review sentences are manually annotated with one of 9
carefully defined categories, including abstract, strength, decision, etc. We
present experimental results on state-of-the-art summarization models, and propose
methods for structure-controlled generation with both extractive and
abstractive models using our annotated data. By exploring various settings and
analyzing the model behavior with respect to the control signal, we demonstrate
the challenges of our proposed task and the value of our dataset MReD.
Meanwhile, MReD also allows us to gain a better understanding of the
meta-review domain.
Comment: 15 pages, 5 figures, accepted at ACL 202
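One common way to realize structure-controlled generation with sequence-to-sequence summarizers is to prepend the desired sentence-label sequence to the source text. The sketch below assumes that framing; the separator token and label markup are assumptions for illustration, not MReD's actual preprocessing:

```python
# Sketch of structure control for a seq2seq summarizer: the control sequence
# of sentence categories is prepended to the source reviews, asking the model
# to generate a meta-review that follows that structure. Label names mirror
# categories mentioned in the abstract; the markup is hypothetical.

def build_controlled_input(control_labels, source_text, sep="|"):
    prefix = " ".join(f"<{lab}>" for lab in control_labels)
    return f"{prefix} {sep} {source_text}"

inp = build_controlled_input(
    ["abstract", "strength", "decision"],
    "Review 1: solid idea ... Review 2: weak evaluation ...")
print(inp)
```

The same control string also supports extractive variants, e.g. selecting one source sentence per requested category instead of generating freely.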
MOPRD: A multidisciplinary open peer review dataset
Open peer review is a growing trend in academic publications. Public access
to peer review data can benefit both the academic and publishing communities.
It also supports studies on review comment generation and, further, the
realization of automated scholarly paper review. However, most of the
existing peer review datasets do not provide data that cover the whole peer
review process. In addition, their data are not sufficiently diverse, as
they are mainly collected from the field of computer science. These two
drawbacks of the currently available peer review datasets need to be
addressed to unlock more opportunities for related studies. In response to these problems,
we construct MOPRD, a multidisciplinary open peer review dataset. This dataset
consists of paper metadata, multiple version manuscripts, review comments,
meta-reviews, author's rebuttal letters, and editorial decisions. Moreover, we
design a modular guided review comment generation method based on MOPRD.
Experiments show that our method delivers better performance, as indicated
by both automatic metrics and human evaluation. We also explore other
potential applications of MOPRD, including meta-review generation, editorial
decision prediction, author rebuttal generation, and scientometric analysis.
MOPRD provides strong support for further studies in peer review-related
research and other applications.
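The record fields enumerated in the abstract suggest a per-paper layout along the following lines. This is a hypothetical schema for illustration, not MOPRD's actual file format:

```python
# Hypothetical record layout covering the fields the MOPRD abstract lists:
# paper metadata, multiple manuscript versions, review comments, meta-review,
# rebuttal letters, and the editorial decision. Field names are invented.
from dataclasses import dataclass, field

@dataclass
class PeerReviewRecord:
    paper_metadata: dict                  # title, venue, discipline, ...
    manuscript_versions: list             # full text of each submitted version
    review_comments: list                 # per-reviewer comments
    meta_review: str                      # editor/chair synthesis of reviews
    rebuttal_letters: list = field(default_factory=list)
    editorial_decision: str = "pending"   # e.g. accept / revise / reject

rec = PeerReviewRecord(
    paper_metadata={"title": "A toy paper", "discipline": "biology"},
    manuscript_versions=["v1 text", "v2 text"],
    review_comments=["R1: clarify methods", "R2: add baselines"],
    meta_review="Both reviewers ask for clearer methodology.",
    rebuttal_letters=["We added Section 3.2 ..."],
    editorial_decision="revise")
print(rec.editorial_decision)
```

A structure like this makes the downstream tasks the abstract mentions (meta-review generation, decision prediction, rebuttal generation) straightforward input/output pairings over the same record.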
AMIR: Automated MisInformation Rebuttal -- A COVID-19 Vaccination Datasets based Recommendation System
Misinformation has emerged as a major societal threat in recent years;
in the context of the COVID-19 pandemic specifically, it has wreaked
havoc, for instance by fuelling vaccine hesitancy. Cost-effective, scalable
solutions for combating misinformation are the need of the hour. This work
explored how existing information obtained from social media and augmented with
more curated fact checked data repositories can be harnessed to facilitate
automated rebuttal of misinformation at scale. While the ideas herein can be
generalized and reapplied in the broader context of misinformation mitigation
using a multitude of information sources and catering to the spectrum of social
media platforms, this work serves as a proof of concept, and as such, it is
confined in its scope to only rebuttal of tweets, and in the specific context
of misinformation regarding COVID-19. It leverages two publicly available
datasets, viz. FaCov (fact-checked articles) and misleading (social media
Twitter) data on COVID-19 Vaccination.
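At its core, matching a misleading tweet against a repository of fact-checked articles is a retrieval problem. A minimal bag-of-words sketch of that step follows; the articles and tweet are invented stand-ins, not FaCov or AMIR data, and the real pipeline is richer than cosine similarity over raw words:

```python
# Minimal retrieval sketch of the rebuttal idea: find the fact-checked
# article most similar to a (mis)leading tweet by bag-of-words cosine
# similarity. Texts below are invented placeholders.
import math
from collections import Counter

def cosine(a, b):
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

fact_checks = [
    "covid vaccines do not alter human dna",
    "masks reduce transmission of respiratory viruses",
]
tweet = "heard the covid vaccines alter your dna is that true"
best = max(fact_checks, key=lambda art: cosine(tweet, art))
print(best)
```

The retrieved article (or a summary of it) then serves as the rebuttal content recommended for the tweet.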
Does My Rebuttal Matter? Insights from a Major NLP Conference
Peer review is a core element of the scientific process, particularly in
conference-centered fields such as ML and NLP. However, only a few studies have
evaluated its properties empirically. Aiming to fill this gap, we present a
corpus that contains over 4k reviews and 1.2k author responses from ACL-2018.
We quantitatively and qualitatively assess the corpus. This includes a pilot
study on paper weaknesses given by reviewers and on the quality of author
responses. We then focus on the role of the rebuttal phase, and propose a novel
task to predict after-rebuttal (i.e., final) scores from initial reviews and
author responses. Although author responses do have a marginal (and
statistically significant) influence on the final scores, especially for
borderline papers, our results suggest that a reviewer's final score is largely
determined by her initial score and the distance to the other reviewers'
initial scores. In this context, we discuss the conformity bias inherent to
peer reviewing, a bias that has largely been overlooked in previous research.
We hope our analyses will help better assess the usefulness of the rebuttal
phase in NLP conferences.
Comment: Accepted to NAACL-HLT 2019. Main paper plus supplementary material
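The paper's central finding (a final score largely determined by the initial score and its distance to the other reviewers' initial scores, with only a marginal author-response effect) can be written as a toy linear predictor. The weights below are invented for illustration, not fitted on the ACL-2018 corpus:

```python
# Toy predictor mirroring the finding: the final score is mostly the initial
# score pulled toward the other reviewers' mean (conformity), with the author
# response contributing only a small additive term. Weights are invented.

def predict_final(initial, other_initials, response_quality,
                  conformity=0.3, resp_weight=0.05):
    peer_mean = sum(other_initials) / len(other_initials)
    return (initial
            + conformity * (peer_mean - initial)   # conformity pull
            + resp_weight * response_quality)      # marginal rebuttal effect

# Borderline paper: reviewer at 3, peers at 4 and 4.5, decent response (+1).
print(predict_final(3.0, [4.0, 4.5], 1.0))
```

With these weights the conformity pull moves the score far more than the response term does, which is the qualitative shape of the reported result.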