
    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
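    Identifying argument components is commonly framed as BIO-style token labeling (the tag set and sentence below are illustrative assumptions, not this paper's exact scheme). A minimal sketch of recovering component spans from such labels:

    ```python
    # Hypothetical BIO-labeled sentence; "Claim" and "Premise" are
    # illustrative component types, not the paper's exact tag set.
    tokens = ["Smoking", "should", "be", "banned", "because", "it", "harms", "others"]
    labels = ["B-Claim", "I-Claim", "I-Claim", "I-Claim",
              "B-Premise", "I-Premise", "I-Premise", "I-Premise"]

    def spans_from_bio(tokens, labels):
        """Group BIO-tagged tokens into (component_type, text) spans."""
        spans, current_type, current_tokens = [], None, []
        for tok, lab in zip(tokens, labels):
            if lab.startswith("B-"):
                if current_tokens:
                    spans.append((current_type, " ".join(current_tokens)))
                current_type, current_tokens = lab[2:], [tok]
            elif lab.startswith("I-") and current_type == lab[2:]:
                current_tokens.append(tok)
            else:  # "O" or an inconsistent tag closes the open span
                if current_tokens:
                    spans.append((current_type, " ".join(current_tokens)))
                current_type, current_tokens = None, []
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))
        return spans

    print(spans_from_bio(tokens, labels))
    ```

    A sequence classifier would predict the labels; the span-grouping step above is what turns token-level predictions into argument components.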

    An Enquiry Meet for the Case: Decision Theory, Presumptions, and Evidentiary Burdens in Formulating Antitrust Legal Standards

    Presumptions have an important role in antitrust jurisprudence. This article suggests that a careful formulation of the relevant presumptions and associated evidentiary rebuttal burdens can provide the “enquiry meet for the case” across a large array of narrow categories of conduct confronted in antitrust to create a type of “meta” rule of reason. The article begins this project by using decision theory to analyze the types and properties of antitrust presumptions and evidentiary rebuttal burdens and the relationship between them. Depending on the category of conduct and market structure conditions, antitrust presumptions lie along a continuum from conclusive (irrebuttable) anticompetitive, to rebuttable anticompetitive, to competitively neutral, and on to rebuttable procompetitive and conclusive (irrebuttable) procompetitive presumptions. A key source of these presumptions is the likely competitive effects inferred from market conditions. Other sources are policy-based -- deterrence policy concerns and overarching policies involving the goals and premises of antitrust jurisprudence. Rebuttal evidence can either undermine the facts on which the presumptions are based or can provide other evidence to offset the competitive effects likely implied by the presumption. The evidentiary burden to rebut a presumption depends on the strength of the presumption and the availability and reliability of further case-specific evidence. These twin determinants can be combined and understood through the lens of Bayesian decision theory to explain how “the quality of proof required should vary with the circumstances.” The stronger the presumption and less reliable the case-specific evidence in signaling whether the conduct is anticompetitive versus procompetitive, the more difficult it will be for the disfavored party to satisfy the evidentiary burden to rebut the presumption. 
The evidentiary rebuttal burden generally is a burden of production, but it also can involve the burden of persuasion, as with the original Philadelphia National Bank structural presumption, or typical procompetitive presumptions. If a presumption is rebutted with sufficient offsetting evidence to avoid an initial judgment, the presumption generally continues to carry some weakened weight in the post-rebuttal phase of the decision process. That is, a thumb remains on the scale. However, if the presumption is undermined, it is discredited and carries no weight in the post-rebuttal decision process. The article uses this methodology to analyze various antitrust presumptions. It also analyzes the burden-shifting rule of reason and suggests that the elements should not be rigidly sequenced in the decision process. The article also begins the project of reviewing, revising, and refining existing antitrust presumptions, with proposed revisions and refinements in a number of areas. The article invites other commentators to join the project by criticizing these proposals and suggesting others. These presumptions then could be applied by appellate courts and relied upon by lower courts, litigants, and business planners.
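The Bayesian framing in the abstract can be sketched numerically (this is an illustrative model with assumed odds, not the article's own figures): a presumption acts as prior odds that conduct is anticompetitive, and case-specific rebuttal evidence acts as a likelihood ratio.

```python
# Illustrative Bayesian sketch (assumed numbers, not from the article):
# a presumption is modeled as prior odds that conduct is anticompetitive;
# rebuttal evidence is a likelihood ratio. Posterior odds show why a
# stronger presumption raises the effective rebuttal burden.
def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior = prior * likelihood ratio."""
    return prior_odds * likelihood_ratio

# Strong anticompetitive presumption (4:1) vs. a weak one (3:2),
# each met with the same procompetitive evidence (LR = 0.5).
strong = posterior_odds(4.0, 0.5)  # posterior still favors liability
weak = posterior_odds(1.5, 0.5)    # posterior now below even odds
print(strong, weak)
```

The same evidence that overcomes a weak presumption leaves a strong one intact, which mirrors the article's point that the required "quality of proof" varies with the strength of the presumption and the reliability of the evidence.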

    MReD: A Meta-Review Dataset for Structure-Controllable Text Generation

    When directly using existing text generation datasets for controllable generation, we face the problem of lacking domain knowledge, which limits the aspects that can be controlled. A typical example is using the CNN/Daily Mail dataset for controllable text summarization, where there is no guiding information on the emphasis of summary sentences. A more useful text generator should leverage both the input text and the control signal to guide the generation, which can only be built with a deep understanding of the domain knowledge. Motivated by this vision, our paper introduces a new text generation dataset, named MReD. Our new dataset consists of 7,089 meta-reviews, and all of its 45k meta-review sentences are manually annotated with one of 9 carefully defined categories, including abstract, strength, decision, etc. We present experimental results on state-of-the-art summarization models, and propose methods for structure-controlled generation with both extractive and abstractive models using our annotated data. By exploring various settings and analyzing the model behavior with respect to the control signal, we demonstrate the challenges of our proposed task and the value of our dataset MReD. Meanwhile, MReD also allows us to have a better understanding of the meta-review domain. Comment: 15 pages, 5 figures, accepted at ACL 202
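    One simple way to realize such a control signal is to serialize the target sentence-category sequence into a textual prefix that conditions a seq2seq summarizer. The separator tokens and category names below are assumptions for illustration, not MReD's exact input format:

    ```python
    # Hedged sketch: serialize a desired category sequence into a control
    # prefix for a seq2seq model. Separators and category names are
    # illustrative assumptions, not MReD's exact serialization.
    def build_controlled_input(reviews, structure):
        """Prepend the target sentence-category sequence to the input text."""
        control = " | ".join(structure)
        return f"Structure: {control} || " + " ".join(reviews)

    reviews = ["R1: novel idea but weak baselines.", "R2: solid writing."]
    structure = ["abstract", "strength", "weakness", "decision"]
    print(build_controlled_input(reviews, structure))
    ```

    At generation time, the model learns to emit one sentence per category in the requested order, which is what makes the output structure controllable.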

    MOPRD: A multidisciplinary open peer review dataset

    Open peer review is a growing trend in academic publications. Public access to peer review data can benefit both the academic and publishing communities. It also strongly supports studies on review comment generation and, further, the realization of automated scholarly paper review. However, most existing peer review datasets do not provide data that cover the whole peer review process. Apart from this, their data are not diverse enough, as they are mainly collected from the field of computer science. These two drawbacks of the currently available peer review datasets need to be addressed to unlock more opportunities for related studies. In response to this problem, we construct MOPRD, a multidisciplinary open peer review dataset. This dataset consists of paper metadata, multiple versions of manuscripts, review comments, meta-reviews, authors' rebuttal letters, and editorial decisions. Moreover, we design a modular guided review comment generation method based on MOPRD. Experiments show that our method delivers better performance, as indicated by both automatic metrics and human evaluation. We also explore other potential applications of MOPRD, including meta-review generation, editorial decision prediction, author rebuttal generation, and scientometric analysis. MOPRD offers strong support for further studies in peer review-related research and other applications.

    AMIR: Automated MisInformation Rebuttal -- A COVID-19 Vaccination Datasets based Recommendation System

    Misinformation has emerged as a major societal threat in recent years in general; specifically in the context of the COVID-19 pandemic, it has wreaked havoc, for instance, by fuelling vaccine hesitancy. Cost-effective, scalable solutions for combating misinformation are the need of the hour. This work explored how existing information obtained from social media, augmented with more curated fact-checked data repositories, can be harnessed to facilitate automated rebuttal of misinformation at scale. While the ideas herein can be generalized and reapplied in the broader context of misinformation mitigation using a multitude of information sources and catering to the spectrum of social media platforms, this work serves as a proof of concept, and as such, it is confined in its scope to the rebuttal of tweets, and to the specific context of misinformation regarding COVID-19. It leverages two publicly available datasets, viz. FaCov (fact-checked articles) and misleading (social media Twitter) data on COVID-19 vaccination.
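    The core retrieval step of such a system can be sketched minimally (an assumed simplification, not AMIR's actual pipeline): match a misleading tweet to the most similar fact-checked article and surface it as the rebuttal source. A real system would use learned embeddings; plain token-overlap similarity keeps the sketch self-contained.

    ```python
    # Minimal retrieval sketch (assumed, simplified): pair a tweet with the
    # closest fact-checked claim by Jaccard token overlap. Documents below
    # are invented examples, not entries from FaCov.
    def jaccard(a, b):
        """Jaccard similarity between the token sets of two strings."""
        a, b = set(a.lower().split()), set(b.lower().split())
        return len(a & b) / len(a | b)

    fact_checks = [
        "covid-19 vaccines do not alter human dna",
        "masks reduce transmission of respiratory viruses",
    ]
    tweet = "the vaccine will alter your dna forever"
    best = max(fact_checks, key=lambda doc: jaccard(tweet, doc))
    print(best)
    ```

    The retrieved article then seeds the automated rebuttal, e.g., as a reply linking the fact-check to the tweet.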

    Does My Rebuttal Matter? Insights from a Major NLP Conference

    Peer review is a core element of the scientific process, particularly in conference-centered fields such as ML and NLP. However, only a few studies have evaluated its properties empirically. Aiming to fill this gap, we present a corpus that contains over 4k reviews and 1.2k author responses from ACL-2018. We quantitatively and qualitatively assess the corpus. This includes a pilot study on paper weaknesses given by reviewers and on the quality of author responses. We then focus on the role of the rebuttal phase, and propose a novel task to predict after-rebuttal (i.e., final) scores from initial reviews and author responses. Although author responses do have a marginal (and statistically significant) influence on the final scores, especially for borderline papers, our results suggest that a reviewer's final score is largely determined by her initial score and the distance to the other reviewers' initial scores. In this context, we discuss the conformity bias inherent to peer reviewing, a bias that has largely been overlooked in previous research. We hope our analyses will help better assess the usefulness of the rebuttal phase in NLP conferences. Comment: Accepted to NAACL-HLT 2019. Main paper plus supplementary material
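    The finding can be caricatured as a tiny linear model (coefficients below are assumed for illustration, not fitted on the corpus): the final score is the initial score pulled toward the other reviewers' mean, with the rebuttal contributing only a small adjustment.

    ```python
    # Illustrative linear sketch of the conformity finding. The weights
    # (conformity=0.3, rebuttal_weight=0.05) are assumptions chosen to
    # reflect "large conformity pull, marginal rebuttal effect".
    def predict_final(initial, others_mean, rebuttal_quality=0.0,
                      conformity=0.3, rebuttal_weight=0.05):
        """Final score = initial score pulled toward peers, plus a small
        rebuttal adjustment."""
        return (initial
                + conformity * (others_mean - initial)
                + rebuttal_weight * rebuttal_quality)

    # Borderline paper: reviewer at 3, peers average 4, strong rebuttal (+1).
    print(predict_final(3.0, 4.0, rebuttal_quality=1.0))
    ```

    Even a maximally strong rebuttal moves the score far less than the conformity term, which is the qualitative shape of the paper's result.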