5,129 research outputs found

    Crowdsourcing Argumentation Structures in Chinese Hotel Reviews

    Full text link
    Argumentation mining aims at automatically extracting the premises-claim discourse structures in natural language texts. There is a great demand for argumentation corpora for customer reviews. However, due to the controversial nature of the argumentation annotation task, there exist very few large-scale argumentation corpora for customer reviews. In this work, we novelly use the crowdsourcing technique to collect argumentation annotations in Chinese hotel reviews. As the first Chinese argumentation dataset, our corpus includes 4814 argument component annotations and 411 argument relation annotations, and its annotations qualities are comparable to some widely used argumentation corpora in other languages.Comment: 6 pages,3 figures,This article has been submitted to "The 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC2017)

    Mining Legal Arguments in Court Decisions

    Get PDF
    Identifying, classifying, and analyzing arguments in legal discourse has been a prominent area of research since the inception of the argument mining field. However, there has been a major discrepancy between the way natural language processing (NLP) researchers model and annotate arguments in court decisions and the way legal experts understand and analyze legal argumentation. While computational approaches typically simplify arguments into generic premises and claims, arguments in legal research usually exhibit a rich typology that is important for gaining insights into the particular case and applications of law in general. We address this problem and make several substantial contributions to move the field forward. First, we design a new annotation scheme for legal arguments in proceedings of the European Court of Human Rights (ECHR) that is deeply rooted in the theory and practice of legal argumentation research. Second, we compile and annotate a large corpus of 373 court decisions (2.3M tokens and 15k annotated argument spans). Finally, we train an argument mining model that outperforms state-of-the-art models in the legal NLP domain and provide a thorough expert-based evaluation. All datasets and source codes are available under open lincenses at https://github.com/trusthlt/mining-legal-arguments.Comment: to appear in Artificial Intelligence and La

    TEI and LMF crosswalks

    Get PDF
    The present paper explores various arguments in favour of making the Text Encoding Initia-tive (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexi-cal Mark-up Framework) . It also identifies the issues that would have to be resolved in order to reach an appropriate implementation of these ideas, in particular in terms of infor-mational coverage. We show how the customisation facilities offered by the TEI guidelines can provide an adequate background, not only to cover missing components within the current Dictionary chapter of the TEI guidelines, but also to allow specific lexical projects to deal with local constraints. We expect this proposal to be a basis for a future ISO project in the context of the on going revision of LMF

    Learning Sentence-internal Temporal Relations

    Get PDF
    In this paper we propose a data intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects

    An empirical evaluation of AMR parsing for legal documents

    Full text link
    Many approaches have been proposed to tackle the problem of Abstract Meaning Representation (AMR) parsing, helps solving various natural language processing issues recently. In our paper, we provide an overview of different methods in AMR parsing and their performances when analyzing legal documents. We conduct experiments of different AMR parsers on our annotated dataset extracted from the English version of Japanese Civil Code. Our results show the limitations as well as open a room for improvements of current parsing techniques when applying in this complicated domain

    Thirty years of artificial intelligence and law : the third decade

    Get PDF
    corecore