5,129 research outputs found
Crowdsourcing Argumentation Structures in Chinese Hotel Reviews
Argumentation mining aims at automatically extracting the premises-claim
discourse structures in natural language texts. There is a great demand for
argumentation corpora for customer reviews. However, due to the controversial
nature of the argumentation annotation task, there exist very few large-scale
argumentation corpora for customer reviews. In this work, we take the novel
approach of using crowdsourcing to collect argumentation annotations in Chinese hotel
reviews. As the first Chinese argumentation dataset, our corpus includes 4814
argument component annotations and 411 argument relation annotations, and its
annotation quality is comparable to that of some widely used argumentation corpora
in other languages.
Comment: 6 pages, 3 figures. This article has been submitted to "The 2017 IEEE
International Conference on Systems, Man, and Cybernetics (SMC2017)"
Mining Legal Arguments in Court Decisions
Identifying, classifying, and analyzing arguments in legal discourse has been
a prominent area of research since the inception of the argument mining field.
However, there has been a major discrepancy between the way natural language
processing (NLP) researchers model and annotate arguments in court decisions
and the way legal experts understand and analyze legal argumentation. While
computational approaches typically simplify arguments into generic premises and
claims, arguments in legal research usually exhibit a rich typology that is
important for gaining insights into the particular case and applications of law
in general. We address this problem and make several substantial contributions
to move the field forward. First, we design a new annotation scheme for legal
arguments in proceedings of the European Court of Human Rights (ECHR) that is
deeply rooted in the theory and practice of legal argumentation research.
Second, we compile and annotate a large corpus of 373 court decisions (2.3M
tokens and 15k annotated argument spans). Finally, we train an argument mining
model that outperforms state-of-the-art models in the legal NLP domain and
provide a thorough expert-based evaluation. All datasets and source code are
available under open licenses at
https://github.com/trusthlt/mining-legal-arguments.
Comment: to appear in Artificial Intelligence and La
TEI and LMF crosswalks
The present paper explores various arguments in favour of making the Text
Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO
standard 24613:2008 (LMF, Lexical Mark-up Framework). It also identifies the
issues that would have to be resolved in order to reach an appropriate
implementation of these ideas, in particular in terms of informational
coverage. We show how the customisation facilities offered by the TEI
guidelines can provide an adequate background, not only to cover missing
components within the current Dictionary chapter of the TEI guidelines, but
also to allow specific lexical projects to deal with local constraints. We
expect this proposal to be a basis for a future ISO project in the context of
the ongoing revision of LMF.
Building a corpus of legal argumentation in Japanese judgement documents: towards structure-based summarisation
Learning Sentence-internal Temporal Relations
In this paper we propose a data intensive approach for inferring
sentence-internal temporal relations. Temporal inference is relevant for
practical NLP applications which either extract or synthesize temporal
information (e.g., summarisation, question answering). Our method bypasses the
need for manual coding by exploiting the presence of markers like "after", which
overtly signal a temporal relation. We first show that models trained on main
and subordinate clauses connected with a temporal marker achieve good
performance on a pseudo-disambiguation task simulating temporal inference
(during testing the temporal marker is treated as unseen and the models must
select the right marker from a set of possible candidates). Secondly, we assess
whether the proposed approach holds promise for the semi-automatic creation of
temporal annotations. Specifically, we use a model trained on noisy and
approximate data (i.e., main and subordinate clauses) to predict
intra-sentential relations present in TimeBank, a corpus annotated with rich
temporal information. Our experiments compare and contrast several
probabilistic models differing in their feature space, linguistic assumptions
and data requirements. We evaluate performance against gold standard corpora
and also against human subjects.
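The pseudo-disambiguation setup described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract only says that several probabilistic models were compared, so a plain naive Bayes classifier with add-one smoothing is used here as a stand-in. The training triples, the bag-of-words features, and the names `TRAIN`, `features`, `train`, and `predict` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (main clause, temporal marker, subordinate clause).
# In the setting described above, such triples would be harvested from
# sentences where the marker appears overtly, with no manual annotation.
TRAIN = [
    ("she left the office", "after", "the meeting ended"),
    ("he locked the door", "after", "everyone left"),
    ("they ate dinner", "before", "the film started"),
    ("she packed her bags", "before", "the taxi arrived"),
    ("he read a book", "while", "the baby was sleeping"),
    ("they talked quietly", "while", "the band was playing"),
]


def features(main, sub):
    """Bag-of-words features, kept separate for the two clauses."""
    return [f"main:{w}" for w in main.split()] + [f"sub:{w}" for w in sub.split()]


def train(examples):
    """Count markers and per-marker feature occurrences."""
    marker_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for main, marker, sub in examples:
        marker_counts[marker] += 1
        for feat in features(main, sub):
            feature_counts[marker][feat] += 1
            vocab.add(feat)
    return marker_counts, feature_counts, vocab


def predict(model, main, sub):
    """Treat the marker as unseen and pick the most probable candidate."""
    marker_counts, feature_counts, vocab = model
    total = sum(marker_counts.values())
    best_marker, best_score = None, float("-inf")
    for marker, count in marker_counts.items():
        score = math.log(count / total)  # log prior of the marker
        denom = sum(feature_counts[marker].values()) + len(vocab)
        for feat in features(main, sub):
            # Add-one smoothing so unseen words do not zero out a candidate.
            score += math.log((feature_counts[marker][feat] + 1) / denom)
        if score > best_score:
            best_marker, best_score = marker, score
    return best_marker


model = train(TRAIN)
print(predict(model, "they sang quietly", "the baby was sleeping"))
print(predict(model, "he locked the office", "the meeting ended"))
```

At test time the marker is hidden and the model chooses among the candidates seen in training, which mirrors the task as the abstract describes it. A realistic system would need far more data and richer features than this toy bag of words.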
An empirical evaluation of AMR parsing for legal documents
Many approaches have recently been proposed for Abstract Meaning
Representation (AMR) parsing, which helps solve various natural language
processing problems. In this paper, we provide an overview of different AMR
parsing methods and their performance when analyzing legal documents. We run
experiments with different AMR parsers on our annotated dataset extracted from
the English version of the Japanese Civil Code. Our results show the limitations
of current parsing techniques, as well as room for improvement, when they are
applied to this complicated domain.