7,552 research outputs found
Crowdsourcing Argumentation Structures in Chinese Hotel Reviews
Argumentation mining aims at automatically extracting the premises-claim
discourse structures in natural language texts. There is a great demand for
argumentation corpora for customer reviews. However, due to the controversial
nature of the argumentation annotation task, there exist very few large-scale
argumentation corpora for customer reviews. In this work, we novelly use the
crowdsourcing technique to collect argumentation annotations in Chinese hotel
reviews. As the first Chinese argumentation dataset, our corpus includes 4814
argument component annotations and 411 argument relation annotations, and its
annotations qualities are comparable to some widely used argumentation corpora
in other languages.Comment: 6 pages,3 figures,This article has been submitted to "The 2017 IEEE
International Conference on Systems, Man, and Cybernetics (SMC2017)
Disagreement dissected : vagueness as a source of ambiguity in nominal (co-)reference
Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, considering a real-world example from newspaper text
Building a resource for studying translation shifts
This paper describes an interdisciplinary approach which brings together the
fields of corpus linguistics and translation studies. It presents ongoing work
on the creation of a corpus resource in which translation shifts are explicitly
annotated. Translation shifts denote departures from formal correspondence
between source and target text, i.e. deviations that have occurred during the
translation process. A resource in which such shifts are annotated in a
systematic way will make it possible to study those phenomena that need to be
addressed if machine translation output is to resemble human translation. The
resource described in this paper contains English source texts (parliamentary
proceedings) and their German translations. The shift annotation is based on
predicate-argument structures and proceeds in two steps: first, predicates and
their arguments are annotated monolingually in a straightforward manner. Then,
the corresponding English and German predicates and arguments are aligned with
each other. Whenever a shift - mainly grammatical or semantic -has occurred,
the alignment is tagged accordingly.Comment: 6 pages, 1 figur
Exploring lexical patterns in text : lexical cohesion analysis with WordNet
We present a system for the linguistic exploration and analysis of lexical cohesion in English texts. Using an electronic thesaurus-like resource, Princeton WordNet, and the Brown Corpus of English, we have implemented a process of annotating text with lexical chains and a graphical user interface for inspection of the annotated text. We describe the system and report on some sample linguistic analyses carried out using the combined thesaurus-corpus resource
- …