Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
Chandra News
The Chandra Newsletter contains articles about the CXC and the Chandra mission. The Chandra Newsletter appears once a year and is edited by Paul J. Green, with editorial assistance and layout by Evan Tingle. We welcome contributions from readers. Comments on the newsletter, or corrections and additions to the hardcopy mailing list should be sent to: [email protected]
Fractional norms and quasinorms do not help to overcome the curse of dimensionality
The curse of dimensionality causes well-known and widely discussed
problems for machine learning methods. There is a hypothesis that using the
Manhattan distance and even fractional quasinorms lp (for p less than 1) can
help to overcome the curse of dimensionality in classification problems. In
this study, we systematically test this hypothesis. We confirm that fractional
quasinorms have a greater relative contrast or coefficient of variation than
the Euclidean norm l2, but we also demonstrate that the distance concentration
shows qualitatively the same behaviour for all tested norms and quasinorms and
the difference between them decays as the dimension tends to infinity. Estimation
of classification quality for kNN based on different norms and quasinorms shows
that a greater relative contrast does not mean better classifier performance
and the worst performance for different databases was shown by different norms
(quasinorms). A systematic comparison shows that the difference in the
performance of kNN based on lp for p=2, 1, and 0.5 is statistically
insignificant.
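The relative contrast mentioned in the abstract can be illustrated with a small sketch. It assumes the common definition (D_max - D_min) / D_min over the distances from one query point to a sample, and it uses uniformly random points in the unit cube with the origin as the query. The function names, the sample size, and the data are illustrative assumptions, not the authors' experimental setup.

```python
import random


def lp_distance(x, y, p):
    """l_p distance between two points; for p < 1 this is only a quasinorm."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def relative_contrast(points, query, p):
    """(D_max - D_min) / D_min over the l_p distances from query to points."""
    dists = [lp_distance(pt, query, p) for pt in points]
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


def contrast_table(dims=(2, 10, 100), ps=(0.5, 1.0, 2.0), n_points=200, seed=0):
    """Relative contrast for each (dimension, p) pair on synthetic data.

    Assumption: points are drawn uniformly from [0, 1]^d and the query
    is the origin. Real data sets may behave differently.
    """
    rng = random.Random(seed)
    table = {}
    for d in dims:
        points = [[rng.random() for _ in range(d)] for _ in range(n_points)]
        query = [0.0] * d
        for p in ps:
            table[(d, p)] = relative_contrast(points, query, p)
    return table


if __name__ == "__main__":
    for (d, p), rc in sorted(contrast_table().items()):
        print(f"d={d:4d}  p={p:3.1f}  relative contrast={rc:.3f}")
```

Running the script prints one relative-contrast value per dimension and per p, which makes it easy to see how the contrast shrinks as the dimension grows for every tested p. This only illustrates distance concentration and says nothing about kNN classification quality, which the study evaluates separately.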