Multilingual Models for Compositional Distributed Semantics
We present a novel technique for learning semantic representations, which
extends the distributional hypothesis to multilingual data and joint-space
embeddings. Our models leverage parallel data and learn to strongly align the
embeddings of semantically equivalent sentences, while maintaining sufficient
distance between those of dissimilar sentences. The models do not rely on word
alignments or any syntactic information and are successfully applied to a
number of diverse languages. We extend our approach to learn semantic
representations at the document level, too. We evaluate these models on two
cross-lingual document classification tasks, outperforming the prior state of
the art. Through qualitative analysis and the study of pivoting effects we
demonstrate that our representations are semantically plausible and can capture
semantic relationships across languages without parallel data.
Comment: Proceedings of ACL 2014 (Long papers
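
The training objective described in this abstract (pull the embeddings of equivalent sentences together, keep dissimilar ones apart) can be illustrated with a margin-based hinge loss. This is a minimal sketch, not the paper's implementation: the additive composition, the squared Euclidean distance, the margin value, and the names `compose` and `margin_loss` are all assumptions made for illustration.

```python
def compose(word_vectors):
    """Sum word vectors into one sentence vector (assumed additive composition)."""
    return [sum(dims) for dims in zip(*word_vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def margin_loss(src, tgt, noise, margin=1.0):
    """Hinge loss: zero once the noise sentence is at least `margin`
    further from src (in squared distance) than the parallel sentence tgt."""
    return max(0.0, margin + squared_distance(src, tgt) - squared_distance(src, noise))


# Toy 2-dimensional word vectors (hypothetical values).
english = compose([[1.0, 0.0], [0.0, 1.0]])
german = compose([[1.0, 0.0], [0.0, 1.0]])      # translation of `english`
unrelated = compose([[-1.0, 0.0], [0.0, -1.0]])  # randomly sampled noise sentence

loss_aligned = margin_loss(english, german, unrelated)
loss_violated = margin_loss(english, unrelated, german)
```

In this toy setup the loss vanishes when the translation is the closer sentence and becomes positive when the roles of translation and noise sentence are swapped.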
Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!
Argumentation mining (AM) requires the identification of complex discourse
structures and has lately been applied with success monolingually. In this
work, we show that the existing resources are, however, not adequate for
assessing cross-lingual AM, due to their heterogeneity or lack of complexity.
We therefore create suitable parallel corpora by (human and machine)
translating a popular AM dataset consisting of persuasive student essays into
German, French, Spanish, and Chinese. We then compare (i) annotation projection
and (ii) bilingual word embeddings based direct transfer strategies for
cross-lingual AM, finding that the former performs considerably better and
almost eliminates the loss from cross-lingual transfer. Moreover, we find that
annotation projection works equally well when using either costly human or
cheap machine translations. Our code and data are available at
http://github.com/UKPLab/coling2018-xling_argument_mining.
Comment: Accepted at Coling 201
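
Annotation projection, the first strategy compared in this abstract, can be sketched as copying token-level labels from a source sentence onto its translation. This is a simplified illustration, not the authors' code: it assumes a word alignment is already available as (source index, target index) pairs, and the function name `project_labels`, the label set, and the example sentences are hypothetical.

```python
def project_labels(src_labels, alignment, tgt_len, default="O"):
    """Copy each source token's label to the target token it is aligned with.

    Target tokens without an alignment keep the `default` label; if several
    source tokens align to one target token, the first non-default label wins.
    """
    tgt_labels = [default] * tgt_len
    for src_idx, tgt_idx in sorted(alignment):
        if tgt_labels[tgt_idx] == default:
            tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels


# Hypothetical English sentence with token-level argument labels.
src_tokens = ["Clearly", "smoking", "should", "be", "banned"]
src_labels = ["O", "Claim", "Claim", "Claim", "Claim"]

# Hypothetical German translation and word alignment (source index, target index).
tgt_tokens = ["Rauchen", "sollte", "eindeutig", "verboten", "werden"]
alignment = [(0, 2), (1, 0), (2, 1), (3, 4), (4, 3)]

projected = project_labels(src_labels, alignment, len(tgt_tokens))
```

A model for the target language can then be trained on the projected labels as if they were gold annotations.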