German Perception Verbs: Automatic Classification of Prototypical and Multiple Non-literal Meanings
This paper presents a token-based automatic classification of German perception verbs into literal vs. multiple non-literal senses. Based on a corpus-based dataset of German perception verbs and their systematic meaning shifts, we identify one verb from each of the four perception classes (optical, acoustic, olfactory, haptic) and use Decision Trees relying on syntactic and semantic corpus-based features to classify the verb uses into 3-4 senses each. Our classifier reaches accuracies between 45.5% and 69.4%, compared to baselines between 27.5% and 39.0%. In three out of four cases analyzed, our classifier's accuracy is significantly higher than the corresponding baseline.
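The token-level setup can be sketched as follows. This is a minimal illustration with invented binary features and sense labels; the paper's actual syntactic/semantic corpus features are not reproduced here.

```python
# Minimal sketch of token-based verb-sense classification with a Decision
# Tree. Features and labels are hypothetical, purely for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each row encodes one verb token: [subject_is_animate, has_direct_object,
# object_is_abstract] -- assumed binary features, not the paper's.
X = [
    [1, 1, 0],  # e.g. "sehen" with a concrete object -> literal
    [1, 1, 1],  # abstract object -> non-literal sense ("understand")
    [0, 1, 1],
    [1, 0, 0],  # intransitive use -> literal
    [0, 0, 0],
]
y = ["literal", "understand", "understand", "literal", "literal"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
pred = clf.predict([[1, 1, 1]])[0]
print(pred)
```

A majority-class baseline on such data would always predict "literal", which is the kind of baseline the reported accuracies are compared against.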
German compound splitting using the compound productivity of morphemes
In this work, we present a novel compound splitting method for German that captures the compound productivity of morphemes. We use a giga-scale web corpus to build a lexicon and decompose noun compounds by computing the probabilities of compound elements occurring as bound and free morphemes. Furthermore, we provide a uniform evaluation of several unsupervised approaches and morphological analysers for this task. Our method achieves a high F1 score of 0.92, comparable to state-of-the-art methods.
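The core idea can be sketched as a probability-scored segmentation search. The lexicon counts and the scoring function below are toy assumptions, not the paper's model: non-final elements are scored by their frequency as bound morphemes, the final element by its frequency as a free morpheme.

```python
# Toy sketch of probability-based compound splitting.
from functools import lru_cache

# Hypothetical counts of each morpheme as a free word and as a bound
# element inside compounds (e.g. estimated from a large web corpus).
FREE = {"haus": 900, "tür": 700, "schlüssel": 400, "haustür": 50}
BOUND = {"haus": 300, "tür": 120, "schlüssel": 80, "haustür": 40}
TOTAL = sum(FREE.values()) + sum(BOUND.values())

def prob(morpheme, last):
    # The final element behaves like a free morpheme, earlier ones
    # like bound morphemes.
    counts = FREE if last else BOUND
    return counts.get(morpheme, 0) / TOTAL

def best_split(word):
    """Return the highest-probability segmentation of `word`."""
    @lru_cache(maxsize=None)
    def solve(s):
        best = (prob(s, last=True), (s,))  # option: leave s unsplit
        for i in range(1, len(s)):
            head, rest = s[:i], s[i:]
            p_head = prob(head, last=False)
            if p_head == 0:
                continue
            p_rest, parts = solve(rest)
            if p_head * p_rest > best[0]:
                best = (p_head * p_rest, (head,) + parts)
        return best
    return list(solve(word)[1])

print(best_split("haustürschlüssel"))  # -> ['haustür', 'schlüssel']
```

Because every additional split multiplies in another probability, the scoring naturally prefers fewer, more productive elements over many rare ones.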
CUNI System for the WMT17 Multimodal Translation Task
In this paper, we describe our submissions to the WMT17 Multimodal Translation Task. For Task 1 (multimodal translation), our best scoring system is a purely textual neural translation of the source image caption to the target language. The main feature of the system is the use of additional data that was acquired by selecting similar sentences from parallel corpora and by data synthesis with back-translation. For Task 2 (cross-lingual image captioning), our best submitted system generates an English caption which is then translated by the best system used in Task 1. We also present negative results, which are based on ideas that we believe have the potential to yield improvements, but did not prove useful in our particular setup.
Comment: 8 pages; camera-ready submission to WMT17
Alternative Solutions to a Language Design Problem: The Role of Adjectives and Gender Marking in Efficient Communication
A central goal of typological research is to characterize linguistic features in terms of both their functional role and their fit to social and cognitive systems. One long-standing puzzle concerns why certain languages employ grammatical gender. In an information-theoretic analysis of German noun classification, Dye, Milin, Futrell, and Ramscar (2017) enumerated a number of important processing advantages gender confers. Yet this raises a further puzzle: If gender systems are so beneficial to processing, what does this mean for languages that make do without them? Here, we compare the communicative function of gender marking in German (a deterministic system) to that of prenominal adjectives in English (a probabilistic one), finding that despite their differences, both systems act to efficiently smooth information over discourse, making nouns more equally predictable in context. We examine why evolutionary pressures may favor one system over another and discuss the implications for compositional accounts of meaning and Gricean principles of communication.
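The "smoothing" idea can be made concrete with surprisal, the information content of a word in context. The probabilities below are invented for illustration only: a gendered article or an informative adjective raises the conditional probability of the upcoming noun, lowering its surprisal and spreading information more evenly across the utterance.

```python
# Toy surprisal calculation; probabilities are hypothetical.
import math

def surprisal(p):
    """Information content in bits: -log2(p)."""
    return -math.log2(p)

p_noun = 0.01            # P(noun) with no preceding cue
p_noun_given_cue = 0.08  # P(noun | gendered article or adjective)

before = surprisal(p_noun)
after = surprisal(p_noun_given_cue)
print(f"{before:.2f} bits -> {after:.2f} bits")  # 6.64 bits -> 3.64 bits
```

The cue carries part of the noun's information itself, so the peak in per-word information at the noun is flattened, which is the efficiency argument both systems share.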
Neural reranking for dependency parsing: An evaluation
Recent work has shown that neural rerankers can improve results for dependency parsing over the top k trees produced by a base parser. However, all neural rerankers so far have been evaluated only on English and Chinese, both languages with a configurational word order and poor morphology. In this paper, we re-assess the potential of successful neural reranking models from the literature on English and on two morphologically rich(er) languages, German and Czech. In addition, we introduce a new variation of a discriminative reranker based on graph convolutional networks (GCNs). We show that the GCN not only outperforms previous models on English but is the only model able to improve results over the baselines on German and Czech. We explain the differences in reranking performance based on an analysis of (a) the gold tree ratio and (b) the variety in the k-best lists.
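The reranking setup and the gold tree ratio diagnostic can be sketched as follows. Candidates are reduced to score dictionaries with invented values; real rerankers score full trees, which is omitted here.

```python
# Toy sketch of k-best reranking and the gold tree ratio.
def rerank(kbest):
    """Pick the candidate tree the reranker scores highest."""
    return max(kbest, key=lambda c: c["reranker_score"])

def gold_tree_ratio(kbest_lists):
    """Fraction of k-best lists containing the gold tree at all --
    an upper bound on what any reranker can recover."""
    hits = sum(any(c["is_gold"] for c in kbest) for kbest in kbest_lists)
    return hits / len(kbest_lists)

# Two hypothetical sentences with k=2 candidate trees each.
lists = [
    [{"id": 0, "reranker_score": 0.2, "is_gold": False},
     {"id": 1, "reranker_score": 0.9, "is_gold": True}],
    [{"id": 0, "reranker_score": 0.7, "is_gold": False},
     {"id": 1, "reranker_score": 0.1, "is_gold": False}],  # gold missing
]

print(rerank(lists[0])["id"])   # reranker recovers the gold tree here
print(gold_tree_ratio(lists))   # 0.5: only one list contains the gold tree
```

A low gold tree ratio, as the abstract suggests for some settings, caps reranking gains no matter how good the reranker is, which is why it is a useful explanatory statistic.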