research

German compound splitting using the compound productivity of morphemes

Abstract

In this work, we present a novel compound splitting method for German by capturing the compound productivity of morphemes. We use a giga web corpus to create a lexicon and decompose noun compounds by computing the probabilities of compound elements as bound and free morphemes. Furthermore, we provide a uniformed evaluation of several unsupervised approaches and morphological analysers for the task. Our method achieved a high F1 score of 0.92, which was a comparable result to state-of-the-art methods

    Similar works