1 research outputs found

    Compound Terms and Their Multi-word Variants: Case of German and Russian Languages

    No full text
    International audienceThe terminology of any language and any domain continuously evolves and leads to a constant term renewal. Terms undergo a wide range of morphological and syntactic variations which have to be handled by any NLP applications. If the syntactic variations of multi-word terms have been described and tools designed to process them, only a few works studied the syntagmatic variants of compound terms. This paper is dedicated to the identification of such variants, and more precisely to the detection of synonymic pairs that consist of “compound term - multi-word term”. We describe a pipeline for their detection, from compound recognition and splitting to alignment of the variants with original terms, through multi-word term extraction. The experiments are carried out for two compound-producing languages, German and Russian, and two specialised domains: wind energy and breast cancer. We identify variation patterns for these two languages and demonstrate that the transformation of a morphological compound into a syntagmatic compound mainly occurs when the term denomination needs to be enlarged
    corecore