226 research outputs found

    Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki's Submission to VarDial 2017

    This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores for cross-lingual models even surpass those of the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set, and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish, whereas Danish contributes surprisingly little.

    A probabilistic model for guessing base forms of new words by analogy

    Volume: 4919. Host publication title: Computational Linguistics and Intelligent Text Processing, 9th International Conference, CICLing 2008, Haifa, Israel, February 17-23, 2008, Proceedings. Peer reviewed.

    Corpus-based lexeme ranking for morphological guessers

    Peer reviewed.

    Corpus-based paradigm Selection for morphological entries

    Volume: 4. Host publication title: Nealt Proceedings Series Vol. 4. Host publication sub-title: Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Peer reviewed.

    Morphosyntactic Linguistic Wavelets for Knowledge Management

    Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings

    The automatic discovery and clustering of morphologically related words is an important problem with several practical applications. This paper describes the evaluation of word clusters carried out through crowd-sourcing techniques for the Maltese language. The hybrid (Semitic-Romance) nature of Maltese morphology, together with the fact that no large-scale lexical resources are available for Maltese, makes this an interesting and challenging problem. Peer reviewed.