5 research outputs found

    Linking Verb Pattern Dictionaries of English and Spanish

    The paper presents the first step in the creation of a new multilingual and corpus-driven lexical resource by means of linking existing monolingual pattern dictionaries of English and Spanish verbs. The two dictionaries were compiled through Corpus Pattern Analysis (CPA) – an empirical procedure in corpus linguistics that associates word meaning with word use by means of analysis of phraseological patterns and collocations found in corpus data. This paper provides a first look into a number of practical issues arising from the task of linking corresponding patterns across languages via both manual and automatic procedures. To facilitate manual pattern linking, we implemented a heuristic-based algorithm to generate automatic suggestions for candidate verb pattern pairs, which obtained 80% precision. Our goal is to kick-start the development of a new resource for verbs that can be used by language learners, translators, editors and the research community alike.

    Cross-lingual Dependency Parsing of Related Languages with Rich Morphosyntactic Tagsets

    This paper addresses cross-lingual dependency parsing using rich morphosyntactic tagsets. In our case study, we experiment with three related Slavic languages: Croatian, Serbian and Slovene. Four different dependency treebanks are used for monolingual parsing, direct cross-lingual parsing, and a recently introduced cross-lingual parsing approach that utilizes statistical machine translation and annotation projection. We argue for the benefits of using rich morphosyntactic tagsets in cross-lingual parsing and empirically support the claim by showing large improvements over an impoverished common feature representation in the form of a reduced part-of-speech tagset. In the process, we improve over the previous state-of-the-art scores in dependency parsing for all three languages.

    The Role of Corpus Pattern Analysis in Machine Translation Evaluation

    This paper takes a preliminary look at the relation between verb pattern matches in the Pattern Dictionary of English Verbs (PDEV) and translation quality through a qualitative analysis of human-ranked sentences from five different machine translation systems. The purpose of the analysis is not only to determine whether verbs in the automatic translations and their immediate contexts match any pre-existing semanto-syntactic pattern in PDEV, but also to establish links between hypothesis sentences and the verbs in the reference translation. It attempts to answer the question of whether the semantic and syntactic information captured by Corpus Pattern Analysis (CPA) can indicate whether a sentence is a “good” translation. Two human annotators manually identified the occurrence of patterns in 50 translations and indicated whether these patterns match any identified pattern in the corresponding reference translation. Results indicate that CPA can be used to distinguish between well-formed and ill-formed sentences.