    Aligning Tagged Bitexts

    This paper describes how complementary techniques can be employed to align multiword expressions in a parallel corpus, or bitext. The bitext used for experimentation has two main features: (i) it contains bilingual documents from a dedicated domain of legal and administrative publications rich in specialized jargon; (ii) it involves two languages, Spanish and Basque, which are typologically very distinct (both lexically and morpho-syntactically). The former feature provides a good basis for testing techniques of collocation detection. The latter presents quite a challenge to a number of reported algorithms, in particular to the alignment of sentence-internal segments.