Designing CzeDLex - A Lexicon of Czech Discourse Connectives

Authors: Mirovsky Jiri, Polakova Lucie, Rysova Magdalena, Synkova Pavlina
Publication venue: Hankookmunhwasa
Publication date: 01/01/2016

Using a discourse bank and a lexicon for the automatic identification of discourse connectives

Authors: A Branco, D Marcu, D Zeyrek, E Bick, M Halliday, M Stede, MJ Cuenca, Z Lin
Publication date: 01/01/2018

We describe two new resources that have been prepared for European Portuguese and how they are used for discourse parsing: the Portuguese subpart of the TED-MDB corpus, a multilingual corpus of TED Talks that has been annotated in the PDTB style, and the Lexicon of Discourse Markers for Portuguese (LDM-PT). Both lexicon and corpus are used in a preliminary experiment for discourse connective identification in texts. This includes, in many cases, the difficult task of disambiguating between connective and non-connective uses. We annotated the PT-TED-MDB corpus with POS, lemma and syntactic constituency and focus on the 10 most frequent connectives in the corpus. The best approach considers word-form+POS+syntactic annotation and leads to 85% precision.

Universidade de Lisboa: Repositório.UL