Multilingual Dependency Parsing of Uralic Languages : Parsing with zero-shot transfer and cross-lingual models using geographically proximate, genealogically related, and syntactically similar transfer languages

Author: Erenmalm Elsa
Publication venue: Uppsala universitet, Institutionen för lingvistik och filologi
Publication date: 01/01/2020

One way to improve dependency parsing scores for low-resource languages is to make use of existing resources from other closely related or otherwise similar languages. In this paper, we look at eleven Uralic target languages (Estonian, Finnish, Hungarian, Karelian, Livvi, Komi Zyrian, Komi Permyak, Moksha, Erzya, North Sámi, and Skolt Sámi) with treebanks of varying sizes and select transfer languages based on geographical, genealogical, and syntactic distances. We focus primarily on the performance of parser models trained on various combinations of geographically proximate and genealogically related transfer languages, in target-trained, zero-shot, and cross-lingual configurations. We find that models trained on combinations of geographically proximate and genealogically related transfer languages reach the highest LAS in most zero-shot models, while our highest-performing cross-lingual models were trained on genealogically related languages. We also find that cross-lingual models outperform zero-shot transfer models. We then select syntactically similar transfer languages for three target languages, and find a slight improvement in the case of Hungarian. We discuss the results and conclude with suggestions for possible future work.
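
The abstract reports results as LAS (labeled attachment score). As a minimal sketch, assuming the standard definition (the percentage of tokens whose predicted head and dependency label both match the gold annotation), the metric could be computed as below. The function name `las` and the sample data are illustrative assumptions and are not taken from the thesis.

```python
from typing import List, Tuple

# One (head index, dependency label) pair per token.
Arc = Tuple[int, str]


def las(gold: List[Arc], pred: List[Arc]) -> float:
    """Labeled attachment score in percent (hypothetical helper).

    A token counts as correct only if both its head and its label
    match the gold annotation.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * correct / len(gold)


# Made-up four-token sentence: the third token has the right head
# but the wrong label, so it does not count towards LAS.
GOLD = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
PRED = [(2, "nsubj"), (0, "root"), (2, "obl"), (2, "punct")]

print(las(GOLD, PRED))
```

Comparing only the head indices instead of the full pairs would give the unlabeled attachment score (UAS), which is why LAS is never higher than UAS on the same output.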

Publikationer från Uppsala Universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line

PARSEME corpora annotated for verbal multiword expressions (version 1.3)

Authors: Aceta Cristina
Adalı Kübra
Aduriz Itziar
Antić Anđela
Antoine Jean-Yves
Arhar Holdt Špela
Attard Greta
Azzopardi Kirsty
Barbu Mititelu Verginica
Bejček Eduard
Ben Khelil Chérifa
Berk Gözde
Bhatia Archna
Bielinskienė Agnė
Blagus Goranka
Boizou Loïc
Bonial Claire
Bonnici Janice
Boz Mert
Buljan Maja
Busuttil Jael
Butler Alexandra
Bărbulescu Elena-Andreea
Candito Marie
Cap Fabienne
Carlino Carola
Caruso Valeria
Chen Jia
Cherchi Manuela
Constant Matthieu
Cook Paul
Cordeiro Silvio Ricardo
Cristescu Mihaela
de Medeiros Caseli Helena
De Santis Anna
Di Buono Maria Pia
Diab Mona
Dimitrova Tsvetana
Dinç Tutkum
Ehren Rafael
El Maarouf Ismail
Elbadrashiny Mohamed
Elyovich Hevi
Erden Berna
Erenmalm Elsa
Eryiğit Gülşen
Estarrona Ainara
Fabri Ray
Farrugia Alison
Findlay Jamie
Finnveden Gustav
Foster Jennifer
Fotopoulou Aggeliki
Foufi Vassiliki
Galea Luke
Galea Sara Anne
Gantar Polona
Gatt Albert
Gatt Anabelle
Ge Xiaomin
Giouli Voula
Gonzalez Itziar
Griciūtė Bernadeta
Guillaume Bruno
Gurrutxaga Antton
Güngör Tunga
Ha-Cohen Kerner Yaakov
Hadj Mohamed Najet
Hawwari Abdelati
Herrero Carlos
Hu Fangyuan
Hu Sha
Ibrahim Rehab
Iñurrieta Uxoa
Jagfeld Glorianna
Jain Kanishka
Jaknić Isidora
Jazbec Ivo-Pavao
Jiang Menghan
Kavčič Teja
Kovalevskaitė Jolanta
Kovács Viktória
Krek Simon
Krstev Cvetana
Kuzman Taja
Leseva Svetlozara
Li Minli
Lichte Timm
Liebeskind Chaya
Lindqvist Ellinor
Liu Siyuan
Ljubešić Nikola
Louizou Sevasti
Lynn Teresa
Maldonado Alfredo
Malka Ruth
Markantonatou Stella
Martínez Alonso Héctor
Matas Ivana
McCrae John
Miral Ayşenur
Miranda Isaac
Monti Johanna
Muscat Amanda
Nivre Joakim
Onofrei Mihaela
Palka-Binkiewicz Emilia
Papadelli Stella
Parmentier Yannick
Parra Escartín Carla
Pascucci Antonio
Pasquer Caroline
Petterson Eva
Pickard Thomas
Priego Sanchez Belem
Puri Vandana
QasemiZadeh Behrang
Qin Zhenzhen
Rademaker Alexandre
Raffone Annalisa
Ramisch Carlos
Ramisch Renata
Ratori Shraddha
Riccio Anna
Rimkute Erika
Rizea Monica-Mihaela
Sangati Federico
Sarlak Mahtab
Savary Agata
Schneider Nathan
Shamsfard Mehrnoush
Shukla Vishakha
Simkó Katalin
Somers Clarissa
Spagnol Michael
Speranza Giulia
Srivastava Shubham
Stanković Ranka
Stefanova Valentina
Stoyanova Ivelina
Stymne Sara
Sun Ruilong
Tabone Nicole
Tajalli Vahide
Tanti Marc
Taslimipoor Shiva
Theoxari Natasa
Todorova Maria
Urešová Zdeňka
Uria Larraitz
Urizar Ruben
Vaidya Ashwini
Vale Oto
van der Plas Lonneke
Villavicencio Aline
Vincze Veronika
Walles Rinat
Walsh Abigail
Wang Chenweng
Waszczuk Jakub
Wick Pedro Gabriela
Wilkens Rodrigo
Xiao Huangyang
Xu Hongzhi
Yan Peiyi
Yarandi Yalda
Yih Tsy
Yirmibeşoğlu Zeynep
Yu Ke
Yu Songping
Zeng Si
Zgreabăn Bianca-Mădălina
Zhang Yongchen
Zhao Yun
Zilio Leonardo
Öztürk Yağmur
Šnajder Jan
Publication venue: PARSEME
Publication date: 10/05/2023

This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated. VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). This is the first release of the corpora without an associated shared task. The previous version (1.2) was associated with the PARSEME Shared Task on semi-supervised Identification of Verbal MWEs (2020). The data covers 26 languages, corresponding to the combination of the corpora from all three previous editions (1.0, 1.1 and 1.2). VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format. Morphological and syntactic information, including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, this information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). All corpora are split into training, development and test data, following the splitting strategy adopted for the PARSEME Shared Task 1.2. The annotation guidelines are available online: https://parsemefr.lis-lab.fr/parseme-st-guidelines/1.3. The .cupt format is detailed here: https://multiword.sourceforge.net/cupt-format
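
As a rough illustration of how such a file might be read, the sketch below parses one sentence of cupt-style data. It rests on assumptions that are not stated in the abstract: that each token line is tab-separated, that the VMWE annotation sits in the last column, that `*` marks a token outside any VMWE and `_` an unannotated one, and that codes look like `1:VPC.full` on the first token of an expression and `1` on its later tokens, separated by `;` when a token belongs to several expressions. The function name and the sample sentence are hypothetical; the format description linked above is the authority.

```python
from typing import Dict, List, Tuple


def parse_cupt_sentence(lines: List[str]) -> Tuple[List[Tuple[str, str]], Dict[str, dict]]:
    """Collect tokens and VMWE groups from one sentence (hypothetical helper).

    Returns the (id, form) pairs and a dict mapping each VMWE id to its
    category and the ids of the tokens that belong to it.
    """
    tokens: List[Tuple[str, str]] = []
    mwes: Dict[str, dict] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/metadata lines
        cols = line.split("\t")
        tok_id, form, mwe_col = cols[0], cols[1], cols[-1]
        tokens.append((tok_id, form))
        if mwe_col in ("*", "_"):
            continue  # assumed: no VMWE here, or annotation not available
        for code in mwe_col.split(";"):
            mwe_id, _, category = code.partition(":")
            entry = mwes.setdefault(mwe_id, {"category": None, "token_ids": []})
            if category:  # assumed: only the first token carries the category
                entry["category"] = category
            entry["token_ids"].append(tok_id)
    return tokens, mwes


# Made-up sentence with ten CoNLL-U-style columns plus one VMWE column.
SAMPLE = [
    "# text = He gave up .",
    "\t".join(["1", "He", "he", "PRON", "_", "_", "2", "nsubj", "_", "_", "*"]),
    "\t".join(["2", "gave", "give", "VERB", "_", "_", "0", "root", "_", "_", "1:VPC.full"]),
    "\t".join(["3", "up", "up", "ADP", "_", "_", "2", "compound:prt", "_", "_", "1"]),
    "\t".join(["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_", "*"]),
]

tokens, mwes = parse_cupt_sentence(SAMPLE)
print(mwes)
```

Keying the groups by VMWE id rather than by token keeps discontinuous expressions together, since their tokens need not be adjacent in the sentence.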

LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University