43 research outputs found

Corpus-based typology: Applications, challenges and some solutions

Author: Levshina N.
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 25/05/2022

Over the last few years, the number of corpora that can be used for language comparison has dramatically increased. The corpora are so diverse in their structure, size and annotation style that a novice might not know where to start. The present paper charts this new and changing territory, providing a few landmarks, warning signs and safe paths. Although no corpus at present can replace the traditional type of typological data based on language description in reference grammars, corpora can help with diverse tasks, being particularly well suited for investigating probabilistic and gradient properties of languages and for discovering and interpreting cross-linguistic generalizations based on processing and communicative mechanisms. At the same time, the use of corpora for typological purposes brings not only advantages and opportunities, but also numerous challenges. This paper also contains an empirical case study addressing two pertinent problems: the role of text types in language comparison and the problem of the word as a comparative concept.

Partitive Determiners, Partitive Pronouns and Partitive Case

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 11/01/2022

The fine-grained morpho-syntactic and semantic variation displayed by partitive elements across European languages is far from being well described, let alone well understood. This volume focuses on Partitive Determiners, Partitive Pronouns and Partitive Case in European languages, their emergence and spread in diachrony, their acquisition by L2 speakers, and their syntax and interpretation in a cross-theoretical typological perspective.

Partitive Determiners, Partitive Pronouns and Partitive Case

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2021

Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

Author: Edman Lukas
Noord van, Gertjan
Toral Ruiz Antonio
Publication date: 01/01/2020

Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embedding.

Proceedings - University of Groningen

Dissertations of the University of Groningen

Forgotten Laxdæla poetry : a study and an edition of Tyrfingur Finnsson's Vísur uppá Laxdæla sögu

Author: Sverdlov Ilya
Vanherpen Sofie
Publication venue: Helsingin Yliopisto Folkloristiika
Publication date: 01/01/2017

The paper discusses the metre and the diction of a previously unpublished short poem about characters of Laxdæla saga, composed in the 18th century. The stanzas are ostensibly in skaldic dróttkvætt; the analysis shows the poem to be an imitation of the classical metre, yet a remarkably successful one, implying an extraordinarily good grasp of dróttkvætt poetics on the part of a poet composing several hundred years after the end of the classical dróttkvætt period.

Archivsystem Ask23

CLARIN

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 30/01/2023

The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure (CLARIN) for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium.